MATLAB Answers

I can't read my .txt file using importdata. Problem with format?

Ajpaezm on 11 Aug 2016
Commented: Ajpaezm on 12 Aug 2016
Hello,
I made a script that reads a .txt file using textscan to manipulate some of the values; the outcome is another .txt file in the same format. Somehow, MATLAB cannot read the new file, even though I produced it through the script.
These are my two files.
I don't know if this is relevant, but the parts that read the inputs and store the changed variables in my code are in these two pieces.
%for the input
A=importdata('subject 3.txt');
Only_Data=cell(size(A(27:end)));
[a,b]=size(A(27:end));
C=A(27:end);
y={A(1:26)};
For the outcome:
new_subject3=[new_subject3];
outfile='backMatch_VS_1BM_1412015_NS-003-1-updated';
filename=[outfile,'.txt'];
fid=fopen(filename,'w');
yNew=y{1};
fprintf(fid,'%s\n',yNew{:}); %this line prints the headers exactly as they were.
%we need this loop for an exact format
[nrows,ncols] = size(new_subject3);
for row = 2:nrows
if row<11
fprintf(fid ,'%s %7s %13s %13s %18s %13s %8s %12s %12s %6s %4s %7s\n', new_subject3{row}{1},new_subject3{row}{2},new_subject3{row}{3},new_subject3{row}{4},new_subject3{row}{5},new_subject3{row}{6},new_subject3{row}{7},new_subject3{row}{8},new_subject3{row}{9},new_subject3{row}{10},new_subject3{row}{11}, new_subject3{row}{12});
elseif (row<=100)&&(row>=11)
fprintf(fid ,'%s %6s %13s %13s %18s %13s %8s %12s %12s %6s %4s %7s\n', new_subject3{row}{1},new_subject3{row}{2},new_subject3{row}{3},new_subject3{row}{4},new_subject3{row}{5},new_subject3{row}{6},new_subject3{row}{7},new_subject3{row}{8},new_subject3{row}{9},new_subject3{row}{10},new_subject3{row}{11}, new_subject3{row}{12});
elseif (row<=nrows)
fprintf(fid ,'%s %5s %13s %13s %18s %13s %8s %12s %12s %6s %4s %7s\n', new_subject3{row}{1},new_subject3{row}{2},new_subject3{row}{3},new_subject3{row}{4},new_subject3{row}{5},new_subject3{row}{6},new_subject3{row}{7},new_subject3{row}{8},new_subject3{row}{9},new_subject3{row}{10},new_subject3{row}{11}, new_subject3{row}{12});
end
end
fclose(fid);
When I try to read my outcome, I can't do it.
A_NEW=importdata('backMatch_VS_1BM_1412015_NS-003-1-updated.txt');
Error using importdata (line 225)
Unable to load file.
Use TEXTSCAN or FREAD for more complex formats.
Caused by:
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
According to Notepad++, my original file is shorter (59807) than my new one (67157).
Any clues on this? Is there a way to change my file to a format that MATLAB accepts? How can I work around this?
Thanks in advance.

  3 Comments

per isakson on 11 Aug 2016
Why different encoding?
It's easier to help if you upload a text-file so that we can try it.
"importdata (line 225)" doesn't make sense with R2016a. Which release do you use?
dpb on 11 Aug 2016
Well, as Per says, we can't test what we don't have but the error message is a klew...
Error using importdata (line 225)
Unable to load file.
...
Caused by:
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
importdata has to infer the columns in the file from the data it finds; the above says it read some, then some more, and discovered that vertcat didn't work. Since it's vertcat, that implies the number of columns found on the second try didn't match the number it first determined to be in the file.
I've never actually used importdata so I don't really know it, but I'd be checking to see if there are actually the same number of columns written every row in the data portion, the number of delimiters matches in any blank rows that aren't totally blank, etc., etc., etc., ...
The file somehow isn't as regular as you think and we can only see what it looks like in Notepad which is notoriously bad for trying to see what's actually in a file...
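The column check suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the file name demo_fields.txt and its contents are made up so the snippet is self-contained, and the fields are assumed to be tab-separated.

```matlab
% Sketch: count the tab-separated fields on every line of a text file.
% 'demo_fields.txt' is a hypothetical stand-in, written here for the demo.
fname = 'demo_fields.txt';
fid = fopen(fname, 'w');
fprintf(fid, '1\t0\t-88\tSESS\n');   % 4 fields
fprintf(fid, '2\t0\t-88\tSESS\n');   % 4 fields
fprintf(fid, '3\t0\t-88 SESS\n');    % 3 fields: a space replaced one tab
fclose(fid);

tab = sprintf('\t');                 % literal tab character
fid = fopen(fname, 'r');
nFields = [];
tline = fgetl(fid);
while ischar(tline)                  % fgetl returns -1 at end of file
    parts = strsplit(tline, tab, 'CollapseDelimiters', false);
    nFields(end+1) = numel(parts);   %#ok<AGROW>
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

% Lines whose field count differs from the most common count
badLines = find(nFields ~= mode(nFields));
```

Any line number reported in badLines is a candidate for the row that breaks the concatenation.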
Ajpaezm on 12 Aug 2016
Hi per and dpb, thanks for answering.
This is the file from which I got my data (subject 3) and my outcome (the one with the long name). If I can read my input file correctly with importdata, why can't I read the outcome the same way?
Should I try another way of reading this file?
The version of Matlab I'm using is 2015a. And yes, the file is not regular at all.
The problem is with the first 7 lines of the file. I tried printing these combinations:
1) Just the data in the columns (importdata on my outcome file worked nicely).
2) The data with the headers (importdata on my outcome file worked nicely).
3) The first six lines from my input file and the rest of the table modified (importdata on my outcome file worked nicely).
4) When I added a 7th line, importdata stopped working on my .txt outcome file.
I can still print the whole thing down, and give it some format. But I should be able to read my outcome.
The reason for this is that I may need to edit my outcome file in the future. And if I can't use importdata, I'd have to look for another way of reading it.
I believe I should find another way of reading the input file and then change the data I need to change. Can you give me some suggestions/examples on how you would do it?
Thanks a lot for your time!
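One possible way to read such a file without importdata is to read it line by line, keep the header lines as plain text, and split only the data lines. The following is a minimal sketch, not a definitive solution: demo_input.txt, its contents, and the header count of 2 are assumptions for illustration (the real file appears to have 26 header lines), and the data are assumed to be tab-delimited.

```matlab
% Sketch: keep header lines verbatim, split data lines on tabs.
% 'demo_input.txt' and nHeader = 2 are hypothetical demo values.
fname   = 'demo_input.txt';
nHeader = 2;
fid = fopen(fname, 'w');
fprintf(fid, 'Header line one\nHeader line two\n');
fprintf(fid, '1\t0\t-88\tSESS\n2\t0\t-88\tSESS\n');
fclose(fid);

% Read every line of the file into a cell array of strings
fid = fopen(fname, 'r');
allLines = {};
tline = fgetl(fid);
while ischar(tline)
    allLines{end+1} = tline;         %#ok<AGROW>
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

header = allLines(1:nHeader);        % untouched header text
tab = sprintf('\t');
rows = cellfun(@(s) strsplit(s, tab, 'CollapseDelimiters', false), ...
               allLines(nHeader+1:end), 'UniformOutput', false);

rows{2}{3} = '-99';                  % edit one field of the second data row
```

Each element of rows is a cell array of field strings, so individual values can be changed before the file is written out again.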


Accepted Answer

dpb on 12 Aug 2016
OK, with the files you can do a file compare and see the difference--
C:\ML_R2012b\work\updated.txt - c:\ML_R2012b\work\subject 3.txt
...
inary) 69 6E 61 72 79 29 20 20 - inary) 69 6E 61 72 79 29 20 20
(binary 20 28 62 69 6E 61 72 79 - (binary 20 28 62 69 6E 61 72 79
)...1 29 0A 0A 0A 31 20 20 20 - )...1 29 0A 0A 0A 31 20 20 20
0 20 20 20 20 30 20 20 20 - . 0 20 20 09 30 20 20 20
20 20 20 20 20 20 20 - . 20 20 09 20 20 20 20
-88 20 2D 38 38 20 20 20 - -88. 2D 38 38 09 20 20 20
20 20 20 20 20 20 20 20 - 20 20 20 20 20 20 20 20
4 20 20 34 - SESS. 20 53 45 53 53 09 20
20 20 20 20 20 20 20 20 - 20 20 20 20 20
0.0 20 20 20 20 20 30 2E 30 - 0.0 30 2E 30
240 32 34 30 20 20 20 20 - 240. 32 34 30 09 20 20 20 20
0. 20 20 20 20 20 20 30 2E - 0. 20 20 30 2E
00 30 30 20 20 20 20 20 - 00. 30 30 09 20 20 20 20
0.0 20 30 2E 30 20 20 20 - 0.0. 30 2E 30 09 20 20 20
000000 20 20 30 30 30 30 30 30 - 000000 20 30 30 30 30 30 30
00 0 30 30 20 20 20 20 20 30 - 00 0 30 30 20 20 20 20 20 30
0000000 30 30 30 30 30 30 30 - 0000000. 30 30 30 30 30 30 30 09
1 20 20 20 20 20 20 31 20 - 1. 20 20 20 31 09
[ 20 20 20 5B 20 20 20 20 - [ 5B 20 20 20 20
].2 20 20 20 5D 0A 32 20 20 - ].2 20 20 20 5D 0A 32 20 20
0 20 20 20 20 20 30 20 20 - . 0 20 20 20 09 30 20 20
The subject 3 file is tab-delimited; your new file isn't. Note that after each field, beginning with the first data line (the number one, "31" hex in the byte comparison), there's a blank (20) in your new file but a tab (09) in the original. If you keep comparing field by field you'll see that pattern holds: you wrote no tabs in your output file, so it's space-delimited. Since C (and hence MATLAB, which uses the C i/o library) formatted input treats blank space as if it essentially isn't there, somewhere that leads to a field being "smushed" together with its neighbor and a column going missing.
Moral: Write your new file tab-delimited so there is a recognizable delimiter between fields and then all should be well...
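Why an explicit delimiter helps can be illustrated with a small in-memory sketch. The field values below are made up, and the last one deliberately contains a blank (an assumption for the demo, not something taken from the actual files) to show how whitespace splitting can miscount columns while tab splitting does not.

```matlab
% Sketch: the same fields written space-padded vs. tab-delimited.
% The field values are hypothetical; the last one contains a blank.
fields = {'1', '0', '-88', 'SESS A'};

spaceLine = sprintf('%s %5s %5s %8s', fields{:});      % blanks between fields
tabLine   = sprintf('%s\t%5s\t%5s\t%8s', fields{:});   % tabs between fields

% Splitting on whitespace cannot tell padding from field content
spaceParts = strsplit(strtrim(spaceLine));
nSpace = numel(spaceParts);

% Splitting on the tab recovers exactly the fields that were written
tabParts = strsplit(tabLine, sprintf('\t'), 'CollapseDelimiters', false);
nTab = numel(tabParts);
recovered = strtrim(tabParts);       % drop the width padding
```

Here nSpace comes out larger than the number of fields written, while nTab matches it and recovered equals the original fields.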

  4 Comments

Ajpaezm on 12 Aug 2016
Fantastic suggestion dpb, it worked nicely.
I added '\t' to the fields of my printing loop. It actually saved me some lines of code. If you compare it, you'll see it.
fid=fopen(filename,'wt');
fprintf(fid,'%s\n',A{1:26});
[nrows,ncols] = size(new_subject3);
for row = 2:nrows
fprintf(fid ,'%s\t%5s\t%11s\t%17s\t%12s\t%10s\t%7s\t%12s\t%9s\t%4s\t%s\t%5s\n', new_subject3{row}{1},new_subject3{row}{2},new_subject3{row}{3},new_subject3{row}{4},new_subject3{row}{5},new_subject3{row}{6},new_subject3{row}{7},new_subject3{row}{8},new_subject3{row}{9},new_subject3{row}{10},new_subject3{row}{11}, new_subject3{row}{12});
end
fclose(fid);
One more question. What did you use to get the byte-comparison of the files? That could help me in the future.
I'll try Stephen's approach too. Thanks!
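The long argument list in the loop above could probably be shortened further with cell expansion, since rows{r}{:} passes every field of a row to fprintf in order. A minimal sketch, assuming each row is a cell array of strings with no empty fields; the variable rows, the three-column format, and demo_rows.txt are hypothetical stand-ins for new_subject3 and the real output file.

```matlab
% Sketch: write rows tab-delimited using {:} expansion.
% 'rows' is a hypothetical stand-in for new_subject3 (3 fields per row).
rows  = {{'1','0','-88'}, {'2','0','-88'}};
fname = 'demo_rows.txt';
fmt   = '%s\t%5s\t%5s\n';            % one conversion per field, tabs between

fid = fopen(fname, 'w');             % 'w' writes \n as a plain line feed
for r = 1:numel(rows)
    fprintf(fid, fmt, rows{r}{:});   % expands to rows{r}{1}, rows{r}{2}, ...
end
fclose(fid);

txt = fileread(fname);               % read back to inspect what was written
delete(fname);
```

Note that fprintf skips empty arguments, so a row containing an empty string would shift the remaining fields; that case would need separate handling.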
dpb on 12 Aug 2016
That's actually the Matlab file comparison tool in the Editor...it's not the best but is adequate and is convenient if using Matlab and don't have another favorite already...
Yes, I wondered why you had the multiple cases for a fixed-width file but hadn't dug into the actual coding enough to try to "finger it out". For Stephen's suggestion, all you have to do is substitute the chosen delimiter character into the format string in lieu of the tab \t pair. Whether that causes "issues" later depends on whether there's any other use for reading the file by something else; I would presume not but just making note...
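If the comparison tool isn't handy, the raw bytes of a file can also be inspected directly with fread and dec2hex. This is a rough sketch with a made-up file (demo_bytes.txt) so that it is self-contained; a tab shows up as 09 and a blank as 20, as in the listing above.

```matlab
% Sketch: dump the bytes of a small file in hex to spot tabs vs. blanks.
% 'demo_bytes.txt' is a hypothetical file written here for the demo.
fname = 'demo_bytes.txt';
fid = fopen(fname, 'w');
fprintf(fid, '1\t0 2\n');            % one tab, one blank, one line feed
fclose(fid);

fid = fopen(fname, 'r');
bytes = fread(fid, Inf, 'uint8');    % column vector of byte values (double)
fclose(fid);
delete(fname);

hexDump = dec2hex(bytes);            % one row of hex digits per byte
nTabs   = sum(bytes == 9);           % 09 hex = tab
nSpaces = sum(bytes == 32);          % 20 hex = blank
```

Counting the tabs per line in this way gives a quick check on whether a file really is tab-delimited.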
Ajpaezm on 12 Aug 2016
I have a lot to learn still, this was the first time I had to print a file with this "unusual" structure. Now I can work my way around it.
