How to put (tab delimited) text files together removing header text?

Hi, I have many text files in the following format:
Name of the file
Date
Other useless data
Column1 [unit] Column2 [unit] Column3 [unit] Column4 [unit] ...
0.025 6.8 9.4 9.5 ...
0.050 2.8 4.4 4.2 ...
0.075 3.3 7.4 6.1 ...
...
I would like to copy all the data from all the files into a single file. I am familiar with the command:
!copy a.txt+b.txt ab.txt
However, I would like to remove all header lines and have only the numerical data in the new file (and then put a new header line in the first row so that 'tdfread' can read it easily). I would like my output file to look like this
MyHeader1 MyHeader2 MyHeader3 MyHeader4 ...
0.025 6.8 9.4 9.5 ...
0.050 2.8 4.4 4.2 ...
0.075 3.3 7.4 6.1 ...
...
Another challenge is that there are several thousand files, so I need an automated procedure that reads the files one after another, or alternatively a way to select all the files in a folder for concatenation. Unfortunately, they are not conveniently named, so I cannot construct the file names in a for loop, for example. Any help is very much appreciated.
  1 Comment
dpb
dpb on 11 Nov 2013
Is the number of header lines the same in each file?
As for obtaining all the files in a subdirectory, use
d=dir('*.txt');
and then iterate over d(k).name for k = 1:length(d).
This should be basically trivial if the header lines are consistent; it is a little bit of a pain otherwise.
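
A minimal sketch of that listing step, assuming the files sit in the current folder; `dataFolder` and `thisFile` are illustrative names, not something from the original post:

```matlab
% Sketch: list every .txt file in a folder and visit each one in turn.
dataFolder = pwd;                            % assumption: the files are in the current folder
d = dir(fullfile(dataFolder, '*.txt'));      % struct array, one element per matching file
for k = 1:numel(d)
    thisFile = fullfile(dataFolder, d(k).name);   % full path, valid from any working folder
    fprintf('%d: %s\n', k, thisFile);             % replace with the real per-file processing
end
```

Because dir returns whatever names it finds, the files do not need to follow any naming pattern.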


Accepted Answer

dpb
dpb on 16 Nov 2013
Edited: dpb on 16 Nov 2013
So the answer is the same as originally given, then. Use something like:
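% Output format: nCols fixed-width fields per row (see the erratum below for the line terminator).
fmto=['%12.3f' repmat('%12.3f',1,nCols-1)];
fido=fopen(youroutputfilename,'w');
fprintf(fido,'%s\n', yourheadertext);
for j=1:length(fileList)
  fid = fopen(fileList(j).name,'r');
  % Skip the header lines and read every number; n/a and N/A become NaN.
  d=cell2mat(textscan(fid,'%f','headerlines', 6, 'treatasempty',{'n/a';'N/A'}));
  fid=fclose(fid);
  % d is a column vector in file order; fprintf reuses fmto for each group of nCols values.
  fprintf(fido,fmto,d');
end
fido=fclose(fido);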
Adjust the various parameters to suit.
doc textscan % and friends
for more detail on the various options for empty values, and
doc fprintf % etc.
for details of the format strings needed to match your desired output format. With a regular file format it is really pretty straightforward. The other respondent's use of save is somewhat less verbose at the cost of less control over the output format; the choice is yours, depending on wants and needs.
ERRATUM:
Forgot the \n character for the output format...
fmto=['%12.3f' repmat('%12.3f',1,nCols-1) '\n'];
Also, if you do want to keep the tab-delimited form, then the tab is needed in the format as well...
fmto=['%12.3f' repmat('\t%12.3f',1,nCols-1) '\n'];
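
Putting the pieces of this answer together, a hedged end-to-end sketch might look like the following. The names `dataFolder`, `outFile`, `newHeader`, `nHeader` and `nCols` are placeholders, and the values 4 and 4 are assumptions taken from the sample in the question, so adjust them to the real files. It relies on textscan's default whitespace delimiter, which means a field that is truly blank (two consecutive tabs) would not be detected; missing values are assumed to be written as n/a, N/A or NaN.

```matlab
% Sketch: concatenate the numeric part of many tab-delimited files into one file.
dataFolder = pwd;                                   % assumption: input files are in the current folder
outFile    = fullfile(tempdir, 'combined.txt');     % written elsewhere so '*.txt' does not pick it up
newHeader  = sprintf('MyHeader1\tMyHeader2\tMyHeader3\tMyHeader4');
nHeader    = 4;                                     % assumed number of header lines per file
nCols      = 4;                                     % assumed number of numeric columns

fmtIn  = repmat('%f', 1, nCols);                            % one %f per column
fmtOut = ['%.3f' repmat('\t%.3f', 1, nCols-1) '\n'];        % tab-delimited output row

fileList = dir(fullfile(dataFolder, '*.txt'));
fidOut = fopen(outFile, 'w');
assert(fidOut > 0, 'Could not open the output file.');
fprintf(fidOut, '%s\n', newHeader);                 % new header goes in first

for j = 1:numel(fileList)
    fidIn = fopen(fullfile(dataFolder, fileList(j).name), 'r');
    % Skip the header, read nCols numbers per row, map n/a and N/A to NaN.
    c = textscan(fidIn, fmtIn, 'HeaderLines', nHeader, ...
                 'TreatAsEmpty', {'n/a', 'N/A'}, 'CollectOutput', true);
    fclose(fidIn);
    % c{1} is an N-by-nCols matrix; transpose so fprintf writes it row by row.
    fprintf(fidOut, fmtOut, c{1}.');
end
fclose(fidOut);
```

Reading with one %f per column and 'CollectOutput' gives a matrix directly, so the transpose before fprintf is the only step needed to preserve row order.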

More Answers (3)

G A
G A on 12 Nov 2013
You can use this algorithm:
fid1=fopen('fileName1','w');%open output file to write headers
fprintf(fid1,formatSpec,H1,H2,Hn);%write headers into the file
fclose(fid1);%close the output file before appending to it with save
fid2=fopen('fileName2');%open file with the data
A = fscanf(fid2, '%f');%read from file numerical data only
fclose(fid2);
save('fileName1','A','-ascii','-tabs','-append');%append data to the output file (save takes a file name, not a file identifier)
  2 Comments
dpb
dpb on 12 Nov 2013
fid2=fopen('fileName2');%open file with the data
A = fscanf(fid2, '%f');%read from file numerical data only
The above will fail for these files with the header lines: fscanf stops at the first text it cannot convert to a number, so A comes back empty.
László Arany
László Arany on 12 Nov 2013
Also, I forgot to mention that the files are measurement results and each column contains data from a different sensor. When a sensor was damaged, unreachable or switched off, its column contains n/a, N/A or NaN (etc.). Therefore I probably cannot read the columns as '%f'; I was trying to read them as strings.
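
For what it is worth, a small sketch suggests the columns can still be read as '%f': textscan already understands NaN, and its 'TreatAsEmpty' option can map the other markers to NaN. The two sample rows below are made up for illustration, with four columns assumed.

```matlab
% Sketch: read rows containing n/a, N/A and NaN markers as numbers.
sample = sprintf('0.025\t6.8\tn/a\t9.5\n0.050\tN/A\t4.4\tNaN\n');   % made-up sample rows
c = textscan(sample, '%f%f%f%f', ...
             'TreatAsEmpty', {'n/a', 'N/A'}, 'CollectOutput', true);
data = c{1};            % 2-by-4 numeric matrix; the missing readings are NaN
disp(isnan(data))       % shows which entries were missing
```

Keeping the data numeric avoids a string-to-number conversion later, and functions such as isnan can then be used to locate the missing readings.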



László Arany
László Arany on 12 Nov 2013
In the meantime I managed to sort out the last part of the question. This is a simple way to get all the files in a folder and then open them in a loop:
fileList = dir(PathToFolder);
fileList = fileList(~[fileList.isdir]); %remove folders inside target folder
L = length(fileList);
for j=1:L
  fid = fopen(fullfile(PathToFolder,fileList(j).name),'r'); %build the full path so this works from any current folder
  % ... operations using the file
  fclose(fid);
end
  2 Comments
László Arany
László Arany on 16 Nov 2013
Hi dpb,
the number of lines to remove is the same for all files. Sorry, I did not know there are comments and answers separately, and I did not see your comment, that is why I wrote this. I added the new info to the question.
dpb
dpb on 16 Nov 2013
OK...I'll go back and delete previous and then you can clean up the unnecessary comments leaving only a clean response in database going forward for somebody else's later use, perhaps. At least that's the hope in Answers--how much different it is in reality than a conventional newsgroup in that regard I've my doubts...



Alex Z.
Alex Z. on 15 Jun 2017
This can be done in EasyMorph using the Append transformation. The tool is free.
