count number of rows in csv outside of matlab
I have 10000+ csv files I would like to import into matlab. I only need the data from the first and last rows for inlet and exit conditions. Each csv file has a different number of data points, so I do not know the length of the file imported a priori. I am trying to automate the import process. I can automate importing all the data or specific lines, but I do not know how to import the last row. The only way I can think of is to determine the number of rows in the file without importing the data (importing all the data takes a few hours) and import that row specifically. Does anyone know how I can do this? I have tried messing with textscan, but I have not had any luck.
5 Comments
Walter Roberson
on 26 Feb 2021
Which operating system?
Also, when you say "first" row, is there a need to skip a header row?
Alexandra McClernon Ownbey
on 26 Feb 2021
Walter Roberson
on 26 Feb 2021
Yes, I am asking whether you are using Windows, Linux, or Mac.
Walter Roberson
on 26 Feb 2021
Is there an upper limit on the number of characters per line in your csv files? For example are the lines more than 1 kilobyte each?
Alexandra McClernon Ownbey
on 26 Feb 2021
Accepted Answer
More Answers (2)
KSSV
on 26 Feb 2021
csvFiles = dir('*.csv') ;
N = length(csvFiles) ;
f = cell(N,1) ; % first row
l = cell(N,1) ; % last row
for i = 1:N
    data = csvread(csvFiles(i).name) ; % reads the whole file
    f{i} = data(1,:) ;
    l{i} = data(end,:) ;
end
1 Comment
Alexandra McClernon Ownbey
on 26 Feb 2021
Edited: Alexandra McClernon Ownbey
on 26 Feb 2021
Walter Roberson
on 26 Feb 2021
csvdir = 'appropriate_directory_name'; %use '.' for current directory
csvFiles = dir(fullfile(csvdir, '*.csv'));
filenames = fullfile({csvFiles.folder}, {csvFiles.name});
N = length(csvFiles);
f = cell(N,1); % first row
l = cell(N,1); % last row
for K = 1:N
    thisfile = filenames{K};
    [fid, msg] = fopen(thisfile, 'r');
    if fid < 0
        fprintf('failed to open file "%s" because "%s", ignoring it\n', thisfile, msg);
        continue
    end
    fgetl(fid); %skip header
    f{K} = cell2mat(textscan(fgetl(fid), '', 'Delimiter', ',')); %first line
    %data is 8 columns. We can be sure that columns are at most 25 characters each
    fseek(fid, -256, 'eof'); %move to near end of file (negative offset = before EOF)
    fgetl(fid); %we positioned to middle of line, discard to end of line
    %look for the last non-empty line
    old_line = '';
    while ~feof(fid)
        new_line = fgetl(fid);
        if ~ischar(new_line); break; end %EOF
        if ~isempty(strtrim(new_line))
            old_line = new_line;
        end
    end
    fclose(fid);
    l{K} = cell2mat(textscan(old_line, '', 'Delimiter', ','));
end
What this code is doing is opening each file, skipping a header line, reading the next line and converting it to numeric. Then it seeks to before the end of file and reads lines, discarding empty lines, including empty lines that occur at end of file, keeping the last non-empty line it finds, and converting the last non-empty line to numeric.
The code seeks to 256 characters before the end of file, skipping the rest of the file -- literally not reading it as much as is possible with the operating system. Why 256? Because it is a "nice round number" to computer scientists ;-) If the data was output as double precision, then it could take as many as 25 characters per entry such as '-6.32359246225409463e+110' plus the comma delimiter, maybe a space as well, so possibly 27*8+2 characters = 218 characters for the line. Using 256 gives a bit of slack in case we miscounted or there is something odd in the file.
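The same seek-near-the-end idea can also be written as a small helper that reads the tail of the file in one call and splits it into lines. This is only a sketch under the assumptions stated above (every line fits within the tail size, here 256 bytes); the function name lastNonEmptyLine and the parameter tailBytes are hypothetical and not part of MATLAB or of the code above. It would be saved as lastNonEmptyLine.m.

```matlab
function lastLine = lastNonEmptyLine(filename, tailBytes)
% Sketch: return the last non-empty line of a text file by reading only
% its final tailBytes bytes. Assumes no line is longer than tailBytes.
% (lastNonEmptyLine and tailBytes are hypothetical names.)
    if nargin < 2
        tailBytes = 256;   % 8 columns * 27 characters + 2 = 218, rounded up
    end
    [fid, msg] = fopen(filename, 'r');
    if fid < 0
        error('Could not open "%s": %s', filename, msg);
    end
    cleaner = onCleanup(@() fclose(fid));   % close the file even on error

    % Find the file size so we never seek before the start of the file
    fseek(fid, 0, 'eof');
    fileSize = ftell(fid);
    fseek(fid, -min(tailBytes, fileSize), 'eof');

    % Read the tail as a 1-by-N char vector
    tail = fread(fid, [1 Inf], '*char');

    % Split on LF or CRLF, trim whitespace, and drop blank lines
    lines = strtrim(regexp(tail, '\r?\n', 'split'));
    lines = lines(~cellfun(@isempty, lines));
    if isempty(lines)
        lastLine = '';
    else
        lastLine = lines{end};
    end
end
```

With that file on the path, something like vals = str2double(strsplit(lastNonEmptyLine('run001.csv'), ',')) would convert the last row to numbers ('run001.csv' is a placeholder file name). If the lines can be longer than 218 characters, tailBytes needs to be raised accordingly, otherwise the returned line may be cut off at the front.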