Find last row of csv contents using fopen, fgetl?? Or, another way?

Hello, I have a large .csv (~350,000 rows) that contains a lot of information I don't need. The purpose of my script is to extract the columns I need and to tidy them up or convert them to a suitable format. Since the file is large, I read it in row by row. I run this script in a loop over multiple .csv files.
The problem I am having is how to terminate the loop once the last row of the .csv has been read. At the moment I am using two if/break/end checks: one tests whether the line just read is a char, the other whether its first field is empty. Using isempty() by itself was not enough: on one run the last read (tline) returned a non-useful value (-1) for some reason, so the while loop kept going even though there was no data left to read.
At the moment the code below works, but every 1-2 .csv files I get this warning:
Warning: 'Inputs must be character vectors, cell arrays of character vectors, or string arrays'
Is it possible to improve this code and make it more robust? Perhaps there is a clearer way to identify the last row of data? Each .csv is a different size and I have hundreds.
%% AIS_conv %%
% Code to coerce AIS to numeric array %
% MATLAB datenum | lat (dec deg) | lon (dec deg) | MMSI%
dd = 'input_data';
nowd = cd; %current folder
cd(dd); %go to input folder
d = dir('*.csv');
cd(nowd) %GO BACK TO current folder
for j = 1:length(d) %for every .csv file in folder
    tic %(start timer)
    filename = d(j).name; %get filename
    disp(filename); %display filename
    fid = fopen(fullfile(dd,filename)); %open file (saves time to read file
    %line by line, as opposed to csvread)
    tline = fgetl(fid); %read first line (column headers)
    i=1; %relevant column of new vectors we are adding data to
    T = []; latdecdeg = []; londecdeg = []; mmsi = []; %reset outputs so rows from the previous file do not carry over
    %% for each csv file convert mmsi,lat,lon and datenum
    try
        while(1) %while in particular .csv file
            tline = fgetl(fid); %read line
            tline = strrep(tline, '"', ' '); %remove "
            if(ischar(tline) == false) %terminate loop once last row of .csv is reached
                break; %(two checks for this as one wasn't enough!)
            end
            c = strsplit(tline,','); %separate columns by ,
            if( isempty(c{1}) == true ) %terminate loop once last row of .csv is reached
                break;
            end
            mmsi(i) = str2double(c{1}); %get mmsi
            lon = str2double(c{3}); %convert lon to decimal degrees
            londeg = floor(lon./100);
            lonmin = (100*(lon./100 - floor(lon./100)));
            londecdeg(i) = londeg + lonmin/60;
            lat = str2double(c{2}); %convert lat to decimal degrees
            lat = -lat; %make positive so that floor works on the degrees
            latdeg = floor(lat./100);
            latmin = 100*(lat./100 - floor(lat./100));
            latdecdegnonvec = latdeg + latmin/60;
            latdecdeg(i) = -latdecdegnonvec; %restore the sign
            try
                T(i) = datenum(c{11},'yyyy-mm-dd HH:MM:SS'); %get datetime
            catch
                T(i) = NaN; %datetime could not be parsed
            end
            i=i+1; %move on to next row of output
        end
    catch
        disp('while loop error');
    end
    fclose(fid); %close file before moving on to the next one
    %% Create Output
    out = [T' latdecdeg' londecdeg' mmsi']; %concatenate output
    %Write output to file
    month=strsplit(nowd,'\'); %get month to name specific file
    month = month{end}; %last folder name in the path
    outFileName = char(strcat('DATENUM_LAT_LON_MMSI_',month,'_',num2str(j),'.csv'));
    try
        %(the call that writes out to outFileName is not shown in the post)
    catch
        disp(['ERROR!!!!!! ' filename])
    end
    toc %end timer
end
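
A side note on the degrees/minutes conversion in the code above: it appears to treat lat and lon as ddmm.mmmm values (degrees times 100 plus decimal minutes), and it negates lat before calling floor, presumably because floor rounds negative numbers away from zero. A minimal sketch of a sign-safe, vectorized alternative using fix and rem follows. The helper name ddmm2deg and the sample values are made up for illustration, and the ddmm.mmmm input format is an assumption read off the code.

```matlab
% Hypothetical helper: convert ddmm.mmmm (degrees*100 + minutes) to decimal degrees.
% fix truncates toward zero and rem keeps the sign of x, so negative
% coordinates need no negate/floor/negate round trip.
ddmm2deg = @(x) fix(x ./ 100) + rem(x, 100) ./ 60;

lat = [-3330.5; -3400.0];    % made-up samples: 33 deg 30.5 min S, 34 deg S
lon = [15100.25; 15200.0];   % made-up samples: 151 deg 0.25 min E, 152 deg E

latdecdeg = ddmm2deg(lat);   % works on whole columns at once
londecdeg = ddmm2deg(lon);
```

Because the helper is vectorized, it could be applied once to a whole column after reading, rather than element by element inside the while loop.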

Accepted Answer

Stephen23 on 26 Aug 2020
Use feof like this:
while ~feof(fid)
    tline = fgetl(fid);
    % process tline here
end
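
A fuller sketch of the same idea, for illustration only. The warning in the question looks like it comes from strrep being handed the numeric -1 that fgetl returns at end of file, so this version lets feof end the loop and also checks ischar before touching the line. The temporary file, its three columns, and the variable names rows and tmpFile are made up so that the example is self-contained; they are not taken from the original data.

```matlab
% Write a small throwaway CSV so the sketch runs on its own
% (file name and contents are invented for illustration).
tmpFile = [tempname '.csv'];
fid = fopen(tmpFile, 'w');
fprintf(fid, 'mmsi,lat,lon\n');
fprintf(fid, '"123456789",-3330.5,15100.25\n');
fprintf(fid, '"987654321",-3400.0,15200.00\n');
fclose(fid);

fid = fopen(tmpFile, 'r');
assert(fid ~= -1, 'Could not open %s', tmpFile);

header = fgetl(fid);                 % first line holds the column headers
rows = {};                           % one cell of fields per data row
while ~feof(fid)                     % stop once the end of file is reached
    tline = fgetl(fid);
    if ~ischar(tline) || isempty(strtrim(tline))
        continue                     % skip a -1 read or a blank trailing line
    end
    tline = strrep(tline, '"', '');  % safe: tline is known to be char here
    rows{end+1} = strsplit(tline, ','); %#ok<AGROW>
end

fclose(fid);                         % always release the file handle
delete(tmpFile);                     % remove the throwaway file
```

With feof as the loop condition, neither of the two break checks from the question is needed, and the ischar guard only acts as a safety net for empty or oddly terminated files.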
