How can I read CSV file data correctly? I have tried multiple ways

Hi,
I have a CSV file that I am trying to import into MATLAB. Here is an example of the file content (one row):
689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62
I tried using
readtable()
but the data are not split into columns at the commas.
Then, I tried
csvread()
and I get the error:
Error using dlmread (line 147)
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 1, field number 1) ==>
Also, I tried
textscan()
and it did not do anything (no error, but no content extracted either).
Lastly, I tried to import the data manually using the interface, but the same problem as with readtable() occurred.
How can I read the data correctly and put it in a matrix or table?
Thank you

Accepted Answer

Stephen23 on 2 Nov 2020
Edited: Stephen23 on 2 Nov 2020
textscan has no problems importing the file data simply and efficiently (sample file is attached):
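opt = {'Delimiter',',','CollectOutput',true}; % comma-separated; merge adjacent columns of the same class
fmt = '%s%s%q%q%q%q%f%f%f%f'; % 2 text fields, 4 double-quoted text fields, 4 numeric fields
[fid,msg] = fopen('temp0.txt','rt'); % open for reading in text mode
assert(fid>=3,msg) % fopen returns -1 on failure; 0, 1 and 2 are reserved
out = textscan(fid,fmt,opt{:});
fclose(fid);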
Giving:
>> out{1} % character data
ans =
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2'
[1x27 char] 'true' 'timestamp' ' 3' ' 9' ' 1'
[1x27 char] 'true' 'timestamp' ' 4' ' 10' ' 0'
[1x27 char] 'true' 'timestamp' ' 5' ' 11' ' -1'
>> out{2} % numeric data
ans =
25.2100 17.5366 39.0000 62.0000
25.2300 17.5366 40.0000 63.0000
25.2400 17.5366 41.0000 64.0000
25.2500 17.5366 42.0000 65.0000
>>
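To get from the two cells above to a single numeric matrix or a table, as the question asks, one possible sketch follows. It parses two sample rows from a character vector instead of a file so that it runs on its own. The column names (id, flag, stamp, a, b, c, v1 to v4) are placeholders invented for illustration, not names taken from the original file.

```matlab
% Two sample rows in the same layout as the file
rows = {'689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62', ...
        '679d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 3"," 9"," 1",25.23,17.536593101926,40,63'};
txt = sprintf('%s\n',rows{:}); % one character vector, one row per line

% Same format and options as in the answer, applied to text instead of a file ID
fmt = '%s%s%q%q%q%q%f%f%f%f';
out = textscan(txt,fmt,'Delimiter',',','CollectOutput',true);
chr = out{1}; % N-by-6 cell array of character vectors
num = out{2}; % N-by-4 double matrix

% The quoted columns hold numbers stored as text, so convert them
quoted = str2double(chr(:,4:6));

% Numeric matrix: 3 converted columns followed by the 4 numeric columns
M = [quoted, num];

% Table: text columns plus numeric columns (placeholder variable names)
T = [cell2table(chr(:,1:3),'VariableNames',{'id','flag','stamp'}), ...
     array2table(M,'VariableNames',{'a','b','c','v1','v2','v3','v4'})];
```

With a real file, the textscan call would take the file ID from fopen, exactly as in the answer, and the rest would stay the same.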
Because readtable also supports a format specifier, I see no reason why it should not work as well. I might try later.
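A minimal sketch of that idea, which has not been run against the original file: it writes two sample rows to a temporary file (a stand-in for the real CSV file) and passes the same format to readtable through its 'Format' option. The second sample row is made up for illustration.

```matlab
% Write two sample rows to a temporary file (stand-in for the real CSV file)
fname = [tempname '.txt'];
[fid,msg] = fopen(fname,'wt');
assert(fid>=3,msg)
fprintf(fid,'%s\n','689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62');
fprintf(fid,'%s\n','679d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 3"," 9"," 1",25.23,17.536593101926,40,63');
fclose(fid);

% Same format as used with textscan; the file has no header line
T = readtable(fname,'FileType','text','Delimiter',',', ...
    'Format','%s%s%q%q%q%q%f%f%f%f','ReadVariableNames',false);
delete(fname) % remove the temporary file

% The last four columns as a plain numeric matrix
numeric_part = T{:,7:10};
```

The result is a table with ten variables; the quoted columns 4 to 6 still hold text and would need str2double if numbers are wanted.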
2 Comments
Nora Khaled on 2 Nov 2020
Thank you for your help !
This code works with my problem very well.
But I would like to ask, why check assert(fid>=3,msg)? I thought fid contains the data from the csv file
Stephen23 on 3 Nov 2020
Edited: Stephen23 on 3 Nov 2020
"But I would like to ask, why check assert(fid>=3,msg)?"
To print an informative error message if the file could not be opened.
"I thought fid contains the data from the csv file"
No, it does not.
The command fopen opens a file and returns a kind of handle to the open file. That handle is known as a "file identifier" (this is explained in the fopen documentation). Any functions and operators that need to operate on that file (e.g. reading data, writing data, moving the current position in the file, etc.) are then given that file ID so that they can perform their operations on the open file.
In this case textscan takes the file ID of an open file and imports the file data using the options that we defined.
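A small sketch of what the file identifier is and why the check uses fid>=3: fopen returns -1 when it fails, and the identifiers 0, 1 and 2 are reserved for standard input, output and error. The file names here are made up for illustration.

```matlab
% A file that does not exist: fopen does not error, it returns -1 and a message
[fid_bad,msg] = fopen('this_file_does_not_exist_12345.txt','rt');
disp(fid_bad) % the identifier, not file content
disp(msg)     % system-dependent text describing the failure

% A file that can be opened: create a temporary one with a single line
fname = tempname;
[fid,msg2] = fopen(fname,'wt');
assert(fid>=3,msg2)
fprintf(fid,'hello\n');
fclose(fid);

% Reopen for reading: fid is just a number that identifies the open file
[fid,msg2] = fopen(fname,'rt');
assert(fid>=3,msg2)
first_line = fgetl(fid); % the data only arrives by reading through the identifier
fclose(fid);
delete(fname) % remove the temporary file
```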


More Answers (1)

Mathieu NOE on 2 Nov 2020
hello
It seems MATLAB has an issue with the format of your data (especially with the " characters).
I could not make it work with readtable, whatever options I used.
I ended up writing a small workaround function using basic operations.
It seems to work, at least in my MATLAB.
Input data: 4 slightly different lines, saved as a CSV file:
689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62
679d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,63
669d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,64
659d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,65
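The function code is as follows:
function output_matrix = retrieve_csv(Filename)
% Read a comma-separated file into a cell array of character vectors.
% Note: every comma is treated as a separator, including commas inside quotes.
[fid,msg] = fopen(Filename,'rt');
assert(fid>=3,msg) % stop with an informative message if the file cannot be opened
output_matrix = {}; % returned as-is when the file is empty
tline = fgetl(fid);
k = 0;
while ischar(tline) % fgetl returns -1 (not char) at the end of the file
    k = k+1; % line index
    sep = strfind(tline,','); % separator positions (findstr is no longer recommended)
    ind = [0;sep(:);length(tline)+1]; % field boundaries, including line start and end
    for ci = 1:length(ind)-1
        tline_extract = tline(ind(ci)+1:ind(ci+1)-1);
        % remove undesired characters (")
        ind_rem = strfind(tline_extract,'"');
        tline_extract(ind_rem) = [];
        output_matrix{k,ci} = tline_extract;
    end
    tline = fgetl(fid);
end
fclose(fid);
end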
Output:
output_matrix =
Columns 1 through 7
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
[1x27 char] 'true' 'timestamp' ' 2' ' 8' ' 2' '25.21'
Columns 8 through 10
'17.536593101926' '39' '62'
'17.536593101926' '39' '63'
'17.536593101926' '39' '64'
'17.536593101926' '39' '65'
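The per-line splitting in the function above could also be sketched more compactly with strrep and strsplit. This is an alternative illustration, not the code from the answer, and it shares the same limitation: a comma inside a quoted field would still be treated as a separator.

```matlab
% One sample line in the same layout as the file
tline = '689d-40bf-9c61-551c0c1a69bf,true,"timestamp"," 2"," 8"," 2",25.21,17.536593101926,39,62';

% Drop the double quotes, then split at every comma.
% CollapseDelimiters=false keeps empty fields instead of merging adjacent commas.
fields = strsplit(strrep(tline,'"',''),',','CollapseDelimiters',false);

% Everything is still text; convert the fields that should be numeric
values = str2double(fields(7:10));
```

Inside the loop of the function, fields would replace the inner for loop, for example as output_matrix(k,1:numel(fields)) = fields.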
2 Comments
Nora Khaled on 2 Nov 2020
Thank you very much!
I was wondering how I was going to read the file if no reading function worked, so this is really helpful.
You can also see Stephen's answer; he used the function textscan() with a format specifier.
Mathieu NOE on 3 Nov 2020
Yes, I learned from Stephen's answer myself! Good for me too!

