readtable cannot handle double quotation marks very well

I have CSV files saved with LibreOffice with text flanked by double quotation marks (Format quoted field as text).
When I tried to read one such CSV file, which has two rows, with readtable,
T0 = readtable('file1.csv',...
'Encoding','UTF-8','delimiter',',','ReadVariableNames',true);
readtable failed to read the first row.
Then I used this command, which could read both rows:
opts1 = delimitedTextImportOptions('Encoding','UTF-8','Delimiter',',','DataLines',[2 Inf],'VariableNamesLine',1);
T1 = readtable('file1.csv',opts1);
However, the content of the table wasn't great:
ans = 2×1 cell
'"optotagging"'
'"behaviour"'
The double quotation marks remained in some columns.
The setvaropts option 'QuoteRule','remove' looked promising, but I could not get it to work:
setvaropts(opts1,'QuoteRule','remove')
How do I nicely remove double quotation marks in CSVs?

Answers (1)

Kouichi C. Nakamura on 6 Jan 2021
Edited: Kouichi C. Nakamura on 7 Jan 2021
I asked MathWorks about this, and their answer was helpful:
% This almost works for this case, but it detects the first line as a
% "meta-data" line because it is all string/blank
opts = detectImportOptions('file1.csv','NumHeaderLines',0,'Delimiter',',');
% Setting DataLines explicitly works around that issue
opts.DataLines = [2,inf];
T2 = readtable('file1.csv',opts);
With this code, I can read both rows and remove double quotation marks nicely.
According to MathWorks:
> The solution shared, is very specific to your workflow and is an undocumented method which might change without notice.
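A note on the setvaropts attempt in the question: setvaropts returns a modified copy of the import options object instead of changing it in place, so calling it without assigning the output leaves opts1 unchanged. The following is a minimal sketch that reuses the file name and options from the question. It has not been tested against the original file, and it is an assumption that 'QuoteRule' applies to every column here: variables that readtable adds for extra columns may not be covered by the variable options.

```matlab
% Sketch only: same file and options as in the question.
opts1 = delimitedTextImportOptions('Encoding','UTF-8','Delimiter',',', ...
    'DataLines',[2 Inf],'VariableNamesLine',1);

% Assign the output: setvaropts returns the updated options object.
opts1 = setvaropts(opts1,'QuoteRule','remove');

T1 = readtable('file1.csv',opts1);
```

If quotation marks still remain in some columns after this, the detectImportOptions approach above is the one reported to work for this file.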
