Import of tables from R where the first line describing the column names is one element shorter

Many files exported from R look something like this:
"var1" \t "var2"
"row1" \t val1 \t val2
"row2" \t val2 \t val2
The problem is that the line describing the variables is one element shorter, which readtable doesn't like much. Is there any way I can make that work? Editing the input file by changing first row to
\t "var1" \t "var2"
fixes the problem.
I'm trying to read it with the line
f = readtable(filename, 'ReadVariableNames',true, 'ReadRowNames', true, 'Delimiter', '\t');
This should be a standard thing, but I just cannot make it work. I don't want to have to edit the input files every time.

Accepted Answer

Guillaume on 15 Sep 2019
Edited: Guillaume on 15 Sep 2019
Yes, readtable expects the variable name line to have a placeholder (DimensionName) for the row name column. I suggest you raise an enhancement request with MathWorks.
Here is a roundabout way to get it to work:
%First grab the variable names. MATLAB should add an extra variable name at the end of the list to match the number of data columns.
%Ignore row names for now.
opts = detectImportOptions(yourfile, 'ReadVariableNames', true, 'VariableNamesLine', 1)
varnames = opts.VariableNames;
%Then tell MATLAB that there are row names. That messes up the variable names, so take them from the previous opts.
opts = detectImportOptions(yourfile, 'ReadVariableNames', true, 'VariableNamesLine', 1, 'ReadRowNames', true)
opts.VariableNames = ['RowNames', varnames(1:end-1)]; %Still need a name for the row names column.
opts.DataLines = [2, Inf]; %That's also messed up.
result = readtable(yourfile, opts)
It works on the file I tested, but because of the complex heuristics of detectImportOptions it may break on more complex files.
Tested on 2019b. Not sure how it behaves with 2016b where detectImportOptions may not be as sophisticated.

More Answers (1)

Johan Gustafsson on 15 Sep 2019
Thanks, but this does not work for me. I suspect that the ReadVariableNames parameter of detectImportOptions only comes with 2019b. Is that so? I tried upgrading to 2018b, but it didn't help. I get the following error:
Error using detectImportOptions
'ReadVariableNames' is not a recognized parameter. For a list of valid name-value pair arguments, see the
documentation for detectImportOptions.
Is there another trick I could use? I was thinking I could do something with fgetl and a regexp, but that seems like a messy way to do it.
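For reference, the fgetl and regexp idea could look roughly like the sketch below. It does not use detectImportOptions at all: it reads the header line by hand, lets readtable skip that line, and then assigns the names afterwards. The function name readRTable is made up for illustration, and the sketch assumes a tab-delimited file whose header lists only the data columns, as in the question. It has not been checked against every release.

```matlab
function t = readRTable(filename)
% Sketch only: read a tab-delimited table exported from R whose header
% line is one field shorter than the data lines (no name for the row
% name column). readRTable is a hypothetical helper name.

% Read the header line by hand.
fid = fopen(filename, 'r');
assert(fid ~= -1, 'Could not open file %s', filename);
header = fgetl(fid);
fclose(fid);

% Split on tabs, then strip surrounding whitespace and double quotes.
names = regexp(header, '\t', 'split');
names = regexprep(strtrim(names), '^"|"$', '');

% Skip the header line and treat the first column as row names.
t = readtable(filename, 'FileType', 'text', 'Delimiter', '\t', ...
    'HeaderLines', 1, 'ReadVariableNames', false, 'ReadRowNames', true);

% Assign the names from the header, made valid as MATLAB identifiers.
% This assumes the header has exactly one name per data column.
t.Properties.VariableNames = matlab.lang.makeValidName(names);
end
```

Calling t = readRTable(filename) should then give a table with the R row names as t.Properties.RowNames. If the header contains duplicate names, matlab.lang.makeUniqueStrings could be applied to names before the assignment.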

Release

R2016b
