Get every nth row of a tall array

Question

I have a tall array and would like to collect every 26th row of one variable into an array. I tried:

U = tall(udata);
hhws = [];
udata.ReadSize = 26*500; % data is in 26 row chunks, so sizing so below works
while hasdata(udata)
    U = read(udata);
    hhws = [hhws;U.Var13(14:26:end)]; % want every 26th row starting with the 14th row
end

This produced the error:

Error using matlab.io.datastore.TabularTextDatastore/readData (line 78)
Unable to parse a "Numeric" field when reading row 10765, field 1.
Actual Text: "******** 7.909"
Expected: A number or literal "NaN", "Inf". (possibly signed, case insensitive)

Error in matlab.io.datastore.TabularDatastore/read (line 174)
[t, info] = ds.readData();

Caused by:
Reading the variable name 'Var1' using format '%f' from file: '<file path and file name>' starting at offset 1011702139.

Seems like maybe there's a problem with how I'm reading the file in? Is the method above viable assuming I get through this error? Thanks!


Answer 1

dpb on 18 Aug 2022

Edited: dpb on 20 Aug 2022

Actual Text: "******** 7.909"

The problem is in the data file itself: a numeric field contains the overflow indicator "********", and the formatted read fails because that text cannot be converted to a numeric value.

You would need to add

'TreatAsMissing',{'********',''}

to the datastore when you create it.
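
As a minimal sketch of where the option goes, the snippet below writes a tiny space-delimited file containing one overflowed field and reads it back through a datastore. The temporary file, its contents, and the 'Delimiter', 'ReadVariableNames', and 'TextscanFormats' settings are assumptions made for the illustration, not details of the original data. It passes 'TreatAsMissing' as a plain character vector, which is the form reported to work in the comments on this answer.

```matlab
% Write a small space-delimited sample file (hypothetical data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1.5 2.5\n');
fprintf(fid, '******** 7.909\n');   % line with an overflowed numeric field
fprintf(fid, '3.5 4.5\n');
fclose(fid);

% Create the datastore and declare the overflow marker as missing data.
% The formats are given explicitly so the example does not rely on
% automatic type detection.
ds = tabularTextDatastore(fname, ...
    'Delimiter', ' ', ...
    'ReadVariableNames', false, ...
    'TextscanFormats', {'%f', '%f'}, ...
    'TreatAsMissing', '********');

T = readall(ds);               % overflowed entries should be read as NaN
overflowed = isnan(T.Var1);    % logical mask of the affected rows

delete(fname);                 % remove the temporary file
```

The `overflowed` mask is one way to check afterwards how many rows were affected, since a NaN from the overflow marker is otherwise indistinguishable from any other missing value.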

I haven't really used the datastore much. I didn't see it there, but with detectImportOptions and the resulting text import options object there is also an 'ImportErrorRule' parameter that can substitute a 'FillValue'. In that case it could be set to return Inf instead of NaN, which would identify the overflow instances specifically and leave the truly missing values as empty. The lack of an equivalent seems an oversight, unless I just missed it in the doc, but I didn't find it; the options available for the datastore don't seem to be as extensive.
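
A rough sketch of that import-options route, outside the datastore, might look like the following. The temporary file and its contents are hypothetical, and forcing Var1 to double and setting DataLines explicitly are assumptions added so the example does not depend on what detectImportOptions infers. One caveat: as far as I know, the same 'FillValue' is also applied under the default MissingRule of 'fill', so genuinely empty fields in that variable would receive Inf as well unless the missing rule is changed.

```matlab
% Write a small space-delimited sample file (hypothetical data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1.5 2.5\n******** 7.909\n3.5 4.5\n');
fclose(fid);

% Detect import options, then override what matters for this example.
opts = detectImportOptions(fname, 'Delimiter', ' ', ...
    'ReadVariableNames', false);
opts.DataLines = [1 Inf];                        % treat every line as data
opts = setvartype(opts, 'Var1', 'double');       % force a numeric column

% On a conversion error, fill the entry instead of raising an error,
% and use Inf as the fill value so overflowed entries stand out.
opts.ImportErrorRule = 'fill';
opts = setvaropts(opts, 'Var1', 'FillValue', Inf);

T = readtable(fname, opts);
overflowed = isinf(T.Var1);   % rows where the overflow marker was found

delete(fname);                % remove the temporary file
```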

Comments

Dan Houck on 22 Aug 2022

Attachment: Sample.txt

Actually, I suspect it's more complicated...

If I add only the recommended 'TreatAsMissing' to the tabularTextDatastore call, I get this error: Cannot interpret data in the file '<file>'. Found 2 variable names but 25 data columns. You may need to specify a different format, delimiter, or number of header lines.

This suggests that 'TreatAsMissing' changes how some of the header lines are read, so I think I need a different number of header lines. I've tried a bunch of different values. Most of the others produce this error at the read call: The value for "TreatAsEmpty" must be non-empty character vectors or cell arrays of character vectors.

So what does this mean?

I attached a file that's similar to the one I'm working with. The main difference in format is that mine has 25 columns of data in 25-row "chunks", while the sample has 15 of each. The header lines should be the same. Each chunk of data starts with a line containing only two values, and I want the one in the second column from each chunk. My latest code is below:

udata = tabularTextDatastore('Path\Sample.txt','FileExtensions','.txt','NumHeaderLines',13,'TreatAsMissing',{'********',''});
hhws = [];
time = [];
count = [];
udata.ReadSize = 20000;
while hasdata(udata)
    Ut = read(udata);
    ts = isnan(Ut.Var3); % the blank entries are read in as NaN, so I'm using those to find this line in each chunk
    time = [time;Ut.Var1(ts)];
    hhws = [hhws;Ut.Var2(ts)];
    count(end+1) = length(Ut.Var1);
end
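
The selection step inside the loop can be tried on its own with a small in-memory table. This is only a sketch: the table values are made up, and the layout (a two-value line opening each block, with the third column blank) is assumed from the description above.

```matlab
% Hypothetical chunk: rows 1 and 4 are the two-value lines that open each
% block, so their third column is blank and has been read as NaN.
Ut = table([10; 1; 2; 20; 3; 4], ...
           [0.5; 1.1; 1.2; 0.7; 1.3; 1.4], ...
           [NaN; 5; 6; NaN; 7; 8], ...
           'VariableNames', {'Var1', 'Var2', 'Var3'});

ts = isnan(Ut.Var3);      % true on the block-opening lines
time = Ut.Var1(ts);       % first value on each block-opening line
hhws = Ut.Var2(ts);       % second value on each block-opening line
```

Because the mask is computed from the data rather than from row positions, it does not depend on ReadSize being a multiple of the block length. It would, however, also pick up any ordinary data row whose third column is an overflow marker read as missing, so it may be worth checking the number of selected rows against the expected number of blocks.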

Thanks for your continued help!

Dan Houck on 22 Aug 2022

Got it to work! Just had to change 'TreatAsMissing',{'********',''} to just 'TreatAsMissing','********', though I don't understand why that made the difference.

dpb on 22 Aug 2022

That does seem peculiar. The empty record is the default, and the option is supposed to accept either form.

That might be worth a support request to TMW to ask whether that is the expected result.
