Index table with unrecognized row names

4 visualizzazioni (ultimi 30 giorni)
Adria Rofes
Adria Rofes il 26 Ago 2017
Commentato: Steven Lord il 5 Set 2017
I have a table "A" with name values
A =
Var1 Var2 Var3
____ ______ _____
a 1 217 2.89
aardvark 1 -1 9.89
abacus 1 -1 8.69
abalone 1 -1 12.23
abandon 1 395 8.32
I can index rows by name
A = readtable('A.txt','ReadRowNames', true)
B = A({'a', 'aardvark', 'abacus'},:)
B =
Var1 Var2 Var3
____ ____ ____
a 1 217 2.89
aardvark 1 -1 9.89
abacus 1 -1 8.69
However, how can I make it so that when I search for names that are not in the matrix I get a table with empty values?
B = A({'a', 'aardvark', 'abacus'. 'invented', 'inventedtoo'},:)
B =
Var1 Var2 Var3
____ ____ ____
a 1 217 2.89
aardvark 1 -1 9.89
abacus 1 -1 8.69
invented
inventedtoo
Many thanks!
  1 Commento
Stephen23
Stephen23 il 5 Set 2017
adria rofes's "Answer" moved here:
I went around with a code that deletes the values that are not in the matrix.
Words2 = setdiff(A.Properties.RowNames, B); %Words in A not in B
Words3 = setdiff(B, A.Properties.RowNames); %Words in B not in A
Words4 = setdiff(A.Properties.RowNames, Words2); %Words in A (that I can index)
Words5 = setdiff(Words3, A.Properties.RowNames); %Words not in A (I cannot index)
Result = Database(Words4,:); %Table with words I can index
...However, I am not 100% happy. Because this deletes instances where I may have input two times the same word and also all instances where words are not in the database, where I wanted to have empty values.
This is what I get...
Var1 Var2 Var3
____ ____ ____
a 1 217 2.89
And this is what I would have wanted to get...
Var1 Var2 Var3
____ ____ ____
a 1 217 2.89
aardvark 1 -1 9.89
abacus 1 -1 8.69
invented NaN NaN NaN
inventedtoo NaN NaN NaN
a 1 217 2.89

Accedi per commentare.

Risposte (3)

Kristen Amaddio
Kristen Amaddio il 29 Ago 2017
You might want to create your own function in order to achieve this behavior.
Here are some basic first steps to consider:
1. You can use the 'lasterr' function to figure out the error message generated when you try to index a table with unrecognized rownames:
>> B = A({'a', 'aardvark', 'abacus', 'invented'},:)
Unrecognized row name 'invented'.
>> [msgstr, msgid] = lasterr
msgstr =
'Unrecognized row name 'invented'.'
msgid =
'MATLAB:table:UnrecognizedRowName'
This tells you that the ID of the particular error message is 'MATLAB:table:UnrecognizedRowName'.
2. You can then use 'try' and 'catch' statements within this function in order to execute the desired code when the specified error is generated:
try
B = A({'a', 'aardvark', 'abacus', 'invented'},:)
catch ME
% If the error thrown is the unrecognized row name
if (strcmp(ME.identifier, 'MATLAB:table:UnrecognizedRowName'))
% DO SOMETHING HERE
% TO ACHIEVE DESIRED BEHAVIOR
end
end
3. Now, within that 'if' statement, you would add some lines of code that would add these empty rows to the table. One simple option would be concatenating the existing table with a row of NaNs. You can specify the 'VariableNames' and 'RowNames' properties to reflect the desired values:
B = [A; array2table(nan(1,3),'VariableNames', A.Properties.VariableNames,'RowNames',{'invented'})]
Now you should see the following output:
B =
4×3 table
Var1 Var2 Var3
____ ____ ____
a 1 217 2.89
aardvark 1 -1 9.89
abacus 1 -1 8.69
invented NaN NaN NaN
Now that you have a starting point for this function, you can work on generalizing it to fit your use case.
  1 Commento
Adria Rofes
Adria Rofes il 29 Ago 2017
Thanks a million, Kristen.
With this code:
try
B = A({'a', 'aardvark', 'abacus', 'invented'},:)
catch ME
% If the error thrown is the unrecognized row name
if (strcmp(ME.identifier, 'MATLAB:table:UnrecognizedRowName'))
B = [A; array2table(nan(1,3),'VariableNames', A.Properties.VariableNames,'RowNames',{'invented'})]
end
end
I get all rows in A, including the ones I do not want to index (marked with an asterisk)
Var1 Var2 Var3
____ ____ _____
a 1 217 2.89
aardvark 1 -1 9.89
abacus 1 -1 8.69
*abalone 1 -1 12.23
*abandon 1 395 8.32
invented NaN NaN NaN
Also, I found that even though I input two or more words that are not in A. The error I get is only for the first word MATLAB cannot find in A. Can I use a for loop for this? Can you give me some hints as to how to use it, please?

Accedi per commentare.


Steven Lord
Steven Lord il 29 Ago 2017
If all the types of variables in your table support missing (or NaN, if you have just numeric data and you're using a release that predates the introduction of missing in release R2017a) you can do this in a couple lines of code.
% Create a sample table with numeric, string, and datetime data
T = table([1; 2; 3], ["apple"; "banana"; "cherry"], datetime('today')+days([-1; 0; 1]), ...
'RowNames', {'a'; 'b'; 'c'})
% I want a new table with some rows whose names are in T and some that are not
desiredNames = {'b'; 'd'};
% Which ones are in T?
testNames = ismember(desiredNames, T.Properties.RowNames);
% Start by copying the rows that are in T to a new table
T2 = T(desiredNames(testNames), :);
% Fill the rows of the new table that correspond to names NOT in T with missing
T2{desiredNames(~testNames), :} = missing;
Note that I used parentheses on the next-to-last line and curly braces on the last line; they do behave differently.
If T only contained the numeric and string data variables Var1 and Var2, you could replace missing in that last line with NaN.
If you wanted to turn the last three lines into a function as Kristen suggested you could. T and desiredNames would be the inputs to that function and T2 the output. Add error checking as desired.
  2 Commenti
Adria Rofes
Adria Rofes il 5 Set 2017
Modificato: Adria Rofes il 5 Set 2017
Thanks a lot, Steven. I made the code work to search from a database 'Megadatabase.txt' (attached)
However,
the final table 'T2' shows me the results of the search by alphabetical order: first for the items in 'T' and then for the items 'not in T'.
I wonder if there is a way to order the items in T2, as I input them in 'desiredNames'
This is the code I used:
%Input the database and create a table
T = readtable ('MegaDatabase.txt', 'ReadRowNames', true);
% I want a new table with some rows whose names are in T and some that are not
desiredNames = {'b'; 'a'; 'b'; 'd'; 'abandon'; 'work'};
% Which ones are in T?
testNames = ismember(desiredNames, T.Properties.RowNames);
% Start by copying the rows that are in T to a new table
T2 = T(desiredNames(testNames), :);
% Fill the rows of the new table that correspond to names NOT in T with NaN
T2{desiredNames(~testNames), :} = NaN;
Steven Lord
Steven Lord il 5 Set 2017
Indexing.
T2 = T2(desiredNames, :)
Note that you have some row names that are duplicates, and table indexing will automatically modify the row names to make them unique. In this case I believe the third row of T2 will have name b_1 rather than b, since the first row of T2 already used the name b.

Accedi per commentare.


Peter Perkins
Peter Perkins il 31 Ago 2017
Row names in tables don't work like that in subscripting, but getting what you want is much easier than you think. Given this table ...
>> t = table([1;2;3],[4;5;6],'RowNames',{'A' 'B' 'C'})
t =
3×2 table
Var1 Var2
____ ____
A 1 4
B 2 5
C 3 6
... these two things are equivalent:
>> t('A',:)
ans =
1×2 table
Var1 Var2
____ ____
A 1 4
>> t(strcmp(t.Properties.RowNames,'A'),:)
ans =
1×2 table
Var1 Var2
____ ____
A 1 4
But the second version extends to the behavior that you want:
>> t(strcmp(t.Properties.RowNames,'X'),:)
ans =
0×2 empty table

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by