MATLAB Answers

0

How to do a right join based on multiple columns as ID?

Asked by Clarisha Nijman on 5 Nov 2018
Latest activity Commented on by Clarisha Nijman on 9 Nov 2018
Dear all,
I want to join set B from the right with set A, based on the first three columns. (The values of the three columns are the ID of both sets.) Sorry, I do not know the terminology in matlab).
All the explanations I do find are expressed in terms of keys variables, variable names and it is really confusing.
Can somebody give me pls some instructions/suggestions?
Given data set A A=[1 2 3 9 8; 1 2 3 3 12; 1 4 3 11 12; 2 3 4 3 12]
and data set B: B=[2 3 4 0 0; 1 2 3 9 8; 1 3 4 1 1; 1 2 3 6 7]
The key variables in each one of these sets are for each line the first three values.
The result should be: [2 3 4 0 0 3 12; 1 2 3 9 8 9 8; 1 2 3 9 8 3 12; 1 3 4 1 1 nAn nAn; 1 2 3 6 7 5 8; 1 2 3 6 7 3 12];

  1 Comment

All the explanations I do find are expressed in terms of keys variables, variable names and it is really confusing
That is because the data is stored in tables, not matrices. Usually, when columns of a matrix represent different things, it's better to store the data in a table which allows for easier manipulation.

Sign in to comment.

1 Answer

Answer by Star Strider
on 5 Nov 2018
Edited by Star Strider
on 5 Nov 2018
 Accepted Answer

This comes close to doing what you want:
A=[1 2 3 9 8; 1 2 3 3 12; 1 4 3 11 12; 2 3 4 3 12];
B=[2 3 4 0 0; 1 2 3 9 8; 1 3 4 1 1; 1 2 3 6 7];
Ac = mat2cell(A, ones(size(A,1),1), [3 1 1]);
Bc = mat2cell(B, ones(size(A,1),1), [3 1 1]);
TA = cell2table(Ac , 'VariableNames',{'ID','Var1','Var2'});
TB = cell2table(Bc , 'VariableNames',{'ID','Var1','Var2'});
TJ = outerjoin(TA,TB, 'MergeKeys',1, 'Keys','ID', 'LeftVariables',{'ID','Var1','Var2'}, 'RightVariables',{'ID','Var1','Var2'})
Result:
TJ =
7×5 table
ID Var1_TA Var2_TA Var1_TB Var2_TB
___________ _______ _______ _______ _______
1 2 3 9 8 9 8
1 2 3 9 8 6 7
1 2 3 3 12 9 8
1 2 3 3 12 6 7
1 3 4 NaN NaN 1 1
1 4 3 11 12 NaN NaN
2 3 4 3 12 0 0
EDIT Corrected typographical error in the explanation. Code unchanged.

  10 Comments

Can you give some explanation about the arguments in the array2table code? I am getting this error:
Error using array2table (line 62) The VariableNames property must be a cell array, with each element containing one nonempty character vector.
Error in kladje (line 13) TA = array2table(A, 'VariableNames', [compose('ID%d', 1:3), {'Var1', 'Var2'}]);
Obviously, the cell array that you give for the 'VariableNames' optional argument must have as many elements that there are columns in your matrix. My [compose('ID%d', 1:3), {'Var1', 'Var2'}] generates 5 variable names, if your actual array has more columns, you'll have to adjust that bit.
Thank a lot for the explanation!

Sign in to comment.