Finding and saving identical rows in a matrix

Asked by mr mo on 3 Nov 2017
Latest activity: commented on by mr mo on 4 Nov 2017
Accepted Answer by Cedric Wannaz

Hi, suppose I have an (m×n) matrix A, e.g.
A=[
3 2 3.5
2 2 3.5
4 2 3.5
2 3 3.5
3 3 3.5
4 3 3.5
2 4 3.5
3 4 3.5
4 4 3.5
3 2 4.5
2 2 4.5
4 2 4.5
2 3 4.5
3 3 4.5
4 3 4.5
2 4 4.5
3 4 4.5
4 4 4.5
3 2 5.5
2 2 5.5
4 2 5.5
2 3 5.5
3 3 5.5
4 3 5.5
2 4 5.5
3 4 5.5
4 4 5.5
12 22 7.5
13 16 8.9];
I also made an image of matrix A. In some rows, the first and second elements are identical to the first and second elements of other rows. For example,
[A(1,1), A(1,2)] == [A(10,1), A(10,2)] == [A(19,1), A(19,2)]; these rows are highlighted in orange in the image, and every other group of rows that share the same first two columns is highlighted in its own color.
The 28th and 29th rows, which have no matching rows in matrix A, are not highlighted.
I want to find a way to save each group of matching rows, with all of their columns, in a separate new matrix.
There are 9 different colors here, so there must be 9 New_A matrices.
The 28th and 29th rows have no matching rows in matrix A, so I want to save them together in a single new matrix, for example a matrix named B.
In the end I want to obtain the matrices shown below:
New_A1=[3 2 3.5;
3 2 4.5;
3 2 5.5];
New_A2=[2 2 3.5;
2 2 4.5;
2 2 5.5];
New_A3=[4 2 3.5;
4 2 4.5;
4 2 5.5];
New_A4=[2 3 3.5;
2 3 4.5;
2 3 5.5];
New_A5=[3 3 3.5;
3 3 4.5;
3 3 5.5];
New_A6=[4 3 3.5;
4 3 4.5;
4 3 5.5];
New_A7=[2 4 3.5;
2 4 4.5;
2 4 5.5];
New_A8=[3 4 3.5;
3 4 4.5;
3 4 5.5];
New_A9=[4 4 3.5;
4 4 4.5;
4 4 5.5];
B=[12 22 7.5;
13 16 8.9];
Does anyone have an idea how to do that? Thank you for your help.
Also, is there any code that tells how many groups of identical rows and how many unmatched rows there are in matrix A?
For example, in matrix A there are 9 groups of identical rows and 2 unmatched rows.

Answer by Cedric Wannaz on 4 Nov 2017

Alternatively:
[~, ~, ic] = unique( A(:,1:2), 'rows' ) ;
groups = splitapply( @(x){x}, A, ic ) ;
produces
groups =
11×1 cell array
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{3×3 double}
{1×3 double}
{1×3 double}
and then
isAlone = cellfun( @(x) size(x,1), groups ) == 1 ;
merged = vertcat( groups{isAlone} ) ;
groups = groups(~isAlone) ;
where groups is the cell array of all groups of rows whose first two columns occur more than once, and merged contains all the remaining rows stacked into one matrix.
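The last part of the question (how many groups of identical rows and how many unmatched rows there are) could be answered from ic alone. The following is a minimal sketch rather than part of the original answer; the shortened matrix A and the names counts, nRepeatedGroups and nSingleRows are assumptions made for illustration.

```matlab
% Assumed stand-in data: a shortened version of A, not the full matrix from the question.
A = [3 2 3.5; 2 2 3.5; 3 2 4.5; 2 2 4.5; 12 22 7.5; 13 16 8.9];

% Label each row by its (column 1, column 2) pair.
[~, ~, ic] = unique(A(:,1:2), 'rows');

% counts(k) is the number of rows of A that carry label k.
counts = accumarray(ic(:), 1);

nRepeatedGroups = sum(counts >= 2);   % pairs that occur in more than one row
nSingleRows     = sum(counts == 1);   % pairs that occur in exactly one row

fprintf('%d repeated groups, %d single rows\n', nRepeatedGroups, nSingleRows);
```

With the full matrix A from the question, the same lines should report 9 repeated groups and 2 single rows, which matches the 11×1 cell array shown above.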

mr mo on 4 Nov 2017
Thank you very much for your help. How can I convert these cell arrays into double arrays at the end, matrix by matrix?
Cedric Wannaz on 4 Nov 2017
A cell array is an array of cells. Variable groups is a cell array (just one), and it contains nine cells. Each cell contains a numeric array (a matrix).
Block-indexing a cell array (the usual indexing with parentheses) returns cells (or a single cell), not their/its content. Block-indexing a cell array hence returns another cell array (a block of the original cell array).
>> groups(1)
ans =
1×1 cell array
{3×3 double}
>> class( groups(1) )
ans =
'cell'
Usually we need to access cells' content, however. This is done using curly-bracket indexing:
>> groups{1}
ans =
2.0000 2.0000 3.5000
2.0000 2.0000 4.5000
2.0000 2.0000 5.5000
>> class( groups{1} )
ans =
'double'
So groups(1) is cell 1 of the groups cell array, and groups{1} is its content.
The easiest way to work with your data is to keep the cell array. It is already well suited for iterating through groups for example:
for gId = 1 : numel( groups )
disp( det( groups{gId} )) ;
end
which you could not do if you had saved groups in e.g.
A = groups{1} ;
B = groups{2} ;
...
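If a plain double array is still wanted, one possible route is sketched below. It is an illustration only: the two-group cell array is an assumed stand-in for groups, and G2, sameSize and stacked are hypothetical names. Stacking with cat only works when every cell holds a matrix of the same size.

```matlab
% Assumed stand-in for the groups cell array (two groups only).
groups = { [2 2 3.5; 2 2 4.5; 2 2 5.5] ; [3 2 3.5; 3 2 4.5; 3 2 5.5] };

% Curly braces already return a double matrix for a single group.
G2 = groups{2};

% If every cell has the same size, all groups can be stacked into one
% 3-D double array, so that stacked(:,:,k) equals groups{k}.
sameSize = all(cellfun(@(g) isequal(size(g), size(groups{1})), groups));
if sameSize
    stacked = cat(3, groups{:});
end
```

The 3-D array keeps the groups separable by index, which avoids creating one named variable per group.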
mr mo on 4 Nov 2017
Thank you very much.

Answer by per isakson on 3 Nov 2017
Edited by per isakson on 4 Nov 2017

I have an idea (it assumes that the two leftmost columns contain whole numbers):
>> [C,ia,ic] = unique(A(:,1:2),'rows');
>> A(ic==1,:)
ans =
2.0000 2.0000 3.5000
2.0000 2.0000 4.5000
2.0000 2.0000 5.5000
Here ic contains more than one occurrence of 1, so these rows form a group.
Loop over all values that occur more than once in ic; each of them gives one New_A matrix.
The values that occur only once make up B.
Finally, "New_A9" forces me to refer you to TUTORIAL: Why Variables Should Not Be Named Dynamically (eval) (appropriate smiley).
Implementation for releases older than R2015b (see Cedric's answer for newer releases):
[~,ia,ic] = unique(A(:,1:2),'rows');
N = histc( ic, 0.5+(0:length(ia)) );
N = reshape( N, 1, [] );
new = cell( 1, sum(N>=2) );
B = nan( sum(N==1), size(A,2) ); % same number of columns as A
ix = 0;
for jj = find( N>=2 )
ix = ix + 1;
new{ix} = A( ic==jj, : );
end
ix = 0;
for jj = find( N==1 )
ix = ix + 1;
B(ix,:) = A( ic==jj, : );
end
It reproduces the output of your example. Might need more testing.
>> whos new B
  Name      Size      Bytes  Class     Attributes
  B         2x3          48  double
  new       1x9        1656  cell
and
>> new{4}
ans =
3.0000 2.0000 3.5000
3.0000 2.0000 4.5000
3.0000 2.0000 5.5000
>> B
B =
12.0000 22.0000 7.5000
13.0000 16.0000 8.9000
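The "might need more testing" remark could be followed up with a couple of sanity checks. The sketch below is an assumption-laden illustration, not the author's code: it uses a shortened stand-in for A, rebuilds new and B with accumarray in place of histc, and the names nRowsOut and oneKey are hypothetical.

```matlab
% Assumed stand-in data (a shortened version of A from the question).
A = [3 2 3.5; 2 2 3.5; 3 2 4.5; 2 2 4.5; 12 22 7.5; 13 16 8.9];

% Rebuild new and B along the lines of the answer, with accumarray instead of histc.
[~, ia, ic] = unique(A(:,1:2), 'rows');
N   = accumarray(ic(:), 1, [numel(ia) 1]).';   % occurrences of each distinct pair
new = arrayfun(@(jj) A(ic == jj, :), find(N >= 2), 'UniformOutput', false);
B   = A(ismember(ic, find(N == 1)), :);

% Check 1: every row of A ends up in exactly one output.
nRowsOut = sum(cellfun(@(c) size(c,1), new)) + size(B,1);
assert(nRowsOut == size(A,1));

% Check 2: within each group, the first two columns are constant.
oneKey = cellfun(@(c) size(unique(c(:,1:2), 'rows'), 1) == 1, new);
assert(all(oneKey));
```

Both assertions should pass silently for this stand-in matrix; a failure would point to rows being lost, duplicated, or grouped under the wrong pair.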

mr mo on 3 Nov 2017
Thanks again for your help. I want to do my own further processing on this new data set, which is now saved in separate cells and in the matrix B. Could any mistakes occur if I convert these cell arrays into double arrays at the end of your code?
per isakson on 4 Nov 2017
"change these cell arrays into double": if all the cells contain matrices of the same size, you can convert the cell array into a double array. In the example, all the cells contain 3x3 matrices.
mr mo on 4 Nov 2017
Thanks again for your help. I have a question: this code was written for matrix A. Can I use it for a new matrix whose size and elements are different from those of matrix A?
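The approach relies only on unique with the 'rows' option, so it should carry over to a matrix of another size as long as the key columns are chosen accordingly. The following is a minimal sketch under that assumption; the matrix M, the variable keyCols and the name repeatedIds are hypothetical and only serve the illustration.

```matlab
% Hypothetical input: any numeric matrix, plus the columns that decide
% whether two rows belong to the same group.
M       = [1 7 0.1 9; 5 5 0.2 8; 1 7 0.3 7; 6 6 0.4 6; 1 7 0.5 5];
keyCols = [1 2];

[~, ~, ic] = unique(M(:, keyCols), 'rows');
counts     = accumarray(ic(:), 1);           % rows per distinct key

% One cell per key that occurs at least twice.
repeatedIds = find(counts >= 2).';
groups      = cell(1, numel(repeatedIds));
for k = 1:numel(repeatedIds)
    groups{k} = M(ic == repeatedIds(k), :);
end

% Rows whose key occurs exactly once; logical indexing keeps the row order of M.
B = M(counts(ic) == 1, :);
```

Here groups holds a single 3×4 matrix (the rows starting with 1 7) and B holds the two remaining rows. Note that unique compares values exactly, so key columns with non-integer values may need rounding first.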