How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

Question

Leon il 19 Ott 2021

0
Link

Link diretto a questa domanda

https://it.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the

Commentato: Leon il 19 Ott 2021

I have a matrix as below:

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];

My goal is to identify rows with both identical Column 1 and identical Column 3 values, and then calculate the average for the rest of the columns, i.e., Column 2 and Column 4, within these duplicate rows. In this example, the duplicate rows would be Rows # 1, 3, and 5. My ending matrix would be:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3];

This is a much simplified example. In reality, I have 35 columns that need to be averaged, and millions of rows. What is the most efficient way of handling this? Do I have to write a loop and process each of the unique rows individually?

Many thanks!

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Accedi per commentare.

Accedi per rispondere a questa domanda.

Answer 1

Kevin Holly il 19 Ott 2021

1
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the#answer_812193

Modificato: Kevin Holly il 19 Ott 2021

Apri in MATLAB Online

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7]
A = 5×4
   4     3     8
   5     6     9
   6     3     6
   6     9     3
   5     3     7

I am going to assume that any row that columns 1 and 3 are identical, irregardless of what pair, you want to ignore those rows in the averaging of other columns.

Here is my approach:

t = table(A(:,1),A(:,3))
t = 5×2 table
    Var1    Var2
    ____    ____

     1       3  
     4       6  
     1       3  
     2       9  
     1       3  
[C, ia, ic] = unique(t,'rows')
C = 3×2 table
    Var1    Var2
    ____    ____

     1       3  
     2       9  
     4       6  
ia = 3×1
     1
     4
     2
ic = 5×1
     1
     3
     1
     2
     1
ic==1
ans = 5×1 logical array
   1
   0
   1
   0
   1
A(ic==1,:)
ans = 3×4
     1     4     3     8
     1     6     3     6
     1     5     3     7
mean(A(ic==1,:))
ans = 1×4
     1     5     3     7
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Your answer:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Without showing work:

t = table(A(:,1),A(:,3));
[~,~,ic] = unique(t,'rows');
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

Leon il 19 Ott 2021

Many thanks for the solution!

Accedi per commentare.

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Risposta accettata

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

Più risposte (0)

Vedere anche

Categorie

Tag

Prodotti

Release

Community Treasure Hunt

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 Commenti Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Risposta accettata

1 Commento Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti

Più risposte (0)

Vedere anche

Categorie

Tag

Prodotti

Release

Community Treasure Hunt

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

1 Commento
Mostra -1 commenti meno recentiNascondi -1 commenti meno recenti