calculating the mean for each column in a numerical array based on the elements in column 1

Question

Ziad Sari El Dine on 27 May 2022


https://it.mathworks.com/matlabcentral/answers/1728690-calculating-the-mean-for-each-column-in-a-numerical-array-based-on-the-elements-in-column-1

Edited: Jan on 27 May 2022

I have a numerical array (8167x11). The first column contains the numbers 1 to 198 in ascending order (each number is repeated several times; the number of repetitions varies randomly, but the values are sorted). I need to calculate the mean of each column (2 to 11) separately over the rows that correspond to each number in column 1. The output should therefore be a 198x11 array where column 1 contains the numbers 1:198 and each of the other columns contains the means of the values corresponding to each element in column 1.

1 Comment

Ziad Sari El Dine on 27 May 2022

A few numbers between 1 and 198 are missing from column 1. Is there a way to fill in the gaps with the missing numbers and fill the rest of each such row with NaNs or zeros?


Answer 1

Jan on 27 May 2022


https://it.mathworks.com/matlabcentral/answers/1728690-calculating-the-mean-for-each-column-in-a-numerical-array-based-on-the-elements-in-column-1#answer_973070

Edited: Jan on 27 May 2022


With a simple loop:

A = [randi([1, 198], 8167, 1), rand(8167, 10)];
result = zeros(198, 11);
for k = 1:198
    match        = A(:, 1) == k;
    result(k, :) = mean(A(match, :), 1);
end

This takes about the same time as splitapply. A faster approach:

% Sort A according to first element:
[~, ind] = sort(A(:, 1));
B        = A(ind, :);
% Determine where the elements in the first column change:
d = [true, diff(B(:, 1)).' ~= 0, true];  % TRUE at changes
c = find(d);  % Indices where block change
% Loop over keys:
result = zeros(198, 11);
for k = 1:198
  nk           = c(k+1) - c(k);  % Number of same keys
  % Mean over block with same keys:
  result(k, :) = sum(B(c(k):c(k+1)-1, :), 1) / nk;
end
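
The loop above indexes c by key, so it relies on every key from 1 to 198 occurring at least once. As a minimal sketch of how the block boundaries d and c behave, and of one possible way to cope with a missing key, the following uses a tiny made-up array (the data, nKeys, rows and key are illustrative assumptions, not part of the original answer) and reads the key from each block instead of assuming it equals the loop counter:

```matlab
% Tiny sorted example; key 3 is deliberately missing (hypothetical data).
B    = [1 10; 1 20; 2 5; 4 1; 4 2; 4 3];
keys = B(:, 1);

d = [true, diff(keys).' ~= 0, true];  % TRUE at the start of each block and one past the end
c = find(d);                          % block start indices, followed by numel(keys)+1

nKeys  = 4;                           % assumed largest key
result = nan(nKeys, size(B, 2));      % rows of missing keys stay NaN
result(:, 1) = 1:nKeys;
for k = 1:numel(c) - 1
    rows = c(k):c(k+1) - 1;           % rows belonging to the k-th block
    key  = keys(c(k));                % key value shared by that block
    result(key, 2:end) = sum(B(rows, 2:end), 1) / numel(rows);
end
disp(result)
```

Here c comes out as [1 3 4 7], so the blocks are rows 1:2, 3:3 and 4:6, and the row for key 3 remains NaN.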

For a test data set:

A = [randi([1, 198], 8167, 1), rand(8167, 10)];

this needs about 0.0019 seconds, while splitapply needs 0.0066 seconds (MATLAB R2018b).

Note: sum(X,1) / nX is faster than mean(X,1).
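
To check this on your own machine, something like the following sketch could be used. The block X is made-up test data, and the measured times will depend on the MATLAB release and hardware, so no particular speed-up is implied here:

```matlab
X  = rand(50, 10);          % hypothetical block of rows sharing one key
nX = size(X, 1);

m1 = sum(X, 1) / nX;        % explicit sum divided by the row count
m2 = mean(X, 1);            % built-in column mean

% The two results agree up to floating-point rounding.
maxDiff = max(abs(m1 - m2));
fprintf('largest difference: %g\n', maxDiff);

% timeit calls each function handle repeatedly and returns a typical run time.
tSum  = timeit(@() sum(X, 1) / nX);
tMean = timeit(@() mean(X, 1));
fprintf('sum/n: %.2e s, mean: %.2e s\n', tSum, tMean);
```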

1 Comment

Ziad Sari El Dine on 27 May 2022

Perfect! Thanks.


Answer 2

Matt J on 27 May 2022


https://it.mathworks.com/matlabcentral/answers/1728690-calculating-the-mean-for-each-column-in-a-numerical-array-based-on-the-elements-in-column-1#answer_973065

Edited: Matt J on 27 May 2022


Let's call your matrix A. Then,

out = splitapply(@(z) mean(z,1),A,A(:,1));

3 Comments

Jan on 27 May 2022


If one of the groups contains only one row, mean without a dimension argument operates along the 2nd dimension and collapses that row to a single number. To be safe, specify the dimension to average over:

A = [1, 2, 3, 4; ...
     1, 5, 6, 7; ...
     2, 1, 1, 1; ...
     1, 4, 2, 1];
out = splitapply(@(x) mean(x, 1), A, A(:,1))
out = 2×4
    1.0000    3.6667    3.6667    4.0000
    2.0000    1.0000    1.0000    1.0000
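
The difference is easy to see on the single-row group of the example above. A small sketch (the variable names row, a and b are only illustrative):

```matlab
row = [2, 1, 1, 1];      % a group that consists of a single row

a = mean(row)            % no dimension given: averages across the row
b = mean(row, 1)         % dimension 1: averages down each column, row is unchanged
```

Without the dimension argument the group would be reduced to the scalar 1.25, whereas mean(row, 1) keeps the 1x4 row [2 1 1 1].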

Matt J on 27 May 2022

Edited: Matt J on 27 May 2022


"Is there a way to fill in the gaps with the missing numbers"

Do you really need/want the gaps filled in? If you exclude the missing numbers, the modification is easy:

out = splitapply(@(z) mean(z,1),A,findgroups(A(:,1)));

If you must have the gaps filled in, it's a few additional steps:

out_with_nans=nan(198,11);
out_with_nans(round(out(:,1)),:)=out;
out_with_nans(:,1)=1:198;
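
Put together on a small made-up array, the two steps might look like the sketch below. The data and nKeys are assumptions chosen so that key 3 is missing; with the asker's array, nKeys would be 198:

```matlab
% Hypothetical data: keys 1, 2 and 4 occur, key 3 is missing.
A     = [1 10 100; 1 20 200; 2 5 50; 4 1 7; 4 3 9];
nKeys = 4;                                       % assumed largest key

G   = findgroups(A(:, 1));                       % consecutive group numbers without gaps
out = splitapply(@(z) mean(z, 1), A, G);         % one row per key that actually occurs

out_with_nans = nan(nKeys, size(A, 2));
out_with_nans(round(out(:, 1)), :) = out;        % column 1 of out holds the original keys
out_with_nans(:, 1) = 1:nKeys;                   % restore the key column for the gap rows
disp(out_with_nans)
```

The row for key 3 stays NaN apart from its key; replacing nan(...) with zeros(...) would give zero-filled gaps instead.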
