How to quickly replace zeros with another value in a large matrix

Hi all, I have a programming efficiency question. I would like to replace the zero values of a big matrix with its mean value. I know how to do it, but it takes forever. I do not know whether I did it wrong or whether it really takes that much time. I am wondering if there is a better way to do it for a matrix as big as 3320*3320. Here is my code. Assume A is the 3320*3320 matrix. I wrote only
A(A==0)=mean(mean(A));
Please let me know if I did it wrong. If I did it right, can somebody suggest a faster way to do the same thing? Thank you very much.

Accepted Answer

Don't double call these functions (MAX,MIN,SUM,MEAN,etc.). Learn to use the colon, it is your friend!
A(~A(:))=mean(A(:));
This will be somewhat faster. If you need even more speed, replace the call to mean with its definition.
A(~A(:)) = sum(A(:))/numel(A);
The thing is, with a very large array, it might just take some time!
Here is how I timed these, BTW. You can play around with other options as well! Run the function 3 times (the first time warms it up after a save). The time will print to the command window.
function [] = write_ones()
T1 = 0; % Time two different approaches
T2 = 0;
for ii = 1:5
    A = round(rand(3320))>.75;
    tic
    A(A==0)=mean(mean(A));
    T1 = T1 + toc;
    clear A
    A = round(rand(3320))>.75; % A new A.
    tic
    A(~A(:)) = sum(A(:))/numel(A);
    T2 = T2 + toc;
end
[T1 T2]

7 Comments

Thank you very much.
Both of them are much faster than my original code.
But for some unknown reason, the first way you provided is faster than the second one.
Anyway, both of them work. Thanks.:)
Does that mean you accept the answer? Please indicate this by clicking the appropriate button.
Matt, your A matrices have been inadvertently turned into logical matrices. I don't think this was your intention. I edited your code a bit to retain A as double, and then added a bare bones in-place mex approach as a third option (avoids the large intermediate logical array).
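function [] = write_ones()
T1 = 0; % Time three different approaches
T2 = 0;
T3 = 0;
for ii = 1:5
    B = rand(3320);
    B(B<.25) = 0; % Roughly a quarter of the entries become exact zeros
    A = B;
    A(1) = A(1); % Touch A so it holds its own copy of the data before timing
    tic
    A(A==0)=mean(mean(A));
    T1 = T1 + toc;
    A = B;
    A(1) = A(1);
    tic
    A(~A(:)) = sum(A(:))/numel(A);
    T2 = T2 + toc;
    A = B;
    A(1) = A(1); % Needed here so the in-place mex does not also modify B
    tic
    zeromean(A,sum(A(:))/numel(A));
    T3 = T3 + toc;
end
[T1 T2 T3]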
The source code for zeromean.c is:
/* zeromean(A,s) does the following in-place calculation: */
/* A(A==0) = s */
/* Where A = a double array */
/* s = a double scalar */
/* Programmer: James Tursa */
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize i, n;
    double d;
    double *pr;
    n = mxGetNumberOfElements(prhs[0]);
    pr = mxGetPr(prhs[0]);
    d = mxGetScalar(prhs[1]);
    for( i=0; i<n; i++ ) {
        if( pr[i] == 0.0 ) {
            pr[i] = d;
        }
    }
}
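For readers who want to try the loop without a MEX build, here is a minimal standalone sketch of the same idea in plain C. The function name `replace_zeros_with_mean` is hypothetical and not part of the thread; it works on an ordinary `double` buffer rather than an `mxArray`, computes the mean as sum divided by element count (the same definition as `sum(A(:))/numel(A)`), and overwrites exact zeros in place.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): replace every exact zero in
   a[0..n-1] with the mean of all n elements, zeros included.
   Returns the mean that was used, or 0.0 for an empty array
   (the empty-array behaviour is a choice of this sketch). */
double replace_zeros_with_mean(double *a, size_t n)
{
    double sum = 0.0;
    double mean;
    size_t i;

    if (n == 0) {
        return 0.0;
    }

    /* Pass 1: mean as sum/n, matching sum(A(:))/numel(A). */
    for (i = 0; i < n; i++) {
        sum += a[i];
    }
    mean = sum / (double)n;

    /* Pass 2: overwrite zeros in place; no temporary mask is allocated. */
    for (i = 0; i < n; i++) {
        if (a[i] == 0.0) {
            a[i] = mean;
        }
    }
    return mean;
}
```

Calling it on the four values `{0, 2, 0, 6}` computes a mean of 2 and leaves `{2, 2, 2, 6}` in the buffer. The mean is taken before any replacement, so the result matches the MATLAB one-liners in this thread.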
Nice MEX function James, as I would expect from you! And good catch too, I forgot that I had changed that earlier.
logical to double:
A = +(round(rand(3320))>.75);
James' MEX function benefits from omitting the bounds checks, whereas MATLAB seems to check the bounds for each single index, even for logical indexing. See http://www.mathworks.com/matlabcentral/newsreader/view_thread/295653 . Therefore a MEX function can create a partial copy using logical indexing about 3 times faster than MATLAB can (MATLAB 2009a, MSVC, even for small arrays).
Accepted by JSimon
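To make the cost of the intermediate logical array concrete, the following C sketch mimics what an expression like `A(A==0) = s` has to do conceptually: first materialize a mask, then walk the mask and assign. This is an illustration only and makes no claim about how MATLAB implements indexing internally; the function name `replace_via_mask` is hypothetical. Compare it with the single fused loop in zeromean.c, which needs no temporary.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration: two-step replacement through a temporary mask.
   Returns the number of elements replaced, or (size_t)-1 if the mask
   could not be allocated. */
size_t replace_via_mask(double *a, size_t n, double s)
{
    unsigned char *mask;
    size_t i;
    size_t count = 0;

    if (n == 0) {
        return 0;
    }
    mask = (unsigned char *)malloc(n);
    if (mask == NULL) {
        return (size_t)-1;
    }

    /* Step 1: materialize the comparison, like the temporary from A==0. */
    for (i = 0; i < n; i++) {
        mask[i] = (unsigned char)(a[i] == 0.0);
    }

    /* Step 2: walk the mask and assign wherever it is set. */
    for (i = 0; i < n; i++) {
        if (mask[i]) {
            a[i] = s;
            count++;
        }
    }

    free(mask);
    return count;
}
```

For a 3320*3320 matrix the mask has about 11 million one-byte entries, and the data is traversed twice instead of once, which is the overhead the in-place loop avoids.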


More Answers (1)

Matt, I had originally thought the same thing (using a colon on the right hand side also might be faster), but I'm not able to reproduce your results: I consistently get that the second case is faster. (0.18s vs 0.15s)
function colontest
A = rand(3320); A(A < 0.9) = 0; B = A+0;
tic; A(~A(:)) = sum(A(:))/numel(A); toc;
tic; B(B == 0) = mean(mean(B)); toc;

5 Comments

Oops I mean "left hand side"
Teja, I got the same thing: "==" is faster than "~".
>> A = +(round(rand(3320))>.75);tic,A(A(:)==0) = mean(A(:));toc
Elapsed time is 0.238307 seconds.
>> A = +(round(rand(3320))>.75);tic,A(~A(:)) = mean(A(:));toc
Elapsed time is 0.305402 seconds.
>>
My observations suggest that MEAN(X) is parallelized in modern MATLAB versions: for large matrices the columns are processed in different threads, while MEAN(X(:)) seems to run in a single thread. Therefore MEAN(MEAN(X)) can be faster.
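Both expressions give the same value for a full matrix because every column has the same number of rows, so the mean of the column means equals the overall mean (up to floating-point rounding). The C sketch below only illustrates that arithmetic and says nothing about threading. The function names are hypothetical, and a column-major layout is assumed, as in MATLAB.

```c
#include <stddef.h>

/* Hypothetical: mean over all rows*cols elements, like mean(X(:)).
   Returns 0.0 for empty input (a choice of this sketch). */
double mean_all(const double *x, size_t rows, size_t cols)
{
    size_t n = rows * cols;
    size_t i;
    double sum = 0.0;

    if (n == 0) {
        return 0.0;
    }
    for (i = 0; i < n; i++) {
        sum += x[i];
    }
    return sum / (double)n;
}

/* Hypothetical: mean of the per-column means, like mean(mean(X)).
   x is column-major: element (r, c) is stored at x[c * rows + r].
   The outer loop iterations are independent, which is what makes a
   per-column split across threads possible in principle. */
double mean_of_column_means(const double *x, size_t rows, size_t cols)
{
    size_t r, c;
    double total = 0.0;

    if (rows == 0 || cols == 0) {
        return 0.0;
    }
    for (c = 0; c < cols; c++) {
        double col_sum = 0.0;
        for (r = 0; r < rows; r++) {
            col_sum += x[c * rows + r];
        }
        total += col_sum / (double)rows;
    }
    return total / (double)cols;
}
```

For the 2-by-3 matrix stored as `{1, 2, 3, 4, 5, 6}`, the column means are 1.5, 3.5 and 5.5, and both functions return 3.5.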
Dear Jan, my "research" :)
1. compare mean(...(:)) and mean(mean(...)) ->
>> A = +(round(rand(5000))>.75);tic,A(A(:)==0) = mean(mean(A));toc
Elapsed time is 0.528957 seconds.
>> A = +(round(rand(5000))>.75);tic,A(A(:)==0) = mean(A(:));toc
Elapsed time is 0.528977 seconds.
2. compare use '~' and '==' ->
>> A = +(round(rand(5000))>.75);tic,A(~A) = mean(mean(A));toc
Elapsed time is 0.669978 seconds.
>> A = +(round(rand(5000))>.75);tic,A(A==0) = mean(mean(A));toc
Elapsed time is 0.531413 seconds...

