Optimized code for loop, If-statement for large dataset

I was hoping to delete certain rows using a condition. My data is in double format (790127×24). Based on Run and Time, I estimate that the whole script needs about 25 hours, which is huge. Is there any way of optimizing the script?
TIA...
n=0;
for i = 1 : length(d_A)
if any(isnan(d_A(i-n, 6))) ...
&& any(isnan(d_A(i-n, 7))) ...
&& any(isnan(d_A(i-n, 8))) ...
&& any(isnan(d_A(i-n, 9))) ...
&& any(isnan(d_A(i-n,10))) ...
&& any(isnan(d_A(i-n,11))) ...
&& any(isnan(d_A(i-n,12))) ...
&& any(isnan(d_A(i-n,13))) ...
&& any(isnan(d_A(i-n,14))) ...
&& any(isnan(d_A(i-n,15))) ...
&& any(isnan(d_A(i-n,16))) ...
&& any(isnan(d_A(i-n,17))) ...
&& any(isnan(d_A(i-n,18))) ...
&& any(isnan(d_A(i-n,19)))
d_A(i-n,:) = [];
n=n+1;
end
end

Accepted Answer

per isakson on 25 May 2019
Edited: per isakson on 26 May 2019
Try this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
n=0;
for i = 1 : length(d_A)
if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
d_A(i-n,:) = [];
n=n+1;
end
end
and this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : length(d_A)
if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
% d_A(i-n,:) = [];
is_to_be_deleted(i-n) = true;
n=n+1;
end
end
d_A( is_to_be_deleted, : ) = [];
Caveat: not tested
In response to comment:
Now it's possible to factor out the for-loop. Try this
%% Sample data
A = rand( [8,4] );
ixcol = [2,3];
A([3,5],ixcol) = nan;
A( randperm( numel(A), 9 ) ) = nan;
%%
[ A3, ix_deleted3 ] = cssm_3( A, ixcol );
[ A4, ix_deleted4 ] = cssm_4( A, ixcol );
ix_deleted3 == ix_deleted4 %#ok<NOPTS,EQEFF>
function [ A, ix_deleted ] = cssm_3( A, ixcol )
is_to_be_deleted = false( size(A,1), 1 );
for jj = 1 : size(A,1) % number of rows; length(A) would return the largest dimension
if all( isnan( A( jj, ixcol )))
is_to_be_deleted(jj) = true;
end
end
A( is_to_be_deleted, : ) = [];
ix_deleted = find( is_to_be_deleted );
end
function [ A, ix_deleted ] = cssm_4( A, ixcol )
is_to_be_deleted = all( isnan( A( :, ixcol ) ), 2 );
A( is_to_be_deleted, : ) = [];
ix_deleted = find( is_to_be_deleted );
end
it outputs
>> cssm
ans =
2×1 logical array
1
1
The vectorized version, cssm_4, might not improve performance significantly, but in my opinion it makes cleaner code.
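As a minimal sketch of how the vectorized form might be applied to the matrix in the question, assuming the columns to check are 6 through 19 as in the original loop. The small d_A built here is made-up sample data, not the asker's.

```matlab
% Made-up stand-in for the 790127-by-24 matrix in the question
d_A = rand(10, 24);
ixcol = 6:19;                      % columns checked in the original loop
d_A([2, 7], ixcol) = nan;          % rows 2 and 7: NaN in every checked column
d_A(4, 6:10) = nan;                % row 4: NaN in only some checked columns

% One logical mask for all rows, then a single deletion
is_to_be_deleted = all(isnan(d_A(:, ixcol)), 2);
d_A(is_to_be_deleted, :) = [];

disp(size(d_A))                    % 8 rows remain; row 4 is kept
```

Row 4 survives because all(...) requires NaN in every checked column, which matches the chain of && conditions in the question.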
2 Comments
Jhon Gray on 25 May 2019
Edited: Jhon Gray on 25 May 2019
Wow. The second one is super fast. But there's a small problem: the code should look like this. There is no need for i-n in this case, because no rows are removed inside the loop. Thanks for helping. Take love.
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : 1000 %length(d_A)
if all( isnan( d_A( i, ixcol ))) % i-n is a scalar
% d_A(i-n,:) = [];
is_to_be_deleted(i) = true;
%n=n+1;
end
end
d_A( is_to_be_deleted, : ) = [];
per isakson on 26 May 2019
Edited: per isakson on 26 May 2019
I surmised that there was a problem and added the last line in bold.
It's as bad for performance to remove one row at a time as it is to add one row at a time. In both cases the whole matrix is rewritten in memory on each operation.
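One way to see the difference is to time both variants on the same data. The following is only a rough sketch: the matrix size, the share of all-NaN rows, and the variable names are assumptions chosen so the loop finishes quickly, and the measured times will vary by machine.

```matlab
% Made-up test matrix; the size is chosen only so the loop finishes quickly
rng(0);
A = rand(20000, 24);
ixcol = 6:19;
A(rand(size(A,1), 1) < 0.3, ixcol) = nan;   % roughly 30% of rows all-NaN

% Variant 1: delete inside the loop (the matrix is rewritten on each deletion)
B = A;
tic
for jj = size(B,1) : -1 : 1                 % backwards, so no index offset is needed
    if all(isnan(B(jj, ixcol)))
        B(jj, :) = [];
    end
end
t_loop = toc;

% Variant 2: build a logical mask and delete once
C = A;
tic
C(all(isnan(C(:, ixcol)), 2), :) = [];
t_mask = toc;

% isequaln treats NaN as equal to NaN, unlike isequal
fprintf('loop: %.3f s, mask: %.3f s, same result: %d\n', ...
    t_loop, t_mask, isequaln(B, C));
```

Looping backwards is just one way to avoid the i-n bookkeeping when deleting in place. It does not remove the cost of rewriting the matrix on every deletion.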

Release

R2018b