More Efficient ismember Calculation

Question

Ayden Clay on 16 Apr 2020

https://it.mathworks.com/matlabcentral/answers/518357-more-efficient-ismember-calculation

Commented: Ayden Clay on 20 Apr 2020

Hello, I'm working on an interpolation routine for a specific project and have managed to produce a very fast interpolation function for the N-D array for042. One final bottleneck remains, taking up 84% of the program's total run time. The profiler results and example code are included below. Any and all optimizations are greatly appreciated, but the line in question is obvious from the profiler.

alpha = -5:5:30;
mach  = 1:6:25;
beta = -5:2.5:5;
num = zeros(length(alpha),length(mach),length(beta));
for cnt = 1:numel(alpha)
    for cnt1 = 1:numel(mach)
        for cnt2 = 1:length(beta)
            num(cnt,cnt1,cnt2) = 3*alpha(cnt)+mach(cnt1)/2+beta(cnt2);
        end
    end
end
for cnt2 = 1:length(beta)
    data.for006{cnt2}.cd = num(:,:,cnt2);
end

This is an example of the data set that we're working with.

And the line in question from the profiler is part of this block:

a = zeros(1,2^size(CURRENT,1)); % one slot per loop iteration, so a never grows inside the loop
curridx = zeros(1,size(CURRENT,1));
for cnt = 1:(2^size(CURRENT,1))
    change = find((rem(cnt-1,2.^(0:size(CURRENT,1)-1))==0)==1);
    curridx(change) = 1*curridx(change)==0;
    idx = lidx.*(curridx==0) + uidx.*(curridx==1);
    chkvals = zeros(1,length(idx));
    for cnt1 = 1:length(idx)
        B = RANGES{cnt1};
        chkvals(cnt1) = B(idx(cnt1));
    end
    chkvals = [chkvals(2),chkvals(4:end)];
% INEFFICIENT LINE %
    a(cnt) = DATA.for042{find(ismember(DATA.permutation,chkvals,'rows'))}.cn(sub2ind_a(siz,idx));
end

11 Comments

Walter Roberson on 16 Apr 2020

Guidelines for using find():

If you have a relational test that is used exactly once, you find() it, and you use the result of the find() only to index an array, then skip the find() and use the result of the relational test as a logical index.
If you are doing computation on the indices returned by find(), then it might be worth retaining the find(). For example, you might want to compute the distance between events.
If you are updating corresponding locations, matrix(locations) = f(matrix(locations)), then one could hypothesize that taking find(locations) and using that might in some cases be faster than logical indexing, because there would be less for the assignment to examine (just change a few locations directly, right?). However, in my tests with large arrays, logical indexing is faster even for a small number of output locations, and is notably faster if there are a large number of output locations.

Note: this timing test takes a few minutes to execute due to the size of the arrays. For each test, a new output array the same size as the input has to be created.

data = rand(83,19,207,51,3);
data(12345) = -1;
data(876543) = -1;
N = 10;
t1 = zeros(N,1);
t2 = zeros(N,1);
t3 = zeros(N,1);
t4 = zeros(N,1);
f1 = @()fun1(data);
f2 = @()fun2(data);
f3 = @()fun3(data);
f4 = @()fun4(data);
for K = 1 : N; t1(K) = timeit(f1,0); end
for K = 1 : N; t2(K) = timeit(f2,0); end
for K = 1 : N; t3(K) = timeit(f3,0); end
for K = 1 : N; t4(K) = timeit(f4,0); end
plot([t1,t2,t3,t4]);
legend({'find0', 'mask0', 'find5', 'mask5'});
m1 = mean(t1); m2 = mean(t2); m3 = mean(t3); m4 = mean(t4);
ms = [m1;m2;m3;m4];
disp('timings')
disp(ms);
m = min(ms);
disp('ratios')
disp(ms ./ m);
function fun1(data)
    %small number of changes, find
    idx = find(data<0);
    data(idx) = data(idx) * 2;  %#ok<NASGU>
end
function fun2(data)
    %small number of changes, logical indexing
    idx = data<0;
    data(idx) = data(idx) * 2; %#ok<NASGU>
end
function fun3(data)
    %many places, find
    idx = find(data<0.5);
    data(idx) = data(idx) * 2; %#ok<NASGU>
end
function fun4(data)
    %many places, logical indexing
    idx = data<0.5;
    data(idx) = data(idx) * 2; %#ok<NASGU>
end
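
The first two guidelines above can be illustrated on a tiny vector. This is only a sketch: the values in x are made up for the illustration and are not from the thread.

```matlab
% Small illustrative vector (made-up values, not from the original post).
x = [3 -1 4 -1 5 -9 2 6];

% Guideline 1: the relational test is used once and only to index,
% so skip find() and use the logical mask directly.
y = x;
y(y < 0) = 0;

% Guideline 2: the positions themselves are needed for a computation
% (here, the spacing between negative entries), so find() is kept.
negPos = find(x < 0);
gaps = diff(negPos);

disp(y)
disp(gaps)
```

In the first case the mask never needs to be converted to indices. In the second, diff operates on the positions, which a logical mask alone does not provide.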

Walter Roberson on 17 Apr 2020

But the length of idx does not depend upon the contents of CURRENT, only on the size of CURRENT, right? So you can pre-compute it.

In the great majority of cases, if you can move a computation out of a loop, doing so will result in more efficient code. This is not always the case, but most of the time.
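
As a rough sketch of this idea applied to the index-toggling part of the loop in the question: the powers of two and the iteration count depend only on the size of CURRENT, so they can be computed once before the loop. Here n is a hypothetical stand-in for size(CURRENT,1), and the pattern matrix exists only so the result can be inspected. The sketch also uses a logical mask in place of find().

```matlab
% Hypothetical stand-in for size(CURRENT,1); not from the original post.
n = 3;

% Loop-invariant values, computed once outside the loop.
nCorners = 2^n;
pows = 2.^(0:n-1);

curridx = zeros(1,n);
pattern = zeros(nCorners,n);   % records curridx at each iteration
for cnt = 1:nCorners
    % The logical mask selects the same elements that find(...) would.
    change = rem(cnt-1,pows) == 0;
    curridx(change) = ~curridx(change);
    pattern(cnt,:) = curridx;
end
disp(pattern)
```

Each row of pattern is one lower/upper combination. The toggling logic is the same as in the question, only with the invariant work hoisted.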

Ayden Clay on 20 Apr 2020

Ah, I understand! I believe I've now moved as much as I can outside of the loops.

Answer 1

Steven Lord on 16 Apr 2020

https://it.mathworks.com/matlabcentral/answers/518357-more-efficient-ismember-calculation#answer_426435

What's the most time-consuming part of the line that you've identified as the bottleneck? Is it the ismember call, the indexing into DATA.for042, the indexing into the result of that indexing to retrieve part of the cn field, or the assignment into a section of a? To tell, break the line into four parts (for performance profiling purposes).

ind = ismember(DATA.permutation,chkvals,'rows');
data1 = DATA.for042{ind};
data2 = data1.cn(sub2ind_a(siz,idx));
a(cnt) = data2;

My guess is that the ismember call still might be the most time consuming part of that process, but it's not going to be all 83.4% of the total runtime of your code.
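
Besides reading the per-line profiler results, the separate parts could also be timed in isolation with timeit. The following is a minimal sketch on small made-up data: the shapes of DATA.permutation and DATA.for042, the values in chkvals, and linIdx (standing in for sub2ind_a(siz,idx), whose source is not shown in the thread) are all assumptions.

```matlab
% Hypothetical stand-in data; shapes and values are assumptions.
DATA.permutation = [(1:50).', (51:100).'];
DATA.for042 = cell(1,50);
for k = 1:50
    DATA.for042{k}.cn = k * ones(4,4);
end
chkvals = [7 57];   % matches row 7 of DATA.permutation
linIdx = 6;         % stands in for sub2ind_a(siz,idx)

% Part 1: the row lookup.
tMember = timeit(@() ismember(DATA.permutation,chkvals,'rows'));
ind = ismember(DATA.permutation,chkvals,'rows');

% Part 2: indexing into the cell array with the logical mask.
tCell = timeit(@() DATA.for042{ind});
data1 = DATA.for042{ind};

% Part 3: indexing into the cn field.
tField = timeit(@() data1.cn(linIdx));

fprintf('ismember: %.3g s\n', tMember);
fprintf('cell indexing: %.3g s\n', tCell);
fprintf('field indexing: %.3g s\n', tField);
```

The final scalar assignment into a is left out here, since it is unlikely to be measurable on its own. Timings on toy data like this will not match the real data set, so the real DATA would need to be substituted in.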

6 Comments

Walter Roberson on 17 Apr 2020

But you should still make the other improvements I noted about not computing the same value multiple times.

Ayden Clay on 20 Apr 2020

I completely agree. I've implemented a large number of those changes too (there may be more to find), and I've tried to replace some of the for loops with vector operations. There is still some work to be done, but this is much closer to what I needed. Thank you.
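
As one illustration of replacing loops with vector operations, the example data set from the question could be built without the nested loops. This is only a sketch, and it assumes MATLAB R2016b or later for implicit expansion. The loop version is kept as a reference so the two results can be compared.

```matlab
alpha = -5:5:30;
mach  = 1:6:25;
beta  = -5:2.5:5;

% Put each vector along its own dimension and let implicit expansion
% (R2016b or later) form every combination in a single expression.
num = 3*alpha(:) + mach(:).'/2 + reshape(beta,1,1,[]);

% Reference result from the nested loops in the question.
ref = zeros(numel(alpha),numel(mach),numel(beta));
for cnt = 1:numel(alpha)
    for cnt1 = 1:numel(mach)
        for cnt2 = 1:numel(beta)
            ref(cnt,cnt1,cnt2) = 3*alpha(cnt)+mach(cnt1)/2+beta(cnt2);
        end
    end
end

disp(isequal(num,ref))
```

On releases before R2016b, bsxfun or ndgrid could be used to the same effect.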
