Array assigning performance, : vs indexes

Question

FINNSTAR7 on 29 May 2021

https://it.mathworks.com/matlabcentral/answers/842785-array-assigning-performance-vs-indexes

Edited: FINNSTAR7 on 29 May 2021

MATLAB R2018a

So recently I was playing around with some code, and in one section I have a long-running for loop (N = 1,000,000+ iterations). Inside the loop I am altering the values of a 2xN array one column at a time, like so:

tic;
% blah blah blah
for i = 2:N
    % blah blah blah...
    % a is the 2xN array
    % xy is just a 2x1 column vector
    a(:, i) = xy;
end
toc

This works fine and dandy; I'm seeing run times of around 0.46 - 0.48 seconds on average. But then I changed the loop to this:

tic;
% blah blah blah
for i = 2:N
    % blah blah blah...
    a(1, i) = xy(1);
    a(2, i) = xy(2);
end
toc

And suddenly I'm seeing a slight, but still significant increase in speed (0.41 - 0.43 seconds). This seems a bit counterintuitive to me, as I was expecting these to perform pretty much the same, if not the latter being slightly slower due to the extra indexing operation. So I'm wondering why those two explicit assignments are faster than a single bulk assignment?

And with the exception of structure and readability, is it better to just always explicitly assign values like this rather than use the : syntax?

1 Comment

DGM on 29 May 2021

I know someone else can answer this from a deeper background in the internals of how Matlab works, but it's easy enough to test it. The results might not be guaranteed to be comparable on all other versions, but it might help give a rough idea what can be expected.

With a simple loop populating a preallocated Mx1000 array with a column vector, executed 10E3 times to average the execution time, I notice a few things:

% for M = 2
% case 1: whole-column assignment (this is slowest)
A(:,n) = thiscolumnvector;
% case 2: element-wise from the vector (this takes about half the time)
A(1,n) = thiscolumnvector(1);
A(2,n) = thiscolumnvector(2);
% case 3: element-wise from scalars (this is even faster)
A(1,n) = thisscalarvalue;
A(2,n) = thatscalarvalue;

However, as M increases, the speed advantage of case 2 over case 1 diminishes rapidly, and the two become roughly equal for M somewhere around 10-20. Again, that's just with my particular test case on my computer with the specific version I was running, using the particular datatype I used. Your results may vary, but it's probably safe to expect that you'll reach a point where the second case starts being the slower option.
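A test along these lines could be sketched as follows. This is not the exact script used for the numbers above; M, N, nRep and the inner loop over rows are assumptions chosen for illustration, and the timings will depend on machine, MATLAB version and datatype.

```matlab
% Sketch: time whole-column vs element-wise assignment into a preallocated array.
% M, N and nRep are assumed values; change M to see how the gap evolves.
M = 2;          % rows per column
N = 1000;       % columns in the preallocated array
nRep = 1000;    % repetitions used to average the timing

v = rand(M, 1); % column vector copied into every column
A1 = zeros(M, N);
A2 = zeros(M, N);

% Case 1: whole-column assignment with the colon
tic;
for r = 1:nRep
    for n = 1:N
        A1(:, n) = v;
    end
end
tColon = toc / nRep;

% Case 2: element-wise assignment (an inner loop stands in for the
% hand-written A(1,n) = ..., A(2,n) = ... lines so that M can be varied)
tic;
for r = 1:nRep
    for n = 1:N
        for m = 1:M
            A2(m, n) = v(m);
        end
    end
end
tElem = toc / nRep;

fprintf('M = %d: colon %.3g s, element-wise %.3g s per pass\n', M, tColon, tElem);
```

Rerunning with larger M (for example 5, 10, 20) should show whether the crossover appears on a given setup.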

Answer 1

Jan on 29 May 2021

https://it.mathworks.com/matlabcentral/answers/842785-array-assigning-performance-vs-indexes#answer_712040

Edited: Jan on 29 May 2021

The performance of loops depends on the JIT acceleration. This is a smart tool, which can recognize repeated commands, reorder them and apply some shortcuts.

The JIT is not completely documented on purpose, because MathWorks does not want the users to optimize their code for the JIT, but to optimize the JIT to the code of the users. Therefore the speed difference between a(:,i)=xy and a(1,i)=xy(1), a(2,i)=xy(2) can vary between Matlab versions.

Sometimes the JIT is blocked or impeded, e.g. if some variables are created dynamically. A load() without catching the output or an eval 200 lines before the loop can degrade the processing speed massively.
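As a small illustration of the load() point, the sketch below catches the output of load() in a struct instead of letting it create variables dynamically. The temporary file and the variable name xy are hypothetical, and whether this changes the loop speed on a particular MATLAB version would need to be measured.

```matlab
% Sketch: avoid dynamically created variables by catching the output of load().
% The temporary file name and the variable xy are hypothetical.
fileName = [tempname, '.mat'];
xy = [3; 4];
save(fileName, 'xy');
clear xy

% load(fileName);       % would create xy dynamically in the workspace
S = load(fileName);     % variables arrive as fields of the struct S
xy = S.xy;              % explicit assignment, visible in the code itself

delete(fileName);       % clean up the temporary file
```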

Remember the good programming practice for the optimization of code:

1. Write correctly running code, then debug and test it exhaustively. If code produces wrong results, the speed does not matter.
2. Document the code exhaustively. If it is fast, allow for re-using it in other projects.
3. Use the profiler to find the bottlenecks. It is a waste of time to improve the speed of a subfunction which takes only 0.5% of the total runtime. Unfortunately the profiler disables important parts of the JIT, so some tic/toc or timeit measurements are required as well.
4. Improve the bottlenecks and compare the results exhaustively with the reference solution.
5. Document why you have modified the code and in which Matlab version this is an advantage. Ship the original code together with your projects, because in a later version of Matlab it may be faster than the optimized version.
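The measure-and-compare part of this practice might look like the sketch below. The loop body (xy * i) and N are placeholders invented for illustration; tic/toc is used here, and timeit on a function wrapping each loop would be an alternative.

```matlab
% Sketch: keep the reference version, verify the modified one, record the version.
% N and the loop body (xy * i) are placeholders for illustration only.
N = 1e5;
xy = [1; 2];

% Reference version: whole-column assignment
aRef = zeros(2, N);
tic;
for i = 1:N
    aRef(:, i) = xy * i;
end
tRef = toc;

% Modified version: explicit element-wise assignment
aOpt = zeros(2, N);
tic;
for i = 1:N
    aOpt(1, i) = xy(1) * i;
    aOpt(2, i) = xy(2) * i;
end
tOpt = toc;

% The modified loop must reproduce the reference results exactly
assert(isequal(aRef, aOpt), 'Modified loop changed the results');

% Record the MATLAB version next to the timings
fprintf('MATLAB %s: reference %.3f s, modified %.3f s\n', version, tRef, tOpt);
```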

Checking what the JIT does with the code would be reverse engineering, which is forbidden by the license agreement. My guess is that for

a(:, i) = xy;

Matlab has to check in each iteration whether size(a,1) matches the number of elements of xy. When scalars are copied, this test can be omitted. On the other hand, Matlab needs to test whether xy has two elements, but the JIT can recognize this automagically. If the JIT were very powerful, it could insert the values of xy directly into a, provided xy is not used outside the loop anymore.
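The kind of size compatibility involved in a colon assignment can be seen from the outside with a small sketch like the one below. It only shows the visible behaviour of the assignment, not what the JIT does internally; the array a and the values are made up for illustration.

```matlab
% Sketch: what a(:, i) = rhs accepts and rejects for a 2-row array.
a = zeros(2, 5);

a(:, 1) = [1; 2];   % 2x1 right-hand side: sizes match
a(:, 2) = [3, 4];   % 1x2 row also works: same number of elements
a(:, 3) = 7;        % scalar is expanded to fill the column

% A 3-element right-hand side does not fit into 2 rows, so an error is raised
mismatchCaught = false;
try
    a(:, 4) = [1; 2; 3];
catch err
    mismatchCaught = true;
    disp(err.message);
end

% Scalar-to-scalar assignments: each target is a single element
a(1, 5) = 8;
a(2, 5) = 9;
```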