Optimize/speed up a big and slow matrix operation with addition and bsxfun?
Hi. I have a fairly long line of code in my script that takes about 7 seconds to run. It was originally three separate calculations, but they are now done in one line that looks like this:
X = bsxfun(@times,reshape(bsxfun(@times,A,B),[1440,1,2251]),(1-C)./2)...
+bsxfun(@times,reshape(E,1440,1,2251),D)...
+bsxfun(@times,reshape(F,1440,1,[]),((1+C)/2));
Since I need to run it several tens of thousands of times, it is causing a fair amount of anxiety in my life at the moment. I don't know how to go about speeding it up further, or whether that is even possible. So I would really appreciate it if any of you optimization gurus here could give me some advice on how, and if, I can make this faster.
The input variables are of size:
A = 2251x1
B = 1440x2251
C = 1x181
D = 1440x181
E = 1440x2251
F = 1440x2251
Thanks!
2 Comments
James Tursa
on 20 Apr 2015
Do you have a C compiler available for compiling mex functions?
James Tursa
on 20 Apr 2015
I assume A is really 1 x 2251 (otherwise bsxfun(@times,A,B) wouldn't work as written)
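For reference, a one-line fix if A really is stored as 2251x1 (a small sketch under that assumption):
AB = bsxfun(@times, A.', B);   % A.' is 1x2251, broadcasts against the 1440x2251 B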
Accepted Answer
John D'Errico
on 19 Apr 2015
You are trying to do billions of multiplies here, resulting in an array that is of size
1440*181*2251
ans =
586700640
So roughly 600 million elements. Since they are double precision numbers, the array requires 8 bytes per element, so the final result alone will need nearly 5 gigabytes of RAM to store.
Oh, by the way, there are three parts to that computation, so your code must generate three such temporary arrays in that computation, that are then summed together. So the internal storage requirements might be as large as 15 gigabytes of RAM, depending on how that code is processed internally.
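A quick check of those numbers (sizes from the question, 8 bytes per double):
bytes_per_array = 1440*181*2251*8    % 4693605120, about 4.7 GB per full-size array
peak_bytes = 3*bytes_per_array       % about 14 GB if all three temporaries coexist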
That it ran in 7 seconds is a testament to the speed of your system, not the slowness of the code.
If you want it to run faster, get more memory. A good idea would be to use a flash-based drive, to allow your CPU to access virtual memory much faster.
5 Comments
John D'Errico
on 19 Apr 2015
I see no commonality in those terms to compress things down into something more efficient.
So first I'd watch your disk to see if it is swapping during that computation. My guess is that virtual memory is being used if you are that close to the edge, and that will cause a serious time loss. That you are seeing all your memory consumed tells me that MATLAB is doing as I expected.
If there is disk thrashing involved, the simple answer is to get more RAM, OR a faster disk. RAM is cheap. A terabyte-sized flash drive is relatively cheap too. Both are VERY cheap compared to YOUR time. People forget these things too often.
Of course, you could also write MEX code to do this, avoiding the creation of those large internal temporary arrays.
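As an illustration of what avoiding the temporaries buys, here is a minimal sketch in plain MATLAB (not MEX): it builds X one page at a time, so each iteration needs only 1440x181 temporaries instead of three full 1440x181x2251 arrays. It assumes A is 1x2251, as James Tursa noted above; a MEX loop would do the same thing without even the per-page temporaries. Whether it is actually faster depends on whether the vectorized version is swapping.
cm = (1-C)/2;                      % 1x181
cp = (1+C)/2;                      % 1x181
AB = bsxfun(@times, A, B);         % 1440x2251, small next to X
X = zeros(1440, 181, 2251);        % the ~4.7 GB result, allocated once
for k = 1:2251
    X(:,:,k) = AB(:,k)*cm ...                % outer product, 1440x181
             + bsxfun(@times, E(:,k), D) ... % E(i,k).*D(i,j)
             + F(:,k)*cp;                    % outer product, 1440x181
end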
Perhaps another option, IF you can tolerate some precision loss in the result, is to convert the arrays to single precision. That will cut your RAM needs in half here, making it run MUCH faster. In fact, this will be a good test of my recommendation, since that computation should not need any swapping.
Therefore, IF the computation runs much more quickly using singles versus doubles, then it says that the problem is insufficient RAM, NOT the number of computations that needed to be done. A single precision multiply or add seems to take about half the time of the same operation using doubles. But more importantly, if swapping can be avoided, then it will take MUCH less time.
A = single(rand(100,1,100));
B = single(rand(1,200,100));
timeit(@() bsxfun(@times,A,B))       % single precision
ans =
0.0024701
Ad = double(A);
Bd = double(B);
timeit(@() bsxfun(@times,Ad,Bd))     % same operation in double precision
ans =
0.0048494
So try converting your arrays to single precision. If the overall run time is cut by about 50%, then you know that swapping is not a factor; even then, you have cut your time roughly in half. If the single-precision run takes far less than half the original time, and you cannot afford the loss of precision, then you know that you really do need more memory and/or a faster drive.
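Applied to the arrays in the question, the test is just a matter of converting the inputs first (variable names as given there):
As = single(A); Bs = single(B); Cs = single(C);
Ds = single(D); Es = single(E); Fs = single(F);
tic
X = bsxfun(@times,reshape(bsxfun(@times,As,Bs),[1440,1,2251]),(1-Cs)./2) ...
    + bsxfun(@times,reshape(Es,1440,1,2251),Ds) ...
    + bsxfun(@times,reshape(Fs,1440,1,[]),((1+Cs)/2));
toc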
PetterS
on 19 Apr 2015
John D'Errico
on 20 Apr 2015
The funny thing is, much computation in the old days was done using single precision. Double precision is more robust though. You can get away with things using doubles that might hurt you with singles.
When you start swapping to disk, things suddenly get much slower. So my argument is that IF the time drops by 50%, then you were not doing any serious amount of swapping with the doubles.
Eps tells you about the potential error in those numbers.
eps('double')
ans =
2.22044604925031e-16
eps('single')
ans =
1.192093e-07
Eps is the gap between 1 and the next larger number that can be represented, i.e., the smallest amount you can add to 1 and get a different result. So think of eps as the granularity of that precision: the least significant bit.
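A quick way to see that granularity, in single precision:
single(1) + eps('single')   > single(1)     % true: one ulp above 1 is representable
single(1) + eps('single')/2 == single(1)    % true: half an ulp rounds back to 1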
Much will depend on what you are doing with those numbers. Is there noise in them anyway, so that trash down around 1 part in 1e7 won't bother you? What are you doing with this result afterwards? On some computations that noise won't matter a bit.
schrodinbug
on 10 Jul 2019
If you're creating a MEX function, would using a library like Eigen provide value?