GPU and convn of a matrix with many other matrices

2 views (last 30 days)
Ayal
Ayal on 27 Jul 2013
Answered: sizhuo liu on 12 Apr 2019
Hello, I wrote code which requires hundreds of thousands of calls to convn. According to the profiler, convn takes over 80% of my code's run time!
I'm trying to get rid of loops. For example, I want to apply many convolution masks to a single matrix. Right now I am looping. Let MyArray be a matrix of size [s s BatchNum] (I convolve a batch of matrices at once instead of doing this for each matrix separately; it's about 500 at a time):
for i = 1:size(convMask, 3)                         % loop over masks
    res(:,:,:,i) = convn(MyArray, convMask(:,:,i));
end
The matrices are small: s = 1 to 9 right now (it won't go past 21, no need), so the batch is about [9, 9, 500] in size. I use a few dozen (at most) convolution masks, each of size < 9.
I want to get rid of the loop and make it even faster, but I don't know how to convolve with many masks at once (see the sketch just below).
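One possible way to drop the loop (a sketch, not the only answer, assuming convMask is [m m K]): convn is an N-D convolution, so pushing the masks into the 4th dimension convolves the whole batch with every mask in a single call, and the singleton dimensions keep batches and masks from mixing.

% Sketch: all masks in one convn call, assuming convMask is [m m K].
K = size(convMask, 3);
masks4d = reshape(convMask, size(convMask,1), size(convMask,2), 1, K);
res = convn(MyArray, masks4d);   % result is [s+m-1, s+m-1, BatchNum, K]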
The next optimization I plan after this one is moving the computations to the GPU. All the operations I need to do on the matrices are fully supported on the GPU (and, please correct me if I'm wrong, looping shouldn't be a problem on the GPU).
But I don't fully understand how the storage of information on the GPU works: if I have a class (inheriting from handle), can I move most of the variables in the structure to the GPU in the constructor and keep them there even as I pass an instance of the class from function to function? If the data remains on the GPU, I assume the convolutions will go faster.
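A minimal sketch of the pattern being asked about, assuming the Parallel Computing Toolbox is installed; the class name ConvBatch and its properties are made up for illustration. Data wrapped in gpuArray in the constructor stays on the GPU for the lifetime of the handle object, no matter which functions it is passed to, and gather() brings results back to the CPU when needed.

% Sketch: keep the batch and masks on the GPU inside a handle class.
% ConvBatch is a hypothetical name; requires Parallel Computing Toolbox.
classdef ConvBatch < handle
    properties
        A      % gpuArray, [s s BatchNum]
        masks  % gpuArray, [m m 1 K]
    end
    methods
        function obj = ConvBatch(A, masks)
            obj.A     = gpuArray(A);                         % copy to GPU once
            obj.masks = gpuArray(reshape(masks, ...
                        size(masks,1), size(masks,2), 1, []));
        end
        function res = convolveAll(obj)
            res = convn(obj.A, obj.masks);   % runs on the GPU, result stays there
        end
    end
end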
Does parallelizing operations for the GPU have as much overhead as parfor? (I don't even know whether parfor is what's used to parallelize on the GPU.)
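For what it's worth, gpuArray operations are implicitly parallel, so no parfor is involved; a quick timing sketch like the one below (sizes chosen arbitrarily here) makes the overhead comparison concrete.

% Sketch: compare CPU vs GPU timing for the batched convolution (arbitrary sizes).
A   = rand(9, 9, 500);
Ms  = rand(5, 5, 1, 32);
tCpu = timeit(@() convn(A, Ms));
Ag  = gpuArray(A);   Msg = gpuArray(Ms);
tGpu = gputimeit(@() convn(Ag, Msg));
fprintf('CPU: %.4f s   GPU: %.4f s\n', tCpu, tGpu);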
Thank you, Ayal Shwartz

Answers (1)

sizhuo liu
sizhuo liu on 12 Apr 2019
I have almost the same problem.
I found that a for-loop of convn on the GPU can be slower than on my CPU... still trying.
