How do I vectorize the nested for loops in this convolution operation example?

18 views (last 30 days)
I created a convolutional neural network and it requires a lot of for loops, which results in a long running time. I would like to speed up the calculation and was told that vectorization is a great way to do that. However, I am confused about how to do this, because I am extracting smaller matrices from a larger matrix, which complicates everything. Any help would be greatly appreciated.
%% initialization set up
bias_1 = zeros(1,1,16);
kernel_1 = -1+2*rand([3,3,16]);
kernelSize_1 = size(kernel_1,2);
stride_1 = 1;
k_start_1 = 1; %kernel start
k_end_1 = 3; %kernel end index (= kernelSize_1)
X = rand(28,28); % normally it's an image, but for simplicity it is just random numbers
% Hyperparameter setup: padding
pad_1 = (kernelSize_1-1)/2; % Pad type: "same"
padValue_1 = 0; % Zero padding
X_padded_1 = padarray(X,[pad_1,pad_1],padValue_1); % padded image [30x30]
%---Calculated-expected-outputsize-of-the-convolution-operation------------
img_height_1 = size(X,1); % input image height = 28
img_width_1 = size(X,2); % input image width = 28
outputHeight_1 = floor((img_height_1 + 2*pad_1 - kernelSize_1)/stride_1 + 1); % calculated output height = 28
outputWidth_1 = floor((img_width_1 + 2*pad_1 - kernelSize_1)/stride_1 + 1); % calculated output width = 28
conv1_output = zeros(outputHeight_1,outputWidth_1,16); % pre-allocated zeros matrix for desired output [28x28x16]
%% convolution operation
for c = 1:16
    for i = 1:outputHeight_1
        for j = 1:outputWidth_1
            % extract feature map from padded image
            featureMap_1 = X_padded_1((i-1) + k_start_1:(i-1) + k_end_1, (j-1) + k_start_1:(j-1) + k_end_1, 1);
            % weighted sum of padded image elementwise multiplied...
            % ...with a single kernel and with a bias added
            S1_cij = sum(featureMap_1.*kernel_1(:,:,c),'all') + bias_1(1,1,c);
            % calculated output with the ReLU activation function
            conv1_output(i,j,c) = max(0,S1_cij); % [28x28x16]
        end
    end
end

Accepted Answer

Jan on 26 Apr 2022
Not a vectorization, but some tiny changes (pulling the loop-invariant kernel slice and bias out of the inner loops and applying the ReLU once after the loop) that let the code run 30% faster (at least in MATLAB Online; test this locally!):
for c = 1:16
    tmp1 = kernel_1(:,:,c);
    tmp2 = bias_1(1,1,c);
    for i = 1:outputHeight_1
        for j = 1:outputWidth_1
            % extract feature map from padded image
            featureMap_1 = X_padded_1((i-1) + k_start_1:(i-1) + k_end_1, ...
                                      (j-1) + k_start_1:(j-1) + k_end_1, 1);
            % weighted sum of padded image elementwise multiplied...
            % ...with a single kernel and with a bias added
            conv1_output(i,j,c) = sum(featureMap_1 .* tmp1, 'all') + tmp2;
        end
    end
end
conv1_output = max(0, conv1_output);
Elapsed time is 0.031382 seconds.
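
As a further illustration of the vectorization the question actually asks for, the whole channel loop can in principle be replaced by one matrix product over all patches. The following is only a sketch, not taken from the accepted answer: it assumes the Image Processing Toolbox's im2col is available and keeps the stride-1, "same" zero padding from the question, and it should be checked against the loop result.
% Sketch: vectorize by gathering all 3x3 patches as columns and applying all kernels at once
cols = im2col(X_padded_1, [3 3], 'sliding');      % 9 x 784: each column is one 3x3 patch (column-major)
W    = reshape(kernel_1, 9, 16);                  % 9 x 16: each column is one flattened kernel
S    = W.' * cols + reshape(bias_1, 16, 1);       % 16 x 784: all weighted sums plus bias in one product
conv1_output = max(0, reshape(S.', 28, 28, 16));  % ReLU and restore the [28 x 28 x 16] layout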
  2 Comments
Jan on 27 Apr 2022
for c = 1:16
    tmp1 = kernel_1(:,:,c);
    for i = 1:outputHeight_1
        for j = 1:outputWidth_1
            % extract feature map from padded image
            featureMap_1 = X_padded_1((i-1) + k_start_1:(i-1) + k_end_1, ...
                                      (j-1) + k_start_1:(j-1) + k_end_1, 1);
            % weighted sum of padded image elementwise multiplied...
            % ...with a single kernel (bias is now added after the loop)
            conv1_output(i,j,c) = sum(featureMap_1 .* tmp1, 'all');
        end
    end
end
conv1_output = max(0, conv1_output + bias_1);
I've moved the addition of the bias out of the loop. Now the rest looks like a job for conv or even conv2.
I'll try it again in the evening.
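
A rough sketch of what that conv2 version could look like (untested; it assumes stride 1 and the same "same" zero padding as above, and since conv2 flips the kernel, the kernel is rotated by 180 degrees to match the correlation computed in the loop):
conv1_output = zeros(outputHeight_1, outputWidth_1, 16);
for c = 1:16
    % conv2 with 'same' pads with zeros implicitly, so the unpadded X is used;
    % rot90(...,2) flips the kernel so the true convolution matches the loop's correlation
    conv1_output(:,:,c) = conv2(X, rot90(kernel_1(:,:,c), 2), 'same');
end
conv1_output = max(0, conv1_output + bias_1); % bias added via implicit expansion, then ReLU
A quick check such as max(abs(conv1_output(:) - conv1_output_loop(:))), with conv1_output_loop being a copy of the result from the original loop, should then be on the order of eps.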


More Answers (0)

Release: R2021a
