How to compute inference time (ms) to compare my Original, Projected and Fine-Tuned models? Error for code generation of convolution1dLayer when convolving over the time dimension

I am trying to compute the inference time of three models (Original, Projected, and Fine-Tuned) so I can compare them not only on my evaluation metrics and size (number of learnable parameters), but also on inference speed. I am following this example: https://it.mathworks.com/help/deeplearning/ug/compress-network-for-estimating-soc.html. The architectures of my networks are as follows:
Original Net:
  1. 'input' Sequence Input Sequence input with 1 dimensions
  2. 'conv1' 1-D Convolution 10 8×1 convolutions with stride 1 and padding 'same'
  3. 'batchnorm1' Batch Normalization Batch normalization with 10 channels
  4. 'relu1' ReLU ReLU
  5. 'gru1' GRU GRU with 32 hidden units
  6. 'output' Fully Connected 1 fully connected layer
Projected and Fine-Tuned Net:
  1. 'input' Sequence Input Sequence input with 1 dimensions
  2. 'conv1' 1-D Convolution 10 8×1 convolutions with stride 1 and padding 'same'
  3. 'batchnorm1' Batch Normalization Batch normalization with 10 channels
  4. 'relu1' ReLU ReLU
  5. 'gru1' Projected Layer Projected GRU with 32 hidden units
  6. 'output' Projected Layer Projected fully connected layer with output size 1
This is my code:
cfg = coder.config("mex");
cfg.TargetLang = "C++";
cfg.DeepLearningConfig = coder.DeepLearningConfig("none");
noisyInputType = coder.typeof('double', [Inf 1], [1 0]);
codegen -config cfg FinalFineTuned_predict -args {noisyInputType}
codegen -config cfg FinalProjected_predict -args {noisyInputType}
codegen -config cfg FinalOriginal_predict -args {noisyInputType}
Where the functions are:
function out = FinalOriginal_predict(in) %#codegen
% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, thus avoiding reconstructing and reloading the
% network object.
% Copyright 2019-2021 The MathWorks, Inc.
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('1DCNN_LSTM07.mat');
end
outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
out = extractdata(outDlarray);
end
2nd function:
function out = FinalProjected_predict(in) %#codegen
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('FinalProjected_unpacked.mat');
end
outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
out = extractdata(outDlarray);
end
3rd function:
function out = FinalFineTuned_predict(in) %#codegen
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('FinalFineTuned_unpacked.mat');
end
outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
out = extractdata(outDlarray);
end
I had to unpack the projected layers in both the Projected and Fine-Tuned networks; otherwise I got an error while compiling.
In all cases, the error I am now encountering is: "For code generation of convolution1dLayer, when convolving over the time dimension ('T'), the 'T' dimension of the input must be fixed size." Can you help me?
Thank you in advance,
Silvia

Accepted Answer

Katja Mogalle on 7 Nov 2024
What the error message ("the 'T' dimension of the input must be fixed size.") is trying to say is that C/C++ code generation for networks containing a 1-D convolution layer is not supported when the sequences have variable length. All sequences in your inference data must have the same number of time steps.
So, let's assume all your sequences have 100 time steps, then you need to specify the input data to the codegen command as follows:
noisyInputType = coder.typeof('double', [100 1], [false false]);
codegen -config cfg FinalOriginal_predict -args {noisyInputType}
If you do have variable-length sequences, you would have to truncate them on the left or right side to make them all a fixed length (or pad the shorter sequences). If you want to do this in MATLAB, you can use the padsequences function.
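For illustration, here is a minimal, hypothetical sketch of padding with padsequences, assuming sequences are stored time-first ([T 1]) as in the coder.typeof specification above (the example data and the choice of left-padding with zeros are assumptions, not from the original thread):

```matlab
% Hypothetical cell array of three variable-length sequences,
% each stored time-first ([T 1]) to match the codegen input spec.
seqs = {rand(80,1); rand(100,1); rand(95,1)};

% Pad along the time dimension (dim 1) so every sequence has the
% length of the longest one; shorter sequences get zeros on the left.
padded = padsequences(seqs, 1, Direction="left", PaddingValue=0);
```

After padding, every sequence has the same number of time steps, so the 'T' dimension can be declared fixed-size for codegen.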
Hope that helps.
5 Comments
Silvia on 13 Nov 2024
I managed to compute the inference time and compare my models.
What I see is that, counterintuitively, the fine-tuned model shows a higher inference time than the original model when measured as in the example. The one thing I did differently is that, before passing the fine-tuned network to the coder, I had to take an extra step: calling the unpackProjectedLayers function, because codegen otherwise errors out on the fine-tuned network. So I suspect it is the use of this function that increases the inference time. Is that possible, or is it unrelated?
Thank you, as always, for your help!
Silvia
Katja Mogalle on 14 Nov 2024
I am certain the act of unpacking the projected layers is not the issue. Just make sure to save the "unpacked" network to the MAT-file before re-generating the C/C++ code.
However, inference speed is a tricky thing to measure and to understand. First, we need to make sure we have reliable measurements. This documentation example shows how to use timeit to measure inference speed and compare the original network against the projected network: https://www.mathworks.com/help/deeplearning/ug/compress-network-for-estimating-soc.html#CompressNetworkForEstimatingBatteryStateOfChargeExample-13
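A minimal sketch of such a measurement, assuming the generated MEX functions carry the default `_mex` suffix from the codegen calls above and that a fixed sequence length of 100 time steps was used (both assumptions, not confirmed in the thread):

```matlab
% Hypothetical timing sketch for the generated MEX functions.
x = rand(100, 1);  % fixed-length input: 100 time steps, 1 channel

% Warm up once so loading the network into the persistent variable
% does not count towards the measurement.
FinalOriginal_predict_mex(x);
FinalProjected_predict_mex(x);
FinalFineTuned_predict_mex(x);

% timeit calls each function many times and returns a robust estimate.
tOriginal  = timeit(@() FinalOriginal_predict_mex(x));
tProjected = timeit(@() FinalProjected_predict_mex(x));
tFineTuned = timeit(@() FinalFineTuned_predict_mex(x));

fprintf("Original:   %.3f ms\n", 1000*tOriginal);
fprintf("Projected:  %.3f ms\n", 1000*tProjected);
fprintf("Fine-tuned: %.3f ms\n", 1000*tFineTuned);
```

Using timeit (rather than a single tic/toc) averages away run-to-run noise, and the warm-up call keeps the one-time network-loading cost out of the comparison.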
Just to double-check: you are running the generated code on the CPU, not a GPU, correct? And you are not using any third-party deep learning libraries for codegen?
Would you be able to share your inference measurements (original network vs. projected network)?
The next thing we can look at is how much each layer was compressed by the projection technique. If a layer was not compressed very much, projection can even hurt inference speed, because projection carries some overhead. If you are using MATLAB R2024a or newer, you can use the analyzeNetwork function to analyze the projected network (before unpacking). If you see any small (or even negative) values in the "Learnables Reduction" column of the layer analysis table, you should consider not projecting those layers (by using the LayerNames argument of the compressNetworkUsingProjection function).
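As a hypothetical sketch of this workflow (the variable names `projectedNet`, `originalNet`, and the minibatchqueue `mbq` are assumptions; `"gru1"` is taken from the architecture listed in the question):

```matlab
% Inspect per-layer compression of the projected network (R2024a+),
% BEFORE unpacking the projected layers.
analyzeNetwork(projectedNet)   % check the "Learnables Reduction" column

% If, say, only the GRU layer compresses well, re-run projection on
% the original network restricted to that layer:
newProjectedNet = compressNetworkUsingProjection(originalNet, mbq, ...
    LayerNames="gru1");
```

Restricting projection to layers with a large learnables reduction keeps the compression benefit while avoiding the projection overhead on layers that barely shrink.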
