Custom regression output layer

Hello everyone. Following the example at https://it.mathworks.com/help/deeplearning/ug/define-custom-regression-output-layer.html, I tried to build a regression output layer that uses the MSE error. Starting from the provided script, I wrote this loss function:
function loss = forwardLoss(layer, Y, T)
loss = mse(Y,T);
end
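For context, a minimal sketch of how the full class file might look, following the pattern from the linked documentation example (the constructor and description string are assumptions; only the forwardLoss body is from my code):

```matlab
% Hypothetical sketch of mseRegressionLayer.m, modeled on the
% "Define Custom Regression Output Layer" documentation example.
classdef mseRegressionLayer < nnet.layer.RegressionLayer
    methods
        function layer = mseRegressionLayer(name)
            % Set layer name and description.
            layer.Name = name;
            layer.Description = 'Mean squared error';
        end

        function loss = forwardLoss(layer, Y, T)
            % Return the loss between predictions Y and targets T.
            % Note: forwardLoss must return a scalar.
            loss = mse(Y, T);
        end
    end
end
```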
But when I tried training on a dataset in MATLAB with
net = trainNetwork(bodyfatInputs,bodyfatTargets,layers,options);
it gave me:
Error using trainNetwork (line 170)
Error using 'forwardLoss' in Layer mseRegressionLayer. The function threw an error and could not be executed.
I built the layers as follows:
layers = [
sequenceInputLayer(13)
lstmLayer(100)
fullyConnectedLayer(1)
mseRegressionLayer('mse')];
What did I do wrong?
Thanks for your help

7 Comments

Did you test your custom layer with the checkLayer function? What version of MATLAB are you using?
The version is 2020a. As for checkLayer, it's not really clear to me what I should pass as validInputSize: I am using bodyfat_dataset with inputs 13x252 and targets 1x252, and the last layer should take only one input from fullyConnectedLayer(1), if I understood correctly.
Thanks
For a sequence-to-one regression, the checkLayer documentation defines the valid input size as R-by-N, where R is the number of responses and N is the number of observations.
Yes, in your case R would be 1.
Thanks for the help. It gives me the output below, but I don't understand why mse(Y,T) is not scalar:
Skipping GPU tests. No compatible GPU device found.
Running nnet.checklayer.TestOutputLayerWithoutBackward
...
================================================================================
Verification failed in nnet.checklayer.TestOutputLayerWithoutBackward/forwardLossIsScalar(Observations=multiple).
----------------
Test Diagnostic:
----------------
Incorrect size of 'loss' for 'forwardLoss'. The loss must be scalar.
---------------------
Framework Diagnostic:
---------------------
verifySize failed.
--> The value had an incorrect size.
Actual Size:
1 2
Expected Size:
1 1
Actual Value:
1×2 single dlarray
2.9926758 3.3427734
------------------
Stack Information:
------------------
In C:\Program Files\MATLAB\R2020a\toolbox\nnet\cnn\+nnet\+checklayer\TestOutputLayerWithoutBackward.m (TestOutputLayerWithoutBackward.forwardLossIsScalar) at 33
================================================================================
....
================================================================================
Verification failed in nnet.checklayer.TestOutputLayerWithoutBackward/backwardPropagationDoesNotError(Observations=multiple).
----------------
Test Diagnostic:
----------------
Unable to backward propagate through the layer. Check that the 'forwardLoss' function fully supports automatic differentiation. Alternatively, implement the 'backwardLoss' function manually.
---------------------
Framework Diagnostic:
---------------------
Error using dlfeval (line 43)
Value to differentiate must be a traced dlarray scalar.
------------------
Stack Information:
------------------
In C:\Program Files\MATLAB\R2020a\toolbox\nnet\cnn\+nnet\+checklayer\TestOutputLayerWithoutBackward.m (TestOutputLayerWithoutBackward.backwardPropagationDoesNotError) at 69
================================================================================
.
Done nnet.checklayer.TestOutputLayerWithoutBackward
__________
Failure Summary:
Name Failed Incomplete Reason(s)
====================================================================================================================================================
nnet.checklayer.TestOutputLayerWithoutBackward/forwardLossIsScalar(Observations=multiple) X Failed by verification.
----------------------------------------------------------------------------------------------------------------------------------------------------
nnet.checklayer.TestOutputLayerWithoutBackward/backwardPropagationDoesNotError(Observations=multiple) X Failed by verification.
Test Summary:
6 Passed, 2 Failed, 0 Incomplete, 2 Skipped.
Time elapsed: 0.31461 seconds.
I may have done something wrong, sorry. I tried it like this now:
validInputSize = [ 1 13];
>> checkLayer(layer,validInputSize,'ObservationDimension',2);
Skipping GPU tests. No compatible GPU device found.
Running nnet.checklayer.TestOutputLayerWithoutBackward
........
Done nnet.checklayer.TestOutputLayerWithoutBackward
__________
Test Summary:
8 Passed, 0 Failed, 0 Incomplete, 2 Skipped.
Time elapsed: 0.058233 seconds.
Mohammad Sami on 7 Sep 2020 (edited 7 Sep 2020)
It seems the layer is valid, so maybe something else is wrong. Try using the built-in regression layer (which also uses MSE) to verify that nothing else is wrong.


Answers (1)

Hi,
I tried to implement your network on my end and found two problems. First, when using "mse" as the loss function, it is advisable to specify the 'DataFormat' argument as well; for an example, see this page. So modify this line in your definition of 'mseRegressionLayer.m':
loss = mse(Y,T);
%change to
loss = mse(Y,T,'DataFormat','T'); %for sequences
Coming to the problem you are trying to solve:
The "bodyfat_dataset" consists of two important vectors: X of size 13-by-252 and targets T of size 1-by-252. From my understanding, you would like to create an LSTM network that accepts a sequence of 13 features and predicts the body fat percentage. This is a sequence-to-one regression problem, and as advised here, I redesigned your network as follows:
layers = [
sequenceInputLayer(13)
lstmLayer(100,'OutputMode',"last") % Output the last time step of the sequence
fullyConnectedLayer(1)
mseRegressionLayer('mse')];
However, in this output mode the input must be a cell array. You can convert it as follows:
N = 240; %number of sequences
cellArrTrain = cell(N,1);
for i = 1:N
seq = xtrain(:,i);
seq = num2cell(seq,1);
cellArrTrain(i) = seq;
end
% ------ FOR TRAINING PURPOSES -------%
net = trainNetwork(cellArrTrain,Ytrain,layers,options); %be cautious of the dimensions of
% cellArrTrain and Ytrain, they should match.
% Similarly convert the test data into a cell array too
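As a side note, the loop above can also be written in one line with num2cell (a sketch, assuming xtrain is the 13-by-252 input matrix):

```matlab
% Split the first N columns of xtrain into an N-by-1 cell array,
% one 13-by-1 sequence per cell (equivalent to the loop above).
N = 240;
cellArrTrain = num2cell(xtrain(:, 1:N), 1)';
```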
Hope this helps!

5 Comments

Thanks for your help!
I tried to use your code... Why is N = 240? Shouldn't it be 252, since bodyfatTargets is 1x252?
Also trying with
net = trainNetwork(cellArrTrain,bodyfatTargets,layers,options);
gives me the error
Error using trainNetwork (line 170)
Invalid training data. X and Y must have the same number of observations.
Then I tried making bodyfatTargets a cell array too, doing
cellArrTargets = cell(N,1);
for i = 1:N
seq = bodyfatTargets(i);
seq = num2cell(seq,1);
cellArrTargets(i) = seq;
end
net = trainNetwork(cellArrTrain,cellArrTargets,layers,options);
but still error
Error using trainNetwork (line 170)
Invalid training data. For a recurrent layer with output mode 'last',
responses must be a matrix of numeric responses.
What do you suggest?
Thanks again!
Hi Fabrizio,
I guess I left out some details in my previous answer. "N = 240" is just the number of training data points, i.e. I took the first 240 columns of X as the training data and created cellArrTrain, a 240-by-1 cell array in which each element is a 13-by-1 column vector. The targets must also be in column-vector format, so the bodyfatTargets variable must be a 240-by-1 column vector. In my case, I just used:
xtrain = X;
%create the cellArrTrain from the given method
targets = T(1:240)';
net = trainNetwork(cellArrTrain,targets,layers,options);
If you need help with training options as well, check this page to get some ideas.
Hope this helps!
Thank you again. Trying it like this, I indeed no longer get errors. The only problem is the result:
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | RMSE | Loss | Rate |
|========================================================================================|
| 1 | 1 | 00:00:00 | 21.29 | 29006.1 | 0.0100 |
| 30 | 30 | 00:00:00 | NaN | NaN | 0.0100 |
I don't know whether it depends on the choice of layers or training options... My full code is this (with the mseRegressionLayer implementation as you suggested):
layer = mseRegressionLayer('mse');
load bodyfat_dataset % dataset with bodyfatInputs and bodyfatTargets
layers = [
sequenceInputLayer(13)
lstmLayer(100,'OutputMode',"last") % Output the last time step of the sequence
fullyConnectedLayer(1)
mseRegressionLayer('mse')];
options = trainingOptions('sgdm');
N = 240; %number of sequences
cellArrTrain = cell(N,1);
for i = 1:N
seq = bodyfatInputs(:,i);
seq = num2cell(seq,1);
cellArrTrain(i) = seq;
end
net = trainNetwork(cellArrTrain,bodyfatTargets(1:240)',layers,options);
YPred = predict(net,bodyfatInputs);
predictionError = YPred - bodyfatTargets;
Looks like the regression loss is too high. Try normalizing the loss in the custom layer like this:
loss = mse(Y,T,'DataFormat','T')/size(Y,2); % normalize over the mini-batch
Also, start with a smaller learning rate, around 0.001. Here is an example of standard training options:
options = trainingOptions('adam', ...
'MaxEpochs',500, ...
'GradientThreshold',1, ...
'InitialLearnRate',0.001, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropPeriod',100, ...
'LearnRateDropFactor',0.1,...
'Verbose',1, ...
'Plots','training-progress');
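For completeness, the normalized loss from the suggestion above might sit in the layer's forwardLoss method like this (a sketch; size(Y,2) is assumed to be the number of observations in the mini-batch):

```matlab
function loss = forwardLoss(layer, Y, T)
    % MSE between predictions Y and targets T, divided by the
    % number of observations so that the result is a scalar
    % averaged over the mini-batch.
    loss = mse(Y, T, 'DataFormat', 'T') / size(Y, 2);
end
```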
You can play around with the number of layers, LSTM nodes, learning rates, and number of epochs. Also, it is advisable to use validation sets, because overfitting is quite common as the number of layers increases.
Now the loss seems more reasonable:
Training on single CPU.
|========================================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | RMSE | Loss | Rate |
|========================================================================================|
| 1 | 1 | 00:00:01 | 20.06 | 201.3 | 0.0010 |
| 50 | 50 | 00:00:02 | 15.77 | 124.4 | 0.0010 |
| 100 | 100 | 00:00:04 | 13.13 | 86.2 | 0.0010 |
| 150 | 150 | 00:00:05 | 12.89 | 83.1 | 0.0001 |
| 200 | 200 | 00:00:06 | 12.55 | 78.7 | 0.0001 |
| 250 | 250 | 00:00:07 | 12.51 | 78.2 | 1.0000e-05 |
| 300 | 300 | 00:00:09 | 12.48 | 77.9 | 1.0000e-05 |
| 350 | 350 | 00:00:11 | 12.48 | 77.9 | 1.0000e-06 |
| 400 | 400 | 00:00:12 | 12.48 | 77.9 | 1.0000e-06 |
| 450 | 450 | 00:00:13 | 12.48 | 77.8 | 1.0000e-07 |
| 500 | 500 | 00:00:14 | 12.48 | 77.8 | 1.0000e-07 |
|========================================================================================|
However, using
YPred = predict(net,bodyfatInputs);
the predictions are basically all the same... In general, I saw that changing some parts of the network changes YPred, but again all the values are basically the same, definitely not in accordance with the targets... What could be the cause of this?
Again, thank you for all the time you are giving me :)

