Error in LSTM layer architecture

Question

Chuck Noise on 14 Sep 2023

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/2020836-error-in-lstm-layer-architecture

Answered: Ben on 18 Sep 2023

Hi

I am playing around with an LSTM network and trying out different things. I am an amateur at this and am figuring things out as I go along.

I have created a regression network that takes in a sequence and tries to estimate certain parameters based on the sequence.

Not being able to get a decent network running with just one layer, I have added a second lstmLayer, and this is where I run into trouble.

When I set up a sequence-to-sequence regression network with these settings:

layers = [
    sequenceInputLayer(3501)
    lstmLayer(12, 'OutputMode', 'sequence')
    fullyConnectedLayer(1)
    regressionLayer
    ];

options = trainingOptions('adam', ...
    'MaxEpochs',150, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.01, ...
    'ValidationData',{XVal, YVal}, ...
    'ValidationFrequency',5, ...
    'Verbose',1, ...
    'Plots', 'training-progress');

I get a trainable network, but when I add a second lstmLayer that outputs only the last time step of the sequence, i.e.

layers = [
    sequenceInputLayer(3501)
    lstmLayer(12, 'OutputMode', 'sequence')
    lstmLayer(12, 'OutputMode', 'last')
    fullyConnectedLayer(1)
    regressionLayer
    ];

I get the error:

"Error using trainNetwork
Invalid training data. For regression tasks, responses must be a vector, a matrix, or a 4-D array of real numeric responses.
Responses must not contain NaNs."

I guess it has something to do with the output of the second lstmLayer vs. the input of the fullyConnectedLayer, and I've been reading documentation for the last hour, but I simply can't figure out why there should be a problem. Can anyone enlighten me here?

Thanks


Answer 1

Ben on 18 Sep 2023

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/2020836-error-in-lstm-layer-architecture#answer_1312802

It looks like the issue is the data you pass to trainNetwork. When the second lstmLayer has OutputMode="last", the network outputs only the last LSTM state: the fullyConnectedLayer operates only on that last state, and the loss computed in regressionLayer compares only that last state with the targets passed to trainNetwork. The targets therefore need to be one value per sequence rather than one value per time step. The error message suggests the responses are still in the sequence-to-sequence format.

Here's an example:

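sequenceLength = 5;
x = {randn(1,sequenceLength)};   % one sequence: 1 feature, 5 time steps
y = {x{1} + 1};                  % one target per time step, in a cell like x
layers = [
    sequenceInputLayer(1)
    lstmLayer(1)                 % default OutputMode is "sequence"
    regressionLayer];
opts = trainingOptions("adam");
net = trainNetwork(x,y,layers,opts);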

This LSTM takes in a sequence, outputs a sequence, and trainNetwork trains the network by minimizing the loss between the network output and the target data y. We usually call this sequence-to-sequence.
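
With more than one observation, the same layout applies to every sequence. The following is only a sketch of how such data could be put together; numObs, seqLengths, X, and Y are made-up names and values for illustration, not taken from the question.

```matlab
% Sketch: sequence-to-sequence responses for several observations.
% numObs and seqLengths are hypothetical values chosen for illustration.
numObs = 3;
seqLengths = [5 7 4];

X = cell(numObs,1);
Y = cell(numObs,1);
for i = 1:numObs
    X{i} = randn(1,seqLengths(i));   % 1 feature, seqLengths(i) time steps
    Y{i} = X{i} + 1;                 % one target per time step
end

% Each response should have as many time steps as its predictor sequence.
sameLength = cellfun(@(x,y) size(x,2) == size(y,2), X, Y);
assert(all(sameLength))
```

Here both X and Y are cell arrays with one entry per sequence, which is the form a network ending in a "sequence" output expects.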

If you instead want the sequence-to-one case, you can do something like:

sequenceLength = 5;
x = {randn(1,sequenceLength)};   % one sequence: 1 feature, 5 time steps
y = sum(x{1});                   % a single scalar target for the whole sequence
layers = [
    sequenceInputLayer(1)
    lstmLayer(1,OutputMode="last")
    regressionLayer];
opts = trainingOptions("adam");
net = trainNetwork(x,y,layers,opts);

The various supported response types for trainNetwork are described here: https://uk.mathworks.com/help/deeplearning/ref/trainnetwork.html?s_tid=doc_ta#mw_d0b3a2e4-09a0-42f9-a273-2bb25956fe66
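
For the network in the question, one possible way to get responses in the sequence-to-one form is to keep only the final time step of each target sequence. This is a minimal sketch under assumptions: YSeq is a hypothetical stand-in for the per-time-step targets, and whether the last time step is the right target depends on what you want to estimate.

```matlab
% Sketch: turn per-time-step targets into one target per sequence.
% YSeq is a hypothetical stand-in for the responses used in the question.
YSeq = {[1 2 3]; [4 5]; [6 7 8 9]};   % 3 sequences, 1 response each

% Keep only the final time step of every sequence.
lastStep = cellfun(@(y) y(:,end), YSeq, 'UniformOutput', false);

% Stack into an N-by-R matrix (N sequences, R responses).
YLast = cat(2, lastStep{:}).';

disp(YLast)   % YLast is [3; 5; 9]
```

A numeric matrix like YLast, with one row per sequence, is the kind of response the error message asks for. The validation responses passed through 'ValidationData' would presumably need the same conversion.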
