
Calculate predictions with weights and bias extracted from an LSTM model

35 views (last 30 days)
Hello,
I'm trying to calculate outputs using the parameters extracted from an LSTM model (recurrent weights, input weights, bias),
but the output of the code below is different from the output of "Y = predict(net,X)".
Please help me if you know what the problem is.
Thank you.
My network structure (simple network):
layers = [
    sequenceInputLayer(9,"Normalization","none") % number of input features is 9
    lstmLayer(256)
    fullyConnectedLayer(1)];
options = trainingOptions("adam", ...
    MaxEpochs=2000, ...
    SequencePaddingDirection="left", ...
    Shuffle="every-epoch", ...
    Plots="training-progress", ...
    Verbose=false);
net = trainnet(X,Y,layers,"mse",options);
My code to extract the weights and bias:
% Keep the raw matrices under separate names so they are not overwritten
% by the gate structs (W, R, B) created below.
Rmat = net.Layers(2,1).RecurrentWeights;
Wmat = net.Layers(2,1).InputWeights;
bvec = net.Layers(2,1).Bias;
Fc_W = net.Layers(3,1).Weights;
Fc_B = net.Layers(3,1).Bias;
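As a quick sanity check (optional, just for reference): with 9 input features and 256 hidden units, the extracted parameters should have the sizes below. lstmLayer stacks the four gates along the first dimension in the order input, forget, cell candidate, output, which is why the next code splits them in blocks of HiddenLayersNum rows.
% Optional size check (assumes 9 inputs and 256 hidden units, as above)
numHidden = 256;
numInputs = 9;
assert(isequal(size(Wmat), [4*numHidden numInputs])) % InputWeights:     1024-by-9
assert(isequal(size(Rmat), [4*numHidden numHidden])) % RecurrentWeights: 1024-by-256
assert(isequal(size(bvec), [4*numHidden 1]))         % Bias:             1024-by-1
assert(isequal(size(Fc_W), [1 numHidden]))           % FC weights:       1-by-256
assert(isequal(size(Fc_B), [1 1]))                   % FC bias:          1-by-1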
Code to extract the parameters of the LSTM layer (input, forget, cell, output gates):
HiddenLayersNum = 256;
W.Wi=Wmat(1:HiddenLayersNum,:);
W.Wf=Wmat(HiddenLayersNum+1:2*HiddenLayersNum,:);
W.Wc=Wmat(2*HiddenLayersNum+1:3*HiddenLayersNum,:);
W.Wo=Wmat(3*HiddenLayersNum+1:4*HiddenLayersNum,:);
R.Ri=Rmat(1:HiddenLayersNum,:);
R.Rf=Rmat(HiddenLayersNum+1:2*HiddenLayersNum,:);
R.Rc=Rmat(2*HiddenLayersNum+1:3*HiddenLayersNum,:);
R.Ro=Rmat(3*HiddenLayersNum+1:4*HiddenLayersNum,:);
B.bi=bvec(1,:);
B.bf=bvec(HiddenLayersNum+1:2*HiddenLayersNum,:);
B.bc=bvec(2*HiddenLayersNum+1:3*HiddenLayersNum,:);
B.bo=bvec(3*HiddenLayersNum+1:4*HiddenLayersNum,:);
h=net.State.Value{1,1}; % Hidden state
c=net.State.Value{2,1}; % Cell state
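For reference, the splits above correspond to the standard LSTM gate equations (my notation; Wi, Ri, bi etc. are the blocks extracted above, sigma is the logistic sigmoid, x_t the current input, h_{t-1} and c_{t-1} the previous hidden and cell states):
i_t = sigma(Wi*x_t + Ri*h_{t-1} + bi)   (input gate)
f_t = sigma(Wf*x_t + Rf*h_{t-1} + bf)   (forget gate)
g_t = tanh(Wc*x_t + Rc*h_{t-1} + bc)    (cell candidate)
o_t = sigma(Wo*x_t + Ro*h_{t-1} + bo)   (output gate)
c_t = f_t .* c_{t-1} + i_t .* g_t       (cell state)
h_t = o_t .* tanh(c_t)                  (hidden state)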
Code to calculate the LSTM layer output:
% Input Gate
Z = W.Wi*X+R.Ri*h+B.bi; % x is the new input value for prediction (ex: x=[1 5 20 1 2];)
I = 1.0 ./ (1.0 + exp(-Z)); % Input gate
% Forget Gate
f =W.Wf*X+R.Rf*h+B.bf;
F = 1.0 ./ (1.0 + exp(-f)); % Forget gate
% Layer Input
g=W.Wc*X+R.Rc*h+B.bc; % Layer input
G=tanh(g);
% Output Layer
output = W.Wo*X+R.Ro*h_prev+B.bo;
output = 1.0 ./ (1.0 + exp(-output)); % Output Gate
% Cell State
cellgate=F.*c+I.*G; % Cell Gate
cellgate=cellgate;
% Output (Hidden) State
hidden=O.*tanh(cellgate); % Output State
hidden=dlarray(hidden);
L1 = relu(hidden);
Code to calculate the output of the fully connected layer:
Fc=Fc_W*L1+Fc_B

Accepted Answer

Paras Gupta on 15 Jul 2024 at 19:46
Hi James,
I understand that the provided code for your model, which includes an "LSTM" layer and a "Fully Connected" layer, gives different inference results from the trained model's "predict" function.
The provided code seems to have the following three issues:
  • Incorrect bias initialization for the input gate - The bias for the input gate should index the first "HiddenLayersNum" elements of the bias vector.
  • Incorrect variable used for the output state calculation - The variable "output" should be used instead of the undefined "O".
  • Incorrect computation and usage of "L1" - The computation of the "L1" variable using the "relu" function and its subsequent use are incorrect; the LSTM hidden state should be passed directly to the fully connected layer.
You can refer to the following modified code to obtain the correct results:
HiddenLayersNum = 256;
W.Wi=Wmat(1:HiddenLayersNum,:);
W.Wf=Wmat(HiddenLayersNum+1:2*HiddenLayersNum,:);
W.Wc=Wmat(2*HiddenLayersNum+1:3*HiddenLayersNum,:);
W.Wo=Wmat(3*HiddenLayersNum+1:4*HiddenLayersNum,:);
R.Ri=Rmat(1:HiddenLayersNum,:);
R.Rf=Rmat(HiddenLayersNum+1:2*HiddenLayersNum,:);
R.Rc=Rmat(2*HiddenLayersNum+1:3*HiddenLayersNum,:);
R.Ro=Rmat(3*HiddenLayersNum+1:4*HiddenLayersNum,:);
B.bi=bvec(1:HiddenLayersNum,:); % corrected: index the first HiddenLayersNum elements
B.bf=bvec(HiddenLayersNum+1:2*HiddenLayersNum,:);
B.bc=bvec(2*HiddenLayersNum+1:3*HiddenLayersNum,:);
B.bo=bvec(3*HiddenLayersNum+1:4*HiddenLayersNum,:);
h=net.State.Value{1,1}; % Hidden state
c=net.State.Value{2,1}; % Cell state
% Input Gate
Z = W.Wi*X+R.Ri*h+B.bi; % X is the new input value for prediction
I = 1.0 ./ (1.0 + exp(-Z)); % Input gate
% Forget Gate
f = W.Wf*X+R.Rf*h+B.bf;
F = 1.0 ./ (1.0 + exp(-f)); % Forget gate
% Cell Candidate (Layer Input)
g = W.Wc*X+R.Rc*h+B.bc;
G = tanh(g);
% Output Gate
output = W.Wo*X+R.Ro*h+B.bo; % use h (the previous hidden state), consistent with the other gates
output = 1.0 ./ (1.0 + exp(-output)); % Output gate
% Cell State
cellgate = F.*c+I.*G;
% Output (Hidden) State
hidden = output.*tanh(cellgate); % corrected: use "output" instead of "O"
hidden = dlarray(hidden);
% removed L1 computation as it is not required
Fc = Fc_W*hidden+Fc_B % corrected: pass the hidden state directly to the fully connected layer
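To confirm the fix, you can compare the manual result with the network's own prediction on the same input - a minimal sketch, assuming X here is the same single time step that was fed to the manual computation above:
% Optional check: the manual forward pass should match predict (up to floating-point error)
manual = extractdata(Fc);      % pull the manual result out of the dlarray
reference = predict(net, X);   % network prediction on the same input, as in the question
disp(max(abs(manual(:) - reference(:)))) % should be close to 0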
Hope this helps.
  1 Comment
James on 19 Jul 2024 at 12:04
Thank you for your help!
I tried your code and the result shows the same value as "Y = predict(net,X)".
I really appreciate it!

More Answers (0)
