predict function giving different output when input sequences are fed in a loop

23 views (last 30 days)
Hi all, I have trained a bi-LSTM sequence-to-sequence network and used it to test performance. Specifically, each input is of size 20*1 and each output is 1*1. When I use the predict function with all the input sequences passed as one big array (e.g. 20*1e5), the performance is great, but if I go through a loop, the performance is degraded. The code for the loop is:
onet_tst = [];
for i = 1:1e5
    [onet_, state] = predict(dlnet, inp(:, i));
    onet_tst = [onet_tst, onet_];
    dlnet.State = state;
end
So may I know:
1. Does that mean the predict function uses other input sequences to refine the output? I.e., for output i, does it use information from the other input sequences?
2. Is this problem affected by the training parameters selected, say, the mini-batch size?
3. Is there a method/parameter that can make the two ways of using the predict function give equivalent performance?
Thank you all in advance. Attached is the plot for the comparison.

Answers (1)

Sebastian
Sebastian on 23 Feb 2026 at 9:51
Hi there,
I think I see what's happening here.
This is actually expected behavior for stateful LSTMs, but it's subtle and trips up a lot of people.
TL;DR: When you pass all sequences as one big array, MATLAB processes them as independent observations (resets state between them). In your loop, you're carrying over state from sequence i to sequence i + 1, which creates a dependency chain that shouldn't exist if your sequences are independent.
What's happening:
  1. Array input (predict(dlnet, inp)): MATLAB treats each column as an independent observation. The network state is automatically reset between observations, so sequence 5 doesn't "see" what happened in sequence 4. This is the correct behavior for independent test sequences.
  2. Your loop: By doing dlnet.State = state after each prediction, you're explicitly carrying over the LSTM's hidden/cell state to the next sequence. This means sequence i+1 starts with the "memory" of sequence i. If your sequences are supposed to be independent (which they usually are in train/test splits), this creates artificial temporal dependencies that degrade performance.
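A quick way to see this for yourself is to run both paths side by side and compare. A minimal sketch, assuming `dlnet` and `inp` as in your code (and that `dlnet` accepts the raw numeric columns the way your loop already passes them):

```matlab
% Baseline: one vectorized call, each column treated as an
% independent observation
y_array = extractdata(predict(dlnet, inp));

% Stateful loop: carries hidden/cell state across columns
net = dlnet;
y_loop = zeros(1, size(inp, 2));
for i = 1:size(inp, 2)
    [y, state] = predict(net, inp(:, i));
    y_loop(i) = extractdata(y);
    net.State = state;   % this line is what couples the sequences
end

% Nonzero difference confirms the two paths diverge
max(abs(y_array - y_loop))
```

Deleting the `net.State = state` line should drive that difference to (numerical) zero if the sequences really are independent.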
The fix:
If your sequences are independent (which seems to be the case since you mentioned "test performance"), don't update the state in the loop:
onet_tst = zeros(1, 1e5); % Pre-allocate for speed
for i = 1:1e5
    onet_ = predict(dlnet, inp(:,i)); % No state update
    onet_tst(i) = extractdata(onet_);
    % Don't touch dlnet.State!
end
Or better yet, just use the array method, which is vectorized and faster:
onet_tst = predict(dlnet, inp); % This is the correct way for independent sequences
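One caveat (an assumption on my part, since I can't see how `inp` is constructed): with a dlnetwork, the input usually needs to be a formatted dlarray so MATLAB knows which dimension is the batch. For your 20*1e5 layout, each column being one 20-feature observation, that would be roughly:

```matlab
% "CB" = channel x batch: 20 features, 1e5 independent observations.
% Skip the dlarray wrapper if your network's input layer already
% supplies the format.
X = dlarray(inp, "CB");        % 20 x 1e5
onet_tst = predict(dlnet, X);  % one vectorized call
onet_tst = extractdata(onet_tst);
```

If your network instead expects a time dimension, the format label would differ, but the independence-per-column point stands either way.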
To answer your specific questions:
  1. Does predict use other sequences to refine output? No, not in the way you think. The array method processes each sequence independently (correct). The loop method creates artificial dependencies by carrying state over (incorrect for independent data).
  2. Affected by training parameters? Mini-batch size during training affects how gradients are computed, but the key issue here is state management during inference, not training parameters.
  3. How to make them equivalent? Remove the state update from your loop, or use resetState before each prediction if you truly need loop processing:
for i = 1:1e5
    dlnet = resetState(dlnet); % Reset before each sequence
    onet_ = predict(dlnet, inp(:,i));
    ...
end
Why the plot shows "loop" lagging behind: The yellow curve in your plot shows the classic signature of state "bleeding" between sequences - the predictions are smoothed/delayed because the LSTM is trying to maintain temporal continuity where none exists.
When would you want state updates? Only if your 100k sequences are actually one long continuous time series that you artificially chopped into columns. In that case, the loop with state update is correct, but then you'd also want state persistence during training (stateful training), which is a whole different setup.
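If that continuous-series case does apply to you, the inference loop would look roughly like this. A sketch, assuming `dlnet` and `inp` as above:

```matlab
net = resetState(dlnet);          % clean state once, at the start
y = zeros(1, size(inp, 2));
for i = 1:size(inp, 2)
    [out, state] = predict(net, inp(:, i));
    y(i) = extractdata(out);
    net.State = state;            % deliberately carry state:
end                               % the columns form one long series
```

The difference from your original loop is only conceptual: the code is nearly identical, but it is correct only under the "one long series" interpretation of the data.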
Hope this clears it up! The array method is giving you the correct independent predictions.
Best, Seba
1 Comment
Yichang
Yichang about 10 hours ago
Dear Sebastian,
Thank you very much for the answers; now I see the reason for the lagging. But I have actually tried not updating the state before, and the results were not good, as attached. The lagging does improve, but the magnitude is smaller and there is an offset.
And sorry for the confusion, allow me to clarify a bit. The inputs are actually from a time series, with a moving window cropping the latest 20 values. But when they are processed as if they were independent, would the "big array" and "loop" results be consistent? Thank you.
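For reference, the windows are built roughly like this (a sketch with made-up variable names; `x` stands for the raw series):

```matlab
win = 20;
n = numel(x) - win + 1;
inp = zeros(win, n);
for k = 1:n
    inp(:, k) = x(k : k + win - 1);   % latest 20 values, shifted by 1
end
```

So consecutive columns overlap by 19 samples, even though each window is fed as its own observation.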
Regards,
Yichang

Release: R2025b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by