
Why is the peepholeLSTMLayer implemented in a tutorial much slower than the built-in lstmLayer?

4 views (last 30 days)
Why is this implementation of peepholeLSTMLayer much slower than the built-in lstmLayer?
What can be done to speed it up? For example, can it be compiled into binary code?

Answers (1)

Hiro Yoshino on 31 Aug 2023
I suppose that is because the implementation of interest is a custom model, while the built-in LSTM is optimized for computation.
MATLAB has kept improving its performance over the years (see this), so I guess this is also the case with the built-in capabilities in MATLAB.
As for speeding up, you may choose a CPU for the LSTM computation (see Tips).
You can also see this to speed up your custom training.
Hope these help you.
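As a sketch of the suggestion above: the execution environment and per-iteration overhead can be controlled through trainingOptions. The option names are from the Deep Learning Toolbox; the solver choice, epochs, and the XTrain/YTrain/layers variables are placeholders, not taken from the question:

```matlab
% Sketch: training options that commonly affect speed.
% ExecutionEnvironment selects CPU vs GPU; disabling the live training
% plot and verbose output removes per-iteration overhead.
options = trainingOptions("adam", ...
    ExecutionEnvironment="cpu", ...   % or "gpu" if one is available
    MaxEpochs=30, ...                 % placeholder values
    MiniBatchSize=64, ...
    Plots="none", ...                 % the live plot slows training down
    Verbose=false);

% XTrain, YTrain and layers stand in for your own data and network.
net = trainNetwork(XTrain, YTrain, layers, options);
```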
  1 Comment
Artem Lensky on 1 Sep 2023
Edited: Artem Lensky on 1 Sep 2023
Hi Hiro,
Thanks for the prompt reply. Yes, I train GRU/LSTM and peephole LSTM networks on a CPU. The peephole version is not just slower, it is slower by a factor of 100 compared to the standard LSTM. Luckily, this time I don't use custom training loops; the network is trained by the built-in train function. The model is extremely simple, e.g. one layer with 8 peephole LSTM units. The dimension of the input signals is 5 by (25k~30k).
 1  'sequenceInputLayer'  Sequence Input         Sequence input with 5 dimensions
 2  'rnn_1'               peepholeLSTMLayer      Peephole LSTM with 8 hidden units
 3  'fc'                  Fully Connected        3 fully connected layer
 4  'softmax'             Softmax                softmax
 5  'classoutput'         Classification Output  crossentropyex
I ran the profiler (just 13 iterations of training); see below what I got. Any ideas how I can speed it up? Perhaps by updating the tutorial code, or by compiling it to binary code, e.g. as a MEX function. There must be something; it is just too slow. Thanks again!
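One hypothetical source of the slowdown: if the custom layer multiplies the input weights inside its per-time-step loop, that projection can be hoisted out of the loop, since it does not depend on the recurrent state, and computed for the whole sequence in one matrix multiply. The variable names and sizes below are illustrative, not the tutorial's exact code, and the gate update is a plain LSTM cell with the peephole terms left as a comment:

```matlab
% Sketch: hoist the input projection out of the time loop.
numFeatures = 5; H = 8; T = 1000;         % sizes as in the question
X  = randn(numFeatures, T);               % input sequence
Wx = randn(4*H, numFeatures);             % input weights (i,f,g,o stacked)
Wh = randn(4*H, H);                       % recurrent weights
b  = randn(4*H, 1);                       % bias
h  = zeros(H, 1); c = zeros(H, 1);        % initial states
sig = @(x) 1./(1 + exp(-x));              % logistic sigmoid

WX = Wx*X + b;                            % one large GEMM, outside the loop
for t = 1:T
    Z = WX(:,t) + Wh*h;                   % only the recurrent part per step
    i = sig(Z(1:H));                      % (real layer adds peephole terms)
    f = sig(Z(H+1:2*H));
    g = tanh(Z(2*H+1:3*H));
    o = sig(Z(3*H+1:4*H));
    c = f.*c + i.*g;
    h = o.*tanh(c);
end
```

The recurrent multiply is unavoidable per step, but moving the input projection into one large matrix product usually helps, since MATLAB's overhead is per-operation rather than per-element.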




