Why do I see a drop (or jump) in my final validation accuracy when training a deep learning network?

Answer 1

MathWorks Support Team on 19 Feb 2019

If the network contains batch normalization layers, the final validation metrics often differ from the validation metrics evaluated during training. After the last iteration, the network undergoes a 'finalization' step that computes the batch normalization layer statistics over the entire training data, whereas during training those statistics are computed from each mini-batch.
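
To make the difference concrete, here is a minimal sketch in plain Python (not MATLAB, and not the toolbox's actual implementation). It normalizes the same activation once with statistics from a single mini-batch and once with statistics from the whole data set. The data distribution, the mini-batch size of 32, and the helper name `normalize` are assumptions chosen only for illustration.

```python
import random
import statistics

def normalize(x, mean, var, eps=1e-5):
    """Batch-norm style normalization, without the learned scale and offset."""
    return (x - mean) / (var + eps) ** 0.5

random.seed(0)
# Hypothetical activations of one channel over the whole training set.
data = [random.gauss(2.0, 3.0) for _ in range(10_000)]

# During training: statistics come from the current mini-batch only.
batch = data[:32]
batch_mean = statistics.fmean(batch)
batch_var = statistics.pvariance(batch)

# After finalization: statistics come from the entire training set.
full_mean = statistics.fmean(data)
full_var = statistics.pvariance(data)

x = 5.0  # the same input activation in both cases
out_batch = normalize(x, batch_mean, batch_var)
out_full = normalize(x, full_mean, full_var)
print("normalized with mini-batch statistics:", out_batch)
print("normalized with full-data statistics: ", out_full)
```

The two printed values differ, so every layer after the batch normalization layer sees slightly different inputs once the statistics are swapped, and that can move the validation metric.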

If the network contains dropout layers in addition to batch normalization layers, the interaction between the two can aggravate this issue, as described here: https://arxiv.org/abs/1801.05134
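
One way to see where the interaction comes from is a rough Python sketch. It assumes "inverted" dropout (surviving units are rescaled by 1/keep probability during training, and dropout is the identity at inference) and compares the variance of the activations in both modes. The keep probability and the input distribution are assumptions picked for illustration.

```python
import random
import statistics

random.seed(0)
keep_prob = 0.5        # assumed dropout keep probability
mu, sigma = 1.0, 1.0   # assumed mean and standard deviation of the inputs

x = [random.gauss(mu, sigma) for _ in range(100_000)]

# Training mode: zero out units at random, rescale survivors by 1/keep_prob.
train = [xi / keep_prob if random.random() < keep_prob else 0.0 for xi in x]

# Inference mode: dropout does nothing.
infer = x

var_train = statistics.pvariance(train)
var_infer = statistics.pvariance(infer)

# Theory for inverted dropout: Var = (sigma^2 + mu^2) / keep_prob - mu^2,
# which is 3 for these settings, versus sigma^2 = 1 at inference.
print("variance in training mode: ", var_train)
print("variance in inference mode:", var_infer)
```

A batch normalization layer that follows the dropout layer collects its statistics under the training-mode variance but is applied to inference-mode activations, so the normalization no longer matches the data it sees.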

If you remove the batch normalization (and dropout) layers from the network, the 'final' accuracy should match the accuracy at the last iteration.

Increasing the size of the mini-batches can also alleviate this issue, since the statistics from a larger mini-batch may be better estimates of the entire training data statistics.
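
The effect of the mini-batch size can be checked with a small experiment. The Python sketch below (synthetic data, hypothetical helper `mean_abs_error`) measures how far the mean of a random mini-batch lands from the mean of the full data set, averaged over many draws, for several batch sizes.

```python
import random
import statistics

random.seed(0)
# Hypothetical activations of one channel over the whole training set.
data = [random.gauss(2.0, 3.0) for _ in range(20_000)]
full_mean = statistics.fmean(data)

def mean_abs_error(batch_size, trials=200):
    """Average |mini-batch mean - full-data mean| over random mini-batches."""
    errors = []
    for _ in range(trials):
        batch = random.sample(data, batch_size)
        errors.append(abs(statistics.fmean(batch) - full_mean))
    return statistics.fmean(errors)

errors = {size: mean_abs_error(size) for size in (16, 64, 256, 1024)}
for size, err in errors.items():
    print(f"mini-batch size {size:5d}: mean abs error {err:.3f}")
```

The error shrinks as the mini-batch grows, which is why larger mini-batches tend to narrow the gap between the last-iteration and final metrics. In MATLAB, the mini-batch size is set with the `MiniBatchSize` option of `trainingOptions`.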
