The gradient of mini batches

Question

MAHSA YOUSEFI on 23 Nov 2020

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/658543-the-gradient-of-mini-batches

Commented: Mahesh Taparia on 21 Dec 2020

Accepted Answer: Mahesh Taparia

Hi there.

I need your confirmation or rejection for this question...

In the following code, if the mini-batch size is h,

[grad,loss] = dlfeval(@modelGradients,dlnet,dlX_miniBatch,Y_miniBatch);

is grad the average of the gradients of the loss over the h samples? Does it calculate the gradients automatically and, at the end, return:

grad = (1/h) * sum_{i=1}^{h} \nabla loss(y_i, yHat_i) ?

Following on from this question, to compute the total loss and gradient (for a full batch), should we take the average of the losses and the average of the gradients (averaging over the number of batches, say 1000 batches each of size h)?

Answer 1

Mahesh Taparia on 14 Dec 2020

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/658543-the-gradient-of-mini-batches#answer_575280

Hi

The function dlfeval evaluates custom deep learning models. The loss is calculated as defined in the modelGradients function, so if you compute the average loss in that function, it returns the averaged value. For example, consider this modelGradients function: it calculates the average cross-entropy loss, so it returns the average loss. The gradients are calculated with respect to the loss function defined for the network.
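Because differentiation is linear, the gradient of an averaged loss is the average of the per-sample gradients, and averaging the gradients (or losses) of equal-sized mini-batches reproduces the full-batch values. The script below is only a minimal sketch to check this numerically, not code from the documentation: the linear model, the squared-error loss, and the names W, X, T and meanLossGradient are all assumptions made for illustration. It requires Deep Learning Toolbox.

```matlab
% Sketch only: W, X, T and meanLossGradient are hypothetical names.
% Requires Deep Learning Toolbox (dlarray, dlfeval, dlgradient).
rng(0);
h = 2;  numBatches = 3;  N = h*numBatches;
W = dlarray(randn(1,3));     % learnable weights of a toy linear model
X = randn(3,N);              % N samples stored as columns
T = randn(1,N);              % one target per sample

% Full-batch gradient and loss (loss averaged over all N samples).
[gradFull, lossFull] = dlfeval(@meanLossGradient, W, X, T);

% Average the gradients and losses of the equal-sized mini-batches.
gradSum = zeros(size(W));
lossSum = 0;
for b = 1:numBatches
    idx = (b-1)*h + (1:h);                       % columns of mini-batch b
    [g, l] = dlfeval(@meanLossGradient, W, X(:,idx), T(idx));
    gradSum = gradSum + extractdata(g);
    lossSum = lossSum + extractdata(l);
end
gradAvg = gradSum / numBatches;
lossAvg = lossSum / numBatches;

% Both differences should be at floating-point round-off level.
gradDiff = max(abs(extractdata(gradFull) - gradAvg));
lossDiff = abs(extractdata(lossFull) - lossAvg);
fprintf('gradient difference: %g, loss difference: %g\n', gradDiff, lossDiff);

function [grad, loss] = meanLossGradient(W, X, T)
    Y = W * X;                                   % 1-by-n predictions
    loss = sum((Y - T).^2, 2) / size(X, 2);      % mean over the n samples passed in
    grad = dlgradient(loss, W);                  % gradient of that mean loss
end
```

Setting h = 1 turns each mini-batch into a single sample, which covers the per-sample case from the question. If the mini-batches do not all have the same size (for example, a smaller last batch), a plain average over batches is no longer exact; weighting each batch by its number of samples restores the full-batch value.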

MAHSA YOUSEFI on 19 Dec 2020

In the example you mentioned, there is a mistake.

function [gradients, loss] = modelGradients(parameters, dlX, T)
    % Forward data through the model function.
    dlY = model(parameters,dlX);
    % Compute loss.
    loss = crossentropy(dlX,T);
    % Compute gradients.
    gradients = dlgradient(loss,parameters);
end

dlY must be fed to crossentropy!

Mahesh Taparia on 21 Dec 2020

Yeah, the crossentropy loss should be calculated between dlY and T. The documentation page will be updated.
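For reference, the correction agreed on in this thread amounts to passing dlY instead of dlX to crossentropy. The script below is a sketch of how the fixed function might look. The model function is not shown in the thread, so the single fully connected layer with softmax used here, along with the toy parameters and data, is a hypothetical stand-in added only to make the example runnable.

```matlab
% Sketch only: the toy data and the body of model are hypothetical.
% Requires Deep Learning Toolbox.
rng(0);
parameters.Weights = dlarray(0.1*randn(3,4));   % 3 classes, 4 features
parameters.Bias    = dlarray(zeros(3,1));
dlX = dlarray(randn(4,5), 'CB');                % 5 observations, channel-by-batch
T   = zeros(3,5);  T(1,:) = 1;                  % one-hot targets (all class 1)

[gradients, loss] = dlfeval(@modelGradients, parameters, dlX, T);

function [gradients, loss] = modelGradients(parameters, dlX, T)
    % Forward data through the model function.
    dlY = model(parameters, dlX);
    % Compute loss between the predictions dlY and the targets T.
    loss = crossentropy(dlY, T);
    % Compute gradients of the loss with respect to the parameters.
    gradients = dlgradient(loss, parameters);
end

function dlY = model(parameters, dlX)
    % Hypothetical stand-in for the model function, which the thread does not show.
    dlY = softmax(fullyconnect(dlX, parameters.Weights, parameters.Bias));
end
```

Here gradients is a struct with the same fields as parameters, and, following the answer above, loss is the cross-entropy averaged over the mini-batch.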
