Training error when using selfAttentionLayer with dropout

I want to use selfAttentionLayer to build a time-series prediction model. However, when I use selfAttentionLayer with dropout enabled, training fails. The error messages are as follows:
Error using max
Out of memory on device. To view more detail about available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU by calling 'gpuDevice(1)'.
Error in nnet.internal.cnn.util.boundAwayFromZero (line 10)
x = max(x, eps(precision), 'includenan');
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in gpuArray/internal_softmaxBackward (line 13)
Z = nnet.internal.cnn.util.boundAwayFromZero(Z);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in nnet.internal.cnnhost.scaledDotProductAttentionBackward (line 23)
dU = internal_softmaxBackward(matlab.lang.internal.move(dW), W, 1);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in gpuArray/internal_attentionBackward (line 34)
[dQ, dK, dV] = nnet.internal.cnnhost.scaledDotProductAttentionBackward(dZ, Q, K, V, ...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.recording.operations.AttentionOp/backward (line 48)
[dQ,dK,dV] = internal_attentionBackward(dZ,Q,K,V,dataForBackward,M,op.Args{1:6});
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.recording.RecordingArray/backwardPass (line 99)
grad = backwardTape(tm,{y},{initialAdjoint},x,retainData,false,0);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in dlarray/dlgradient (line 132)
[grad,isTracedGrad] = backwardPass(y,xc,pvpairs{:});
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in EHSAPressureStatePrediction>modelLoss (line 223)
gradients = dlgradient(loss,net.Learnables);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.dlfeval (line 17)
[varargout{1:nargout}] = fun(x{:});
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in deep.internal.dlfevalWithNestingCheck (line 19)
[varargout{1:nargout}] = deep.internal.dlfeval(fun,varargin{:});
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in dlfeval (line 31)
[varargout{1:nargout}] = deep.internal.dlfevalWithNestingCheck(fun,varargin{:});
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The dlnetwork is defined as follows:
numIn = 5;
numOut = 2;
peDim = 6;
seqLen = 4096;
layers = [sequenceInputLayer(numIn,"Normalization","none","Name","Input","MinLength",seqLen)
positionEmbeddingLayer(peDim,seqLen)
selfAttentionLayer(5,10,"DropoutProbability",0.2)
convolution1dLayer(3,20,"DilationFactor",2,"Padding","causal")
layerNormalizationLayer
convolution1dLayer(5,25,"DilationFactor",4,"Padding","causal")
layerNormalizationLayer
convolution1dLayer(7,30,"DilationFactor",8,"Padding","causal")
fullyConnectedLayer(20)
reluLayer
fullyConnectedLayer(10)
reluLayer
fullyConnectedLayer(5)
reluLayer
fullyConnectedLayer(numOut,"Name","output")];
net = dlnetwork(layers);
% analyzeNetwork(net);
I want to know why enabling dropout in selfAttentionLayer causes this error.
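One workaround I am considering (an untested sketch, an assumption on my part, not a confirmed fix for R2025a): leave the layer's DropoutProbability at its default of 0 and apply dropout as a separate layer after the attention output, so the attention backward pass no longer has to store the per-weight dropout state. A minimal illustration with the same input sizes:

```matlab
% Hypothetical alternative (truncated to the first few layers for brevity):
% dropout is moved out of selfAttentionLayer into its own dropoutLayer.
layersAlt = [sequenceInputLayer(5,"Normalization","none","MinLength",4096)
    positionEmbeddingLayer(6,4096)
    selfAttentionLayer(5,10)   % DropoutProbability left at its default of 0
    dropoutLayer(0.2)          % dropout applied to the attention output instead
    fullyConnectedLayer(2)];
netAlt = dlnetwork(layersAlt);
```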

Answers (0)

Release

R2025a

Asked on 2 Apr 2026 at 2:41
