Out of memory issue on evaluating CNNs

The message "Out of memory on device. To view more detail about available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU by calling 'gpuDevice(1)'." appears when I try to evaluate my trained CNN. I'm using a GeForce 1060M GTX 6GB RAM.
Here's a piece of my code:
testData = load('testROI.mat');
[test_imds, test_pxds] = pixelLabelTrainingData(testData.gTruth);
testDataSet = pixelLabelImageDatastore(test_imds, test_pxds);
unetPxdsTruth = testDataSet.PixelLabelData;
unetpxdsResults = semanticseg(test_imds, unet); % error is caused by this line
unetMetrics = evaluateSemanticSegmentation(unetpxdsResults, unetPxdsTruth);
The command gpuDevice() shows the results below:
CUDADevice with properties:
Name: 'GeForce GTX 1060'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 9.2000
ToolkitVersion: 9.1000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
AvailableMemory: 5.0524e+09
MultiprocessorCount: 10
ClockRateKHz: 1670500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
As you can see, more than 5 GB of memory is free, yet the out-of-memory error still occurs, and I don't understand why. Curiously, it doesn't happen with 500 images in the training stage, but it does happen with 100 images in the test evaluation stage. Note that this evaluation uses a pretrained CNN that I created earlier, so the training data is not in GPU memory at this point.
Does anyone know what might be going on?

Accepted Answer

Andrea Picciau on 21 Jun 2019
Edited: Andrea Picciau on 21 Jun 2019
The problem is that your GPU's 6 GB of memory is not enough to run the semantic segmentation with the default settings. Reducing the mini-batch size from the default of 128 should be enough in your case. Try changing the problematic line to the following:
unetpxdsResults = semanticseg(test_imds, unet, 'MiniBatchSize', 4);
You can try increasing that 4 to a larger value, but it wouldn't surprise me if 8 was the maximum your GPU could handle.
You should also have a look at semanticseg's doc page and the name-value pairs in particular.
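If you'd rather not find the largest workable value by hand, one option is to retry with progressively smaller mini-batches. This is only a rough sketch: test_imds and unet are the variables from the question, the list of candidate sizes is an arbitrary assumption, and matching on the error message text (rather than on an error identifier) is also an assumption that may need adjusting for your release.

```matlab
% Sketch: retry semanticseg with smaller mini-batches until one fits on the GPU.
% test_imds and unet come from the question; batchSizes is an arbitrary choice.
batchSizes = [32 16 8 4 2 1];
unetpxdsResults = [];

for k = 1:numel(batchSizes)
    try
        unetpxdsResults = semanticseg(test_imds, unet, ...
            'MiniBatchSize', batchSizes(k));
        fprintf('Succeeded with MiniBatchSize = %d\n', batchSizes(k));
        break
    catch err
        % Assumption: the GPU error text contains "Out of memory",
        % as in the message quoted in the question. Rethrow anything else.
        if ~contains(err.message, 'Out of memory', 'IgnoreCase', true)
            rethrow(err);
        end
        fprintf('MiniBatchSize = %d ran out of memory, trying smaller.\n', ...
            batchSizes(k));
    end
end
```

If the loop finishes with unetpxdsResults still empty, even a mini-batch of one image did not fit, and you would need smaller input images or CPU execution instead.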
A last note on gpuDevice: when you get the out-of-memory error, MATLAB doesn't allocate the data. This is approximately what happens:
  • At rest, somewhere between a few hundred MB and a GB is allocated in your GPU memory. This is the space occupied by the CUDA libraries.
  • When you run semanticseg with the default settings, some of the data MATLAB needs to allocate is far larger than your GPU's remaining 5 GB of memory.
  • MATLAB asks CUDA to allocate that data.
  • CUDA returns an error saying that your GPU's memory is not large enough.
  • MATLAB reports the error message to you and doesn't allocate anything.
  • When you check with gpuDevice, you see that 5 GB are free.
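To get a feel for why a large mini-batch fails while a small one fits, here is a back-of-the-envelope calculation for the activations of a single layer. The layer dimensions are purely hypothetical (they are not taken from the network in the question), and a real network holds several such buffers at once, so treat the numbers as an order-of-magnitude illustration only.

```matlab
% Hypothetical layer output: 256x256 spatial size, 64 channels, single precision.
% These dimensions are an assumption for illustration, not the asker's U-Net.
height = 256; width = 256; channels = 64;
bytesPerValue = 4;                                   % single precision
bytesPerImage = height * width * channels * bytesPerValue;

for miniBatchSize = [128 4]
    bytesPerBatch = bytesPerImage * miniBatchSize;
    fprintf('MiniBatchSize %3d: %.2f GB for this one layer\n', ...
        miniBatchSize, bytesPerBatch / 1e9);
end
```

Under these assumptions, one layer needs about 2.1 GB with 128 images per batch and about 0.07 GB with 4, which is why lowering 'MiniBatchSize' has such a large effect.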
  4 Comments
ioannisbaptista on 21 Jun 2019
Got it, Andrea.
Thank you so much! Regards.


More Answers (1)

silver tena on 23 Aug 2021
Hello Andrea and ioannisbaptista.
I have a problem when training a network for image classification. The error message is "out of memory". My GPU specification is below.
Name: 'GeForce GTX 1650'
Index: 1
ComputeCapability: '7.5'
SupportsDouble: 1
DriverVersion: 11.1000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 4.2950e+09
AvailableMemory: 3.3381e+09
MultiprocessorCount: 14
ClockRateKHz: 1710000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1
Training AlexNet works without problems. When I train VGG16, the "out of memory" error occurs.
This is the line where the error occurs. I set the mini-batch size to 5 and the number of epochs to 10. There are only 300 images.
featuresTrain = activations(netTransfer,augimdsTrain,layer,'OutputAs','rows');
I need your help.
I still have to train the ResNet 110 and InceptionV3 architectures as well. I need help from both of you. God bless all of you.
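A note on the activations call quoted in this post: if the mini-batch size of 5 was set through trainingOptions, it likely does not carry over to activations, which takes its own 'MiniBatchSize' name-value pair. A minimal sketch of what could be tried, assuming netTransfer, augimdsTrain, and layer are defined as in the post (the value 5 simply mirrors the training setting and is not a tuned recommendation):

```matlab
% Sketch: pass the mini-batch size to activations explicitly.
% netTransfer, augimdsTrain and layer are the variables from the post above.
featuresTrain = activations(netTransfer, augimdsTrain, layer, ...
    'OutputAs', 'rows', ...
    'MiniBatchSize', 5);

% Possible fallback if the GPU still runs out of memory: run on the CPU.
% This is slower but is limited by system RAM rather than GPU memory.
% featuresTrain = activations(netTransfer, augimdsTrain, layer, ...
%     'OutputAs', 'rows', ...
%     'MiniBatchSize', 5, ...
%     'ExecutionEnvironment', 'cpu');
```

The same idea as in the accepted answer applies here: VGG16 has much larger activations than AlexNet, so a batch size that AlexNet tolerates may not fit for VGG16 on a 4 GB GPU.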
