Compatibility Matlab & GPU coder Compute Capability 8.6 RTX 3070
Good morning,
I recently bought an RTX 3070 and am trying to make use of it by generating CUDA code with GPU Coder. The card works, but I have noticed two things. I have MATLAB R2021a, the latest NVIDIA drivers, and all the prerequisites for GPU Coder (as explained in https://es.mathworks.com/help/gpucoder/gs/install-prerequisites.html ), and the "coder.checkGpuInstall" command shows the following (see attached .txt).
(i) When running gpuBench, the results seem to indicate that the single-precision TFLOPS are about half of the card's theoretical value (please see Figure 1 below). In contrast, third-party tools like CUDA-Z (Figure 2 below) show that the card has about 22 TFLOPS. Does this mean that MATLAB is currently using only half the CUDA cores per SM? Am I missing something obvious?

Figure 1: MATLAB gpuBench results

Figure 2: CUDA-Z results
(ii) I was trying to profile the GPU coder generated code by following the steps in https://es.mathworks.com/help/gpucoder/ug/gpucoder-execution-profiling-report.html (in fact, by running the code in "C:\ (...) \Documents\MATLAB\Examples\R2021a\gpucoder\GPUExecutionProfilingOfTheGeneratedCodeExample") and I am getting the following error message:
"
Error using gpucoder.profile (line 41)
Error setting property 'ComputeCapability' of class 'GpuConfig': Invalid value '8.6'.
Allowed values are:
3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2, 7.5, 8.0
"
Does this again mean that compute capability 8.6 is not yet supported?
I have tried downloading the MATLAB R2021b prerelease, but unfortunately it does not install properly: the MATLAB files are in the directory, but the launcher does not appear anywhere, and when I launch the .exe from within the files I get an error (unfortunately I don't have it at hand to show you).
Thank you in advance for your help, I hope my question was clear and concise. This is my first question so feedback on how to improve is very welcome.
Best,
Answers (1)
Joss Knight
on 22 Jul 2021
Regarding the gpuBench results: no, MATLAB is definitely not using only half the cores! What you are seeing is the raw performance of SGEMM in NVIDIA's cuBLAS library in CUDA 11.0. My understanding is that on compute capability 8.6 devices, cuBLAS is still undergoing considerable optimisation; and indeed we see that confirmed by some improvements when upgrading to CUDA 11.2 (for which you'll have to wait until next year).
However, the performance of MTIMES still does not reach the theoretical maximum, and perhaps it never will. If you click on your result for single-precision MTIMES in the gpuBench report, it will take you to the graph, and you'll see that the performance peaks and flattens out at a certain matrix size. It may be that on these devices memory bandwidth starts to become more of a bottleneck for larger sizes. In CUDA-Z, the benchmark no doubt simply runs floating-point operations inside a kernel without any input or output data at all. This is great for testing raw compute power, but not so useful for working out how fast the card is at doing something genuinely useful.
We're going to continue investigating this to see if we can get any more information on why cuBLAS performance isn't as good as expected for these cards, and whether there is anything you can do.
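To make the gap between theoretical and achieved throughput concrete, here is a rough sketch of the arithmetic involved. It is written in Python purely for illustration and is not how gpuBench or CUDA-Z is implemented. The core count and boost clock are the RTX 3070 figures from NVIDIA's published specification (an assumption, they are not stated in this thread), and the matrix size and timing are hypothetical values chosen only to show the calculation, not measurements from the poster's card.

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Theoretical FP32 peak, assuming every core retires one fused
    multiply-add (counted as 2 FLOPs) on every clock cycle."""
    return cuda_cores * 2 * clock_ghz / 1e3


def gemm_tflops(n: int, seconds: float) -> float:
    """Achieved throughput of an n-by-n matrix multiply, using the
    conventional operation count of roughly 2 * n^3 FLOPs."""
    return 2 * n**3 / seconds / 1e12


# Assumed RTX 3070 specification values (not taken from this thread).
CUDA_CORES = 5888
BOOST_CLOCK_GHZ = 1.725

# Hypothetical benchmark result, for illustration only.
MATRIX_SIZE = 8192
ELAPSED_SECONDS = 0.1

peak = peak_fp32_tflops(CUDA_CORES, BOOST_CLOCK_GHZ)
measured = gemm_tflops(MATRIX_SIZE, ELAPSED_SECONDS)
efficiency = measured / peak

print(f"theoretical peak : {peak:.2f} TFLOPS")
print(f"measured (GEMM)  : {measured:.2f} TFLOPS")
print(f"efficiency       : {efficiency:.0%}")
```

The theoretical figure assumes every core issues a fused multiply-add on every cycle with no time spent waiting on memory, which is close to what a synthetic kernel with no input or output data can approach. A matrix multiply also has to move its operands through the memory hierarchy, so its achieved throughput depends on the library implementation and the matrix size as well as on the hardware.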
1 Comment
Marco Irisarri
on 5 Aug 2021