How to run a MEX function asynchronously in MATLAB?

Hi,
I have developed a MEX function (in CUDA) that uses a GPU to process data transferred from a device (let's call it D1) to my PC. D1 uses MATLAB to communicate with my PC. Currently everything runs sequentially: once the data has been transferred from D1 to my PC, the MEX function starts processing. The transfer from D1 to the PC takes about 0.3 s, and the MEX function takes about 0.4 s to finish processing. So, in total, I need 0.7 s to get the output of the MEX function.
Now, I've been thinking that maybe I could run the MEX function asynchronously with respect to the data transfer from D1 to the PC. That is, while the MEX function is processing, I want to transfer data from D1 to MATLAB in parallel. This way, I would not have to wait 0.3 s for the data transfer, because it would happen during the MEX function's processing. Could you please let me know whether this is possible in MATLAB and, if so, how?
Thanks in advance.
Moein.

5 Comments

Sure, you can do whatever you like inside your MEX function, including getting data from a device and processing it using CUDA libraries or kernels. If you allocate your GPU memory using MATLAB's mxGPU APIs (such as mxGPUCreateGPUArray) you will be able to take advantage of MATLAB's memory pooling to avoid the synchronization that happens when you call the CUDA memory allocation routines.
However, what I don't understand is how you can benefit from asynchronous execution here? Surely you need to wait for your data to be transferred from D1 to PC before you can process it?
Thank you Joss.
Unfortunately, when I use the MATLAB mxGPU APIs, my code does not run stably. I have looked for the problem but could not find one; everything seems fine. So I just stick to the CUDA commands to copy data from the PC to the GPU and back. I also transfer data from the PC to the GPU asynchronously to save time.
Yes, you are right. I have to wait for the FIRST dataset to be transferred from D1 to the PC before I can process it. However, I want to transfer the SECOND dataset from D1 to PC while I'm processing the FIRST one in the GPU. In this way, I do not have to wait for 0.3 s (to transfer the SECOND dataset) when processing of the FIRST dataset is done. I hope this explanation helps.
Moein.
Cool. In which case, go for it. Start asynchronously copying data from the device into an output mxArray and then process the input mxArray with your CUDA kernel. I presume you have some sort of asynchronous API to do this, or else you can launch a thread to do it and rejoin it before returning.
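For illustration, the "launch a thread and rejoin it" idea could look roughly like the C++ sketch below. It is only a sketch under assumptions: `transfer_from_device` and `process_on_gpu` are hypothetical stand-ins (the real device read and the CUDA kernel are not shown in this thread), and `std::async` is used here as one possible way to start and rejoin the background transfer.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the device read (D1 -> PC); not a real API.
// Dataset `index` is simulated as four values equal to `index`.
std::vector<float> transfer_from_device(int index) {
    return std::vector<float>(4, static_cast<float>(index));
}

// Hypothetical stand-in for the GPU step (kernel launch plus copy back).
// Here it simply sums the dataset on the CPU.
float process_on_gpu(const std::vector<float>& data) {
    return std::accumulate(data.begin(), data.end(), 0.0f);
}

// Process `count` datasets, fetching dataset i+1 in the background
// while dataset i is being processed.
std::vector<float> run_pipeline(int count) {
    std::vector<float> results;
    if (count <= 0) {
        return results;
    }
    results.reserve(static_cast<std::size_t>(count));

    // The first transfer cannot overlap with anything.
    std::vector<float> current = transfer_from_device(0);

    for (int i = 0; i < count; ++i) {
        std::future<std::vector<float>> next;
        if (i + 1 < count) {
            // Start the next transfer on a background thread.
            next = std::async(std::launch::async, transfer_from_device, i + 1);
        }

        // Process the current dataset while the next one is in flight.
        results.push_back(process_on_gpu(current));

        if (next.valid()) {
            // Rejoin: blocks only if the transfer has not finished yet.
            current = next.get();
        }
    }
    return results;
}
```

With the stand-ins above, `run_pipeline(3)` yields the sums 0, 4 and 8. If the transfer and processing really take about 0.3 s and 0.4 s, each iteration after the first should cost roughly the longer of the two rather than their sum.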
Joss,
Maybe I was not clear about my problem. The problem is not about "asynchronously copying the input and output of the kernel". The problem is "How can I run the MEX function (we do not care what is inside this function) asynchronously with respect to the data transfer from my imaging device (called D1) to MATLAB?"
Please see the following drawings. I need to know how to get the P2 pattern in MATLAB.
Let's see this Matlab code:
XX=D1DataTransfer_1(); % transfer data from the imaging device to MATLAB and store it in XX; this is "1" in the drawings
Outout_3=CudaMexFunction(XX); % procedures "2" and "3" run sequentially here, so the output of my processing is available now
XX=D1DataTransfer_1(); % transfer data from the imaging device to MATLAB and store it in XX; this is "1" in the drawings
Outout_3=CudaMexFunction(XX); % procedures "2" and "3" run sequentially here, so the output of my processing is available now
XX=D1DataTransfer_1(); % transfer data from the imaging device to MATLAB and store it in XX; this is "1" in the drawings
Outout_3=CudaMexFunction(XX); % procedures "2" and "3" run sequentially here, so the output of my processing is available now
MATLAB executes these commands line by line (from top to bottom). However, I need to run lines 2 and 3 (or, say, 4 and 5) asynchronously. I want MATLAB to perform line 3 while CudaMexFunction (line 2) is processing XX (the input from line 1) on the GPU. Is this possible in MATLAB?
Right, so you can't do the data transfer in C++ code inside your MEX function?
In which case, use some form of parallel pool and run your data transfer in the background. You might want to look at the documentation for parfeval. You can set a whole batch of data transfers going in the background and use fetchNext to pull data off the queue at the start of each iteration.
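The queue pattern described here (start several transfers in the background, then pull the next dataset at the start of each iteration) is not specific to MATLAB. Purely as a language-neutral illustration, here is a rough C++ analogue; it is not MATLAB code and does not use parfeval. `transfer_dataset` and `process_dataset` are hypothetical stand-ins, and unlike fetchNext, which returns whichever result is ready next, this sketch retrieves results in submission order.

```cpp
#include <deque>
#include <future>
#include <vector>

// Hypothetical stand-in for one background data transfer; not a real API.
// Dataset `index` is simulated as {index, index + 1, index + 2}.
std::vector<int> transfer_dataset(int index) {
    return std::vector<int>{index, index + 1, index + 2};
}

// Hypothetical stand-in for the processing step; sums the dataset.
int process_dataset(const std::vector<int>& data) {
    int sum = 0;
    for (int value : data) {
        sum += value;
    }
    return sum;
}

// Queue `count` transfers up front, then process them one by one.
std::vector<int> process_all(int count) {
    // Each queued transfer runs on its own background thread.
    std::deque<std::future<std::vector<int>>> pending;
    for (int i = 0; i < count; ++i) {
        pending.push_back(
            std::async(std::launch::async, transfer_dataset, i));
    }

    std::vector<int> results;
    while (!pending.empty()) {
        // Pull the oldest queued transfer at the start of each iteration;
        // this blocks only if that transfer has not finished yet.
        std::vector<int> data = pending.front().get();
        pending.pop_front();

        // Later transfers keep running while this dataset is processed.
        results.push_back(process_dataset(data));
    }
    return results;
}
```

With these stand-ins, `process_all(3)` gives 3, 6 and 9. A real device would probably serialize the transfers, so queuing many at once mainly ensures the next dataset is already waiting when processing of the current one ends.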


Answers (0)

Release

R2021a

Commented: 12 Jul 2021
