Why is MATLAB engine for Python so much slower in R2023b vs. R2021b?

This is an expansion of a comment I made on this MATLAB Answers question on why the MATLAB engine for Python is so slow.
My original issue was that a 30 MB variable took about 15 seconds to send from Python to a shared MATLAB engine (an interactive desktop session opened the usual way, by double-clicking the desktop launcher) using the following method:
% MATLAB
matlab.engine.shareEngine('test')
# Python
import matlab.engine
eng = matlab.engine.connect_matlab(name="test")
eng.workspace["variable"] = variable
This led me to the MATLAB Answers question linked above about why the Python engine is so slow. Following Alan Frankel's answer, I upgraded from R2021b to R2023b. That actually made my Python code slower (at least 3x) and also introduced a massive memory leak; see the two attached images.
Some pertinent machine specs: a 4-year-old PC with 64 GB RAM running Windows 10 version 22H2, Python 3.9.0, and MATLAB R2021b and R2023b with no other toolboxes installed.
If anyone could shed light on this situation, I'd appreciate it. My primary goal is to use my 30 MB Python variable as an input to a MATLAB function in a reasonable time frame, meaning definitely < 1 second rather than the current 15 seconds. Small variables such as a scalar appear in the MATLAB workspace instantly, so I assume the slowdown is related to the size of my variable. But if 30 MB is too large, then the MATLAB engine is not very useful for intensive data processing.
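For anyone who wants to put a number on the transfer time, here is a minimal timing sketch. It is not the code from my actual project: `time_call`, `make_nan_matrix`, and `send` are hypothetical helpers, and a plain dictionary stands in for `eng.workspace` so the snippet runs without MATLAB.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def make_nan_matrix(n_rows, n_cols=3):
    """Build a nested list of NaNs, similar in shape to the data discussed in this thread."""
    return [[float("nan")] * n_cols for _ in range(n_rows)]


# Assumption: this dict is only a stand-in for eng.workspace.
fake_workspace = {}


def send(name, value):
    """Store the value under the given name, mimicking a workspace assignment."""
    fake_workspace[name] = value


matrix = make_nan_matrix(1000)
elapsed = time_call(send, "variable", matrix)
print(f"transfer took {elapsed:.6f} s")
```

With a real shared session, the body of `send` would become `eng.workspace[name] = value`, where `eng` is the object returned by `matlab.engine.connect_matlab`.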
  9 Comments
Mitchell Tillman on 15 Mar 2024
Yes I believe so, as I’ve been able to use the code without slowdown, though I didn’t test that after the np.array() fix.
There is, however, still a pretty massive memory leak in the Apple Silicon macOS implementation, regardless of NaNs in the data.
Alan Frankel on 18 Mar 2024
As an alternative to np.array(), you can use matlab.double(). (If you were using int8, you'd use matlab.int8(), and so on.) Here, you would change the line:
matrix = [[float('nan'), float('nan'), float('nan')] for i in range(N)]
to this:
matrix = matlab.double([[float('nan'), float('nan'), float('nan')] for i in range(N)])
For me, this decreased the amount of time it took to pass a struct to MATLAB from 30 seconds to 0.09 seconds.
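As a rough illustration of that conversion step, the sketch below wraps it in a small helper. The names `nan_matrix` and `prepare_for_matlab` are hypothetical, and the fallback to plain lists when the engine package is missing is an assumption added only so the snippet runs on a machine without MATLAB.

```python
try:
    import matlab  # installed with the MATLAB Engine API for Python
    _to_double = matlab.double
except (ImportError, AttributeError):
    _to_double = None  # engine package not available; keep plain lists


def nan_matrix(n_rows, n_cols=3):
    """Build a nested list of NaNs like the one in the comment above."""
    return [[float("nan")] * n_cols for _ in range(n_rows)]


def prepare_for_matlab(rows):
    """Convert a rectangular nested list to matlab.double when the engine is installed."""
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        # A ragged nested list does not describe a matrix, so reject it early.
        raise ValueError("rows must all have the same length")
    if _to_double is None:
        return rows  # fallback for this sketch only
    return _to_double(rows)


matrix = prepare_for_matlab(nan_matrix(5))
print(type(matrix).__name__)
```

The idea is to do the conversion once, explicitly, before assigning the result to `eng.workspace` or passing it to a MATLAB function, rather than handing the engine a nested Python list.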


Accepted Answer

Alan Frankel on 14 Mar 2024
@Mitchell Tillman and I agree that the slow performance and memory leaks seem to be related to instances of NaN in the data, not to the code. I will look into the NaN issue separately.

Release

R2023b
