Table performance very slow

Question

Byron on 19 Nov 2015

Votes: 2

Direct link to this question: https://it.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow

Edited: Victor on 26 Jun 2017

I have used tables within a physics model that is solved by ode23. The performance is very slow, and in troubleshooting (using profiler) I found that the majority of the time is spent in various table functions.

The three functions table.subsasgnDot, table.subsref, table.subsref alone take approximately 30% of the execution time. Within those functions it seems to be variable name checking that takes the majority of the time.

The variable names in every table are all known at the start of the program and don't change. It seems like it would be far more efficient to check them once rather than on every pass through the loop.
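A minimal sketch of how the per-iteration cost could be measured, assuming a hypothetical table `t` with a single numeric variable `x` (neither name comes from the original model). It performs the same scalar reads once through table dot-indexing and once through a vector extracted before the loop. Absolute timings will differ between releases and machines.

```matlab
% Hypothetical data: a table with one numeric variable named x.
n = 1e4;
t = table(rand(n, 1), 'VariableNames', {'x'});

% Scalar reads through the table: every access goes through
% the table's own indexing code.
tic
s1 = 0;
for i = 1:n
    s1 = s1 + t.x(i);
end
tTable = toc;

% Extract the variable once, then index the plain double vector.
x = t.x;
tic
s2 = 0;
for i = 1:n
    s2 = s2 + x(i);
end
tVector = toc;

fprintf('table: %.4f s, vector: %.4f s\n', tTable, tVector);
```

Both loops compute the same sum, so the comparison isolates the indexing overhead rather than the arithmetic.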

This is for a simulation that takes several minutes to run each case, and would be used to call many cases. So, the slow performance is a significant problem.

I understand performance is better when the problem is vectorized. One of the subroutines can calculate 15,000 points in 10 sec if called as a vector, but takes 1 hr if called in a loop.

However, since this problem is being solved with ode23, the model is unavoidably called in a loop, and unfortunately I used tables everywhere before discovering how slow they are.

Is there any way to improve performance without major rewriting to remove all use of the table class?

2 Comments

dpb on 19 Nov 2015

I'm guessing likely not for the question as asked, but might there be a relatively simple way you could do the queries first and return the necessary data as an ordinary array or arrays for the solver to crunch on?
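One way this suggestion might look in practice is sketched below. The table `params`, its variables `time` and `force`, and the damped-oscillator right-hand side are all hypothetical stand-ins, since the original model is not shown. The table is queried once before the solver starts, and the function passed to ode23 only touches ordinary double arrays.

```matlab
% Hypothetical lookup data stored in a table (names are illustrative only).
tt = linspace(0, 10, 101)';
params = table(tt, sin(tt), 'VariableNames', {'time', 'force'});

% Query the table once, before the solver runs.
tData = params.time;
fData = params.force;

% The right-hand side sees only plain double arrays, never the table.
% Assumed model: y'' = force(t) - 0.5*y' - y.
rhs = @(t, y) [y(2); interp1(tData, fData, t, 'linear', 'extrap') - 0.5*y(2) - y(1)];

[tSol, ySol] = ode23(rhs, [0 10], [0; 0]);
```

The anonymous function captures `tData` and `fData` when it is created, so no table indexing happens inside the solver loop.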

Victor on 26 Jun 2017

Edited: Victor on 26 Jun 2017

Added similar issue to Stackoverflow, it may be helpful: Matlab Table / Dataset type optimization


Answer 1

Oleg Komarov on 28 Nov 2016

Votes: 3

Direct link to this answer: https://it.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_244998

Edited: Oleg Komarov on 28 Nov 2016

I have been using table() since well before it was introduced into the core package, since it is de facto the ported version of the dataset() class from the Statistics Toolbox. I also noticed, a long time ago, many limitations in terms of performance and functionality, and have logged feature enhancements with TMW.

To address the limitations of table() while waiting for the official implementation of my enhancement requests, I created tableutils(). Among the problems, you would be astonished to know that the disp() of a big table can literally freeze your PC until the next ice age (and I am not talking about the movies...). This is something that I fixed with a buffered disp method.

While my tableutils() do not address directly the problems in subsref/subsasgn, anyone is welcome to contribute to this effort to make the table() class better by submitting an issue or a Pull Request on Github.


Answer 2

Daniel Petrini on 5 Oct 2016

Votes: 1

Direct link to this answer: https://it.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_237457

Edited: dpb on 5 Oct 2016

In my view, tables are very erratic in performance, ranging from quick to very slow. For example, do a clear and then just >> table(). On my R2016b that can take many seconds. :-S I had to rewrite a (large) class based on tables to use multiple vectors of the same types. Performance is now much more linear and trustworthy. It seems that the JIT does not know what to do with tables? I wish MathWorks would post more about performance on these new data structures... In addition, the call

t = tic; my_class.insert_new_entry(...); toc(t)

reported excellent times. The problem is that MATLAB stays "busy": the output of toc(t) (0.12 s) could take 2 s to display... What am I missing? I'm guessing it is some overhead in creating tables, i.e., table_1(1:5,my_col) creates a new table, and freezes...? Disclaimer: I am sitting on an 8 GB Core i7 machine.

/Daniel Petrini, Stardots AB
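The guess about subsetting overhead could be checked with something like the sketch below. The table `T`, its variable names, and the value of `my_col` are made up for illustration. It uses timeit rather than tic/toc, so command-window display time is not part of the measurement. Note that timeit may warn when the measured function runs very fast, as the plain vector case might.

```matlab
% Hypothetical table and column name, for illustration only.
T = table((1:1000)', rand(1000, 1), 'VariableNames', {'id', 'value'});
my_col = 'value';
A = T.value;

% Parenthesis indexing on a table returns a new, smaller table.
sub = T(1:5, my_col);

% timeit calls each handle repeatedly and returns a typical run time in seconds.
tSubTable  = timeit(@() T(1:5, my_col));
tSubVector = timeit(@() A(1:5));

fprintf('table subset: %.2e s, vector subset: %.2e s\n', tSubTable, tSubVector);
```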

2 Comments

Daniel Petrini on 6 Oct 2016

My answer is probably not an answer, but rather a comment. Sorry. My first contribution to Matlab Answers.

Oleg Komarov on 28 Nov 2016

The native table.disp() has a huge problem and can freeze your PC for a long time. I implemented a buffered disp that avoids this issue. See my answer in this thread.


Answer 3

jbpritts on 24 Nov 2016

Votes: 1

Direct link to this answer: https://it.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_244600

I have Matlab 2016b. I can confirm that tables are terribly slow. Unless you really need them for heterogeneous data, avoid them in any performance-critical code. I will have to rewrite a fairly complicated section of code using legacy data structures. MathWorks should address this extreme performance deficiency.

2 Comments

Image Analyst on 24 Nov 2016

They are tremendously better and faster than cell arrays though, and use far less memory.

Oleg Komarov on 28 Nov 2016

Internally, table() stores data in a cell array, where each column is a cell. So, your statement about speed and memory cannot be true, since there is additional overhead linked to VariableNames and matlab-coded subsref/subsasgn.

I do agree that tables are more convenient.
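The two comments above may be comparing against different cell layouts (one cell per element versus one cell per column); that reading is an assumption, not something either poster states. A rough way to check memory on a given release is to store the same hypothetical numeric data three ways and ask whos for the byte counts:

```matlab
n = 1000;
data = rand(n, 3);

asMatrix = data;                 % plain numeric array
asTable  = array2table(data);    % one table variable per column
asCell   = num2cell(data);       % one cell per element

% whos reports the memory footprint of each variable in bytes.
info = whos('asMatrix', 'asTable', 'asCell');
for k = 1:numel(info)
    fprintf('%-10s %10d bytes\n', info(k).name, info(k).bytes);
end
```

This only addresses storage size; it says nothing about indexing speed, which is the subject of the original question.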


Answer 4

Peter Perkins on 7 Oct 2016

Votes: 0

Direct link to this answer: https://it.mathworks.com/matlabcentral/answers/256482-table-performance-very-slow#answer_237965

Byron, it's hard to make specific suggestions without knowing exactly what you're doing, but here are some thoughts.

Tables are best at managing data and doing vectorized operations. Based on your description, it sounds like you are probably doing scalar operations such as

t.Var(i) = x

in a loop. You've described your alternative as a complete rewrite to not use tables at all. But there is often a middle ground where you can find a localised scope in which you can pull some of the variables in a table out as, say, ordinary double vectors, do all the non-vectorized calculations on them in a scalar loop, and then assign back into the table. Sometimes you can even convert the table to a scalar struct and use the exact same syntax. Of course, separate variables or a scalar struct will not enforce equal number of rows, or provide a simple syntax for arbitrary rectangular selections, or the other things that tables are designed to do.
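The middle ground described above might look like the following sketch. The table contents, the second variable `Input`, and the recurrence inside the loop are hypothetical; only the name `Var` mirrors the snippet in this answer. The first part pulls variables out as double vectors and assigns back once, and the second part shows the scalar-struct alternative using table2struct with the 'ToScalar' option.

```matlab
% Hypothetical table; Var mirrors the name used in the answer above.
n = 1000;
t = table(zeros(n, 1), (1:n)', 'VariableNames', {'Var', 'Input'});

% 1) Pull the variables out as ordinary double vectors.
v = t.Var;
u = t.Input;

% 2) Do the non-vectorized, scalar work on the plain vectors.
v(1) = u(1);
for i = 2:n
    v(i) = 0.5*v(i-1) + u(i);
end

% 3) Assign back into the table once, after the loop.
t.Var = v;

% Alternative: a scalar struct keeps the same dot syntax inside the loop.
s = table2struct(t, 'ToScalar', true);
for i = 2:n
    s.Var(i) = 0.5*s.Var(i-1) + s.Input(i);
end
t2 = struct2table(s);
```

As the answer notes, the struct does not enforce equal row counts across fields, so the conversion back with struct2table only works if the loop leaves every field the same length.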

Hope this helps.
