How do you avoid unnecessary copies when modifying table objects?

I'd like to avoid the performance cost of copying large amounts of data inside tables. For example, if I wanted to add or remove columns from a table via a function, how would you do that without creating a copy? Do you need to use handle classes, or can this be achieved with the built-in tables?

Accepted Answer

Matt J on 17 Feb 2022
Edited: Matt J on 17 Feb 2022
I think adding and removing columns from tables is largely like adding/removing elements from cell vectors, i.e., it involves no data copying.

7 Comments

Matt is correct. Copying arrays in MATLAB is very fast because MATLAB arrays have copy-on-write semantics. See the following documentation page for more information.
"If I wanted to add or remove columns from the table via a function, how would you do that without it creating a copy?"
Using the documented workflows for manipulating tables should be fast. For example,
Letter = ["a";"b";"c"];
T = table(Letter)
T = 3×1 table
    Letter
    ______
    "a"
    "b"
    "c"
% Add a variable to T.
Number = [1;2;3];
T = addvars(T,Number,'Before','Letter')
T = 3×2 table
    Number    Letter
    ______    ______
      1        "a"
      2        "b"
      3        "c"
% Remove a variable from T.
T = removevars(T,'Letter')
T = 3×1 table
    Number
    ______
      1
      2
      3
Please let me know if you have encountered any workflows where table operations are not fast enough and I can provide some suggestions.
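For the "via a function" part of the question, here is a minimal sketch of the same workflow wrapped in functions. The helper names `addNumber` and `dropLetter` are hypothetical and not part of MATLAB. Anonymous functions are used only so the snippet runs as a single script; a named function in its own file would be called the same way.

```matlab
% Build a small table, as in the example above.
Letter = ["a";"b";"c"];
T = table(Letter);

% Hypothetical helpers (not part of MATLAB): each takes a table and returns
% the modified table, so the caller reassigns the result to the same variable.
addNumber  = @(tbl, values) addvars(tbl, values, 'Before', 'Letter', ...
                                    'NewVariableNames', 'Number');
dropLetter = @(tbl) removevars(tbl, 'Letter');

T = addNumber(T, [1;2;3]);   % T now has variables Number and Letter
T = dropLetter(T);           % T now has only Number

disp(T.Properties.VariableNames)
```

The pattern is always `T = f(T)`: tables are value objects, so the function returns the modified table rather than changing the caller's variable behind the scenes.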
Thanks for the replies. Just so I fully understand, is no copy generated because the assigned output variable is the same as the input one? My understanding of the copy-on-write behavior is that functions that modify the input value, like addvars and removevars, should create a copy. @Seth Furman
Matt J on 18 Feb 2022
Edited: Matt J on 18 Feb 2022
One way to see that deleting columns doesn't allocate any new memory is with the test below. You can see that "Memory used by MATLAB" has not changed, even after creating T2, a copy of table T1 with one fewer variable:
>> T1=array2table(rand(1e4));
>> memory
Maximum possible array: 4588 MB (4.811e+09 bytes) *
Memory available for all arrays: 4588 MB (4.811e+09 bytes) *
Memory used by MATLAB: 3458 MB (3.626e+09 bytes)
Physical Memory (RAM): 16250 MB (1.704e+10 bytes)
* Limited by System Memory (physical + swap file) available.
>> T2=T1(:,1:end-1);
>> memory
Maximum possible array: 4583 MB (4.805e+09 bytes) *
Memory available for all arrays: 4583 MB (4.805e+09 bytes) *
Memory used by MATLAB: 3458 MB (3.626e+09 bytes)
Physical Memory (RAM): 16250 MB (1.704e+10 bytes)
* Limited by System Memory (physical + swap file) available.
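The `memory` function is only available on Windows. On other platforms, a rough alternative is to compare timings, as in the sketch below. It assumes that an operation which has to copy the data takes noticeably longer than one which only shares it. The table size and the comparison operation are my own choices, not from the test above, and the timings are indicative only, not proof.

```matlab
% Four tall variables, 1e6 rows each (about 32 MB of doubles in total).
T1 = array2table(rand(1e6, 4));

% Subset as in the test above: keep all but the last variable.
tSubset = timeit(@() T1(:, 1:width(T1)-1));

% For comparison, extract the same variables into one new matrix, which has
% to allocate and fill fresh memory.
tExtract = timeit(@() T1{:, 1:width(T1)-1});

fprintf('subset: %.3g s, extract to matrix: %.3g s\n', tSubset, tExtract);
```

If the subset shares its data with T1, the first timing should come out well below the second. The exact numbers depend on the machine and the MATLAB release.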
Thanks for providing that test! I see it's not making a copy.
A table is a container variable. It contains a number of components, each of which has its own type/size header, and its own data pointer. When you create a copy of a table that does not have one of the variables, then the type/size/data-pointer information is copied, without making a deep copy of the data. In C/C++ talk, tables contain a pointer to the data and copies of the tables copy the pointers (except that there is a reference count so that MATLAB can tell whether it needs to make a copy of the data when the data is eventually modified.)
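To make the observable side of this concrete, here is a small sketch. The table contents are made up for illustration. The sharing itself is not visible from MATLAB code; what the snippet shows is that a later write to the reduced table never changes the original, which is the behavior the reference count exists to guarantee.

```matlab
T1 = table([1;2;3], ["a";"b";"c"], 'VariableNames', {'Number','Letter'});

% T2 is a new table without Letter. Per the explanation above, its Number
% variable can share storage with T1.Number until one of them is written to.
T2 = removevars(T1, 'Letter');

% Writing to T2.Number changes T2 only; T1 keeps its original values.
T2.Number(1) = 100;

disp(T1.Number')
disp(T2.Number')
```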
"Thanks for providing that test! I see it's not making a copy."
You're quite welcome. If your question is resolved, though, please Accept-click the answer.
Thanks @Walter Roberson, that clarifies it more for me.

Release

R2021b
