Save 'v7.3' apparently uses no compression - how to turn it on?

Hi,
How do I switch on compression when using 'save -v7.3'? This morning I ran into a problem using 'save'. For the first time I tried saving large variables which then forced me to use 'v7.3'. I thought this might be the end of it but apparently not.
Two problems with v7.3:
  1. It's VERY slow
  2. Files are HUGE compared to normal saving, which might explain why reading/writing is so slow
Externally zipping the produced *.mat file results in a file about the size of the one produced by just using 'save'. For example:
ones(15000);
save('normal.mat');
save('new.mat','-v7.3');
The first file is 778kB in size, the second one a whopping 11.4MB. Zipping the 11.4MB file results in a 228kB file, so there's much room for improvement. While 11MB could be handled, the same of course happens with larger variables. I just edited one of my physics simulations with many multidimensional arrays. Saving via the normal method gives a 500MB file, doing the same using '-v7.3' gives me a 6.3GB file. Zipping this one gives me a 480MB file. This is unacceptable, it can't be how this was intended to be used.
So apparently using 'save -v7.3' just doesn't compress the file. This makes no sense to me. Especially if this was specifically implemented to be used to save large variables, why is the compression not on?
How do I switch this on? Going through the documentation, I haven't found an option.
  9 Comments
Cheeba on 4 Nov 2016
I have this exact same problem. When I use the '-v7.3' switch my files get enormously larger. Something is broken... I have yet to find a workaround. Very frustrating.
Bob photonics on 27 Mar 2020
Yep, I have the same problem in R2016a and R2018a: files that are otherwise smaller than 0.5GB turn into a 9.8GB file, and it's supposedly compressed...
Had to switch to R2016a because it's what we run on our Linux servers and I was out of memory on my laptop.


Answers (2)

per isakson on 6 Sep 2014
Edited: per isakson on 6 Sep 2014
Testing compression with ones(15e3) gives unrealistic results.
Instead, test with random numbers:
m = rand(15e3);
tic,save('normal.mat', 'm');toc
tic,save('new.mat','-v7.3','m');toc
sad = dir('n*.mat');
[sad.bytes]/1e9
or with a matrix that is closer to your real data.
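A minimal sketch of that comparison, under a few assumptions: the matrix is kept small (n = 2000) so it runs quickly, the mostly-NaN matrix is only a stand-in for "data closer to yours", and the files go to temporary paths. It reports in-memory bytes divided by file bytes for both formats; the numbers will vary with MATLAB release and data.

```matlab
% Compare on-disk size of the default format and -v7.3 for two kinds of data.
n = 2000;                                % small enough to run quickly
data.random = rand(n);                   % essentially incompressible
masked = rand(n);
masked(masked < 0.95) = NaN;             % ~95% NaN, stand-in for masked data
data.masked = masked;

names = fieldnames(data);
for k = 1:numel(names)
    m = data.(names{k});                 % variable that gets saved by name
    rawBytes = numel(m) * 8;             % doubles take 8 bytes each in memory
    fDefault = [tempname '.mat'];        % temporary file, default format
    f73      = [tempname '.mat'];        % temporary file, -v7.3 format
    save(fDefault, 'm');
    save(f73, 'm', '-v7.3');
    dDefault = dir(fDefault);
    d73      = dir(f73);
    fprintf('%-7s default: %6.1fx   -v7.3: %6.1fx\n', names{k}, ...
        rawBytes / dDefault.bytes, rawBytes / d73.bytes);
    delete(fDefault);
    delete(f73);
end
```

A ratio near 1x means the file is about as large as the raw data; larger values mean the format compressed it.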
  2 Comments
arnold on 6 Sep 2014
Edited: arnold on 6 Sep 2014
why? Just kidding :)
I do get what you mean, but tons of people use or generate data which is not very random, even if the initial data might be. E.g. in some of the analysis I do, I discard most data and set it to NaN, or I use masking arrays which are logicals, yet huge with vast connected chunks. This type of data is ideal for compression, yet the v7.3 does nothing with it. I am using sparse matrices too, but when you don't know what the result might be, sparse matrices are not always ideal and can't replace everything.
In this case storing using this option is not a very good idea.
per isakson on 7 Sep 2014
Edited: per isakson on 7 Sep 2014
"yet the v7.3 does nothing with it" -v7.3 does indeed compress ones(15000). Raw size in bytes divided by the file size:
>> 15e3^2*8/11.4e6
ans =
157.8947
"arrays which are logicals" HDF5 doesn't have a logical type. I guess MATLAB may use uint8 to store logicals in -v7.3.
"In this case storing using this option is not a very good idea." With 2GB+ items (cannot find a better word), using HDF5 directly might be better. IMO: the MATLAB support of HDF5 works well enough.
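As a rough sketch of what "using HDF5 directly" can look like: `h5create` accepts `'ChunkSize'` and `'Deflate'` arguments, so chunking and the gzip level are chosen explicitly. The temporary file name, the dataset name `/mask`, the chunk size and the deflate level below are arbitrary choices for illustration, and the logical mask is stored as uint8 by an explicit cast (this says nothing about what -v7.3 does internally).

```matlab
% Write a logical mask to HDF5 with explicit chunking and gzip (deflate).
mask = false(4000);
mask(1000:3000, 1000:3000) = true;       % large connected region

fname = [tempname '.h5'];                % illustrative temporary file
h5create(fname, '/mask', size(mask), ...
    'Datatype', 'uint8', ...             % HDF5 has no native logical type
    'ChunkSize', [500 500], ...          % compression requires a chunked layout
    'Deflate', 9);                       % gzip level, 0 (none) to 9 (max)
h5write(fname, '/mask', uint8(mask));

info = dir(fname);
fprintf('raw: %d bytes, on disk: %d bytes\n', numel(mask), info.bytes);

% Read back and convert to logical again.
restored = logical(h5read(fname, '/mask'));
assert(isequal(restored, mask));
delete(fname);
```

Higher deflate levels trade write time for file size, so a lower level may suit very large arrays better.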



Walter Roberson on 5 Nov 2016
