More efficient jsonencode for large data?

12 visualizzazioni (ultimi 30 giorni)
Seb
Seb il 20 Lug 2017
Commentato: Joris Brouwer il 11 Ago 2022
So I am using matlabs jsonencode function to encode a structure array to a character array, and then write this to an output json text file. The structure array mat file equalls approximately 10GB. This takes both a long time, and alot of computer memory.
I then import the JSON text file into mongodb.
Is there a more efficient way to directly get the data into mongodb? Maybe the only option is the MatLab database toolbox...?
Thanks

Risposte (3)

Seb
Seb il 1 Ago 2017
I have found a temporary way to better do this.
I still use matlabs built in jsonencode function, and output a JSON file, but I do it in small stages/chunks. For example, I have a devision factor, which determines how many chunks I do. This then splits the data structure into row indices. I then encode the json within those row indices, and write that chunk of data to the file. I then go onto the next chunk and so forth until all data is written.
If someone is in need of the code then let me know and I can provide it. It does not take any less time than a single jsonencode call would be, however greatly reduces load and memory consumption on the computer, and can be made to be 32-bit compliant (each chunk cant be larger than ~1gb)
  1 Commento
Marcel
Marcel il 17 Apr 2020
Hi Seb, do you still have the code for this? I am interested in this. How do you puzzle together the parts in the json file? THANKS

Accedi per commentare.


Carl
Carl il 25 Lug 2017
Modificato: Carl il 25 Lug 2017
You can try following the example in this File Exchange function, which uses the MongoDB Java driver to insert a document:
The following Stackoverflow post also has some good suggestions:
With these approaches, you should be able to avoid writing text to disk, which may be more efficient. Note that this workflow has not been qualified, and it may or may not be more efficient than your current method.

Marco Rossi
Marco Rossi il 28 Lug 2021
Is matlab development team planning to remove this limitation? I also noticed that, when the 32 bit limit is overtaken, no warnings/errors are returned.
  1 Commento
Joris Brouwer
Joris Brouwer il 11 Ago 2022
I second that. Running into what seems to be exponentionally / halting jsonencode performance as well. Trying the chunked solution mentioned above.

Accedi per commentare.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by