Best way to read in a massive amount of data from CSV files?

I am working with two CSV files, each about 25 GB. When I read one file in all at once, I get a vector of size 9.8 GB. I only have about 24 GB of RAM, so with two such vectors plus further computations this puts quite a strain on my computer. Is it better in this case to read the files in piece by piece, going back each time for the next data segment, or should I load all the data into memory at once? Either way I have to go through all the data, and timing is a consideration: at the moment it takes nearly 20 minutes for my computer to read one entire file into a vector. I imagine this time would increase if I constantly went back and made more, albeit smaller, calls to csvread with row indexing?

Answers (1)

Stephane Dauvillier on 25 Jun 2019
Hi,
If you have huge data files, you may want to look at datastore.
First, a datastore can be created over multiple files or an entire folder.
Second, unless you configure it otherwise, a datastore does not read a whole file at once but block by block, and it simply moves on to the next file when the current one is finished.
For tabular data, you can also specify which columns to actually import (very effective if you know you only want some of the columns rather than all of them). See the sketch below.
Do your files have the same number of columns, and do they contain the same kind of data (i.e., does column 1 in both files represent the same observation, such as Name, age, height, ...)?
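Here is a minimal sketch of that approach, assuming both files are comma-separated with a header row; the file names and the column names (Name, Age, Height) are placeholders, not taken from the question.

% Create one datastore spanning both files (hypothetical file names).
ds = tabularTextDatastore({'file1.csv', 'file2.csv'});

% Import only the columns you actually need (assumed column names).
ds.SelectedVariableNames = {'Name', 'Age', 'Height'};

% Read 100,000 rows per call instead of a whole file at a time.
ds.ReadSize = 100000;

while hasdata(ds)
    T = read(ds);   % next block, returned as a table
    % ... process this block here, e.g. update running totals ...
end

With ReadSize set, each call to read only holds one block in memory, so peak RAM usage stays far below the 25 GB file size; you can tune the block size to trade memory against the per-call overhead the question worries about.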

