Reading a large .dat file or some parts of it

Dear All,
I have a very large DAT file (almost 16 GB). It contains the electricity usage of 8000 customers over about 4 years, recorded every 30 minutes (so it has around 8000*4*365*24*2 rows!).
MS Excel lets me open this file, but it evidently loads only part of it. From that part I could work out that the format is something like this:
990814, 246745, 0, 2012-07-22 20:00:00, 3.25, 0,0,0,0
which corresponds to:
CUSTOMER_KEY, CALENDAR_KEY, EVENT_KEY, READING_DATETIME, GENERAL_SUPPLY_KWH, CONTROLLED_LOAD_KWH, GROSS_GENERATION_KWH, NET_GENERATION_KWH, OTHER_KWH
My main problem is that MATLAB cannot load the file because it runs out of memory.
I have read about fopen, fread, fscanf, textscan, etc., but I couldn't figure out whether it is possible to read only part of this DAT file instead of the whole thing. Is there a command that reads, for example, rows 100 to 1000 of the file without first loading all of it into memory?
I only need the usage of about 1000 customers for one month.
Thanks in advance for your help.

Accepted Answer

Walter Roberson on 25 Feb 2017
The calling sequence for textscan is:
textscan(SOURCE, FORMAT, COUNT, OPTIONS...)
where SOURCE is either a file identifier or a string, FORMAT is a string, and COUNT is the maximum number of times to apply the FORMAT.
So to read a particular portion of the file, you can use the HeaderLines option to skip everything before the part you want, and you can use COUNT to give the number of lines to process.
COUNT is not exactly a number of lines, though: if the file has empty lines then, unless you have carefully chosen your options, each empty line is treated as leading whitespace and ignored without incrementing the count. It is more accurate to say that, provided there is enough data, COUNT is the number of rows of data that are returned.
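As a minimal sketch of the approach above, the following reads rows 3 to 6 of a small sample file. The sample file, the variable names, and the choice to read the timestamp as text with %s are assumptions made for illustration; the format string is only inferred from the example row in the question.

```matlab
% Write a tiny sample file in the layout from the question (made-up data).
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
for k = 1:10
    fprintf(fid, '%d, 246745, 0, 2012-07-22 20:00:00, %.2f, 0,0,0,0\n', ...
        990800 + k, k/4);
end
fclose(fid);

% Read only rows firstRow..lastRow instead of the whole file.
firstRow = 3;
lastRow  = 6;
fmt = '%f %f %f %s %f %f %f %f %f';   % nine fields; timestamp kept as text

fid = fopen(fname, 'r');
C = textscan(fid, fmt, lastRow - firstRow + 1, ...   % COUNT = rows to return
    'Delimiter', ',', ...
    'HeaderLines', firstRow - 1);                     % lines to skip first
fclose(fid);

customerKey = C{1};   % CUSTOMER_KEY as a numeric column
readingTime = C{4};   % READING_DATETIME as a cell array of character vectors
supplyKWh   = C{5};   % GENERAL_SUPPLY_KWH as a numeric column

delete(fname);        % remove the temporary sample file
```

For the real file, replace the sample-file section with the actual file name. Skipping with HeaderLines still has to scan past the skipped lines, so for a 16 GB file it may be quicker to keep the file open and call textscan repeatedly with a COUNT, since each call continues from the current file position, and keep only the customers and dates of interest from each chunk.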
3 Comments
Rahimeh Rouhi on 8 Oct 2018
Dear Walter Roberson, could you please help? I have a large dataset of images stored as .mat files, and I have a similar problem. I used matfile to keep all the data on disk and load only parts of it, but it is very slow. What is a better way to load part of the data into the workspace? Would writing the data to a text file and loading it with the command you mentioned help?
Walter Roberson on 9 Oct 2018
How do you store the images inside the MAT file? Cell array? One variable per image? Multidimensional array? Struct array?
