load text file

I have a text file that is too large to load with the command file = load('file.txt'). It is basically many rows of numbers, with the columns separated by spaces.
1 2 3 1 4 2 1 56 2....
1 4 1 56 2 3 4 2 3 ....
....
Each row contains data from several channels. For instance, the data from channel 1 would be 1 2 3 1 4 1, the first 3 numbers of every line.
What I'd like to do is filter and downsample the data from each channel.
So basically I'm wondering which command is best for picking out specific parts of a text file without using "load".

1 Comment

Fangjun Jiang on 3 Nov 2011
How large is the file? How many rows and columns?


Accepted Answer

Walter Roberson on 3 Nov 2011


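fid = fopen('file.txt', 'r');   % open the data file; textscan needs a file identifier
samplesperchan = 3;             % number of samples per channel on each line
channelnum = 6;                 % channel to extract (for example)
% Skip the earlier channels, read this channel's samples, discard the rest of the line.
lineformat = [repmat('%*f', 1, samplesperchan*(channelnum-1)), repmat('%f',1,samplesperchan), '%*[^\n]'];
result = textscan(fid, lineformat, 'CollectOutput', 1);
fclose(fid);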
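To make the format string above concrete, here is a minimal sketch that expands it for channel 2 and applies it to the two sample lines from the question. It assumes three samples per channel, and it parses a character vector rather than a file purely for convenience; the variable txt is an illustrative stand-in, not part of the original answer.

```matlab
% Two sample lines from the question (assumed: 3 channels, 3 samples each).
txt = sprintf('1 2 3 1 4 2 1 56 2\n1 4 1 56 2 3 4 2 3\n');

samplesperchan = 3;   % samples per channel on each line
channelnum = 2;       % channel to extract

% '%*f' reads and discards a number, '%f' keeps it,
% and '%*[^\n]' discards whatever is left on the line.
lineformat = [repmat('%*f', 1, samplesperchan*(channelnum-1)), ...
              repmat('%f', 1, samplesperchan), '%*[^\n]'];
disp(lineformat)      % shows %*f%*f%*f%f%f%f%*[^\n]

% textscan also accepts text directly, which is handy for testing a format.
result = textscan(txt, lineformat, 'CollectOutput', 1);
disp(result{1})       % one row per input line, one column per kept sample
```

With 'CollectOutput' set to 1, the three kept columns come back as a single numeric matrix in result{1} instead of three separate cells.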

8 Comments

Baba on 3 Nov 2011
Walter,
I used your code:
samplesperchan = 3;
channelnum = 6; %for example
lineformat = [repmat('%*f', 1, samplesperchan*(channelnum-1)), repmat('%f',1,samplesperchan), '%*[^\n]'];
result = textscan('data.txt', lineformat, 'CollectOutput', 1);
but the value stored in result is not what I expected:
result is a 1x1 cell and it is empty.
Walter Roberson on 3 Nov 2011
I used this code except with channelnum = 2, with the data file
1 2 3 1 4 2 1 56 2
1 4 1 56 2 3 4 2 3
result was a 1 x 1 cell array and result{1} was
1 4 2
56 2 3
Perhaps you do not have as many as 6 channels of data in your file? channelnum should be set to the channel number you want to fetch, under the assumption that each line has samplesperchan samples per channel.
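One other possible explanation for the empty cell, which the thread does not confirm, is that the code above passes the file name itself to textscan. When textscan receives text instead of a file identifier, it parses that text, so the characters of 'data.txt' are scanned rather than the file contents. A minimal sketch of the difference, using a temporary file (fname is a hypothetical stand-in for the real data file):

```matlab
% Write the two sample lines to a temporary file (stand-in for data.txt).
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3 1 4 2 1 56 2\n1 4 1 56 2 3 4 2 3\n');
fclose(fid);

% Channel 2, assuming three samples per channel.
lineformat = ['%*f%*f%*f', '%f%f%f', '%*[^\n]'];

% Passing a name: textscan scans the characters 'data.txt', finds no numbers.
fromName = textscan('data.txt', lineformat, 'CollectOutput', 1);

% Passing a file identifier from fopen: textscan scans the file contents.
fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open %s', fname);
fromFile = textscan(fid, lineformat, 'CollectOutput', 1);
fclose(fid);
delete(fname);

disp(isempty(fromName{1}))   % whether the name-based call produced no data
disp(fromFile{1})            % samples of channel 2, one row per line
```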
Baba on 3 Nov 2011
Oh OK, I got the correct answer, thanks.
If I wanted to transpose each row of result{1} and stack the rows into a single column, would I have to use the reshape function?
That is, to get
1
4
2
56
2
3
Walter Roberson on 3 Nov 2011
If you want to convert
1 4 2
56 2 3
to the column vector
1
4
2
56
2
3
then use
reshape(result{1}.',[],1)
or use
t = result{1}.';
t(:)
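The transpose matters because MATLAB stores and reshapes matrices column by column. A short sketch, where A is a hypothetical stand-in for result{1}:

```matlab
A = [1 4 2; 56 2 3];            % stands in for result{1}

% Transposing first makes each original row a column, so the
% column-by-column readout follows the original row order.
col1 = reshape(A.', [], 1);

t = A.';
col2 = t(:);                    % same result via colon indexing

% Without the transpose the values come out in column-major order instead.
colNoTranspose = A(:);

disp(col1.')                    % 1 4 2 56 2 3
disp(colNoTranspose.')          % 1 56 4 2 2 3
```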
Baba on 3 Nov 2011
Wow, thanks!
Baba on 3 Nov 2011
Will this method work if my text file is 20 GB and the data pulled out by your code is 400 MB?
Baba on 3 Nov 2011
I'm getting an error:
??? Error using ==> textscan
Buffer overflow (bufsize = 4095) while reading string from
file (row 1, field 10). Use 'bufsize' option. See HELP
TEXTSCAN.
Error in ==> Untitled5 at 8
result = textscan(fid, lineformat, 'CollectOutput', 1);
and the maximum bufsize is 4095 bytes.
Is there a workaround that you know of?
Walter Roberson on 4 Nov 2011
It sounds as if you might need to use 'bufsize' with a large number -- as large as your longest line.
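On the earlier question about a 20 GB file, one option the thread does not spell out is to call textscan repeatedly with a repeat count, so that only a block of lines is parsed per call. The sketch below is an assumption-laden illustration, not the answerer's method: fname, blocksize, and the small generated file are all hypothetical, and it does not by itself address the bufsize error on very long lines.

```matlab
% Build a small stand-in file: 10 lines of 9 numbers (3 channels x 3 samples).
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, [repmat('%d ', 1, 8), '%d\n'], 1:90);
fclose(fid);

lineformat = ['%*f%*f%*f', '%f%f%f', '%*[^\n]'];   % keep channel 2 only
blocksize = 4;        % lines per call (assumed; tune to available memory)
chunks = {};

fid = fopen(fname, 'r');
while ~feof(fid)
    % The third argument applies the format at most blocksize times,
    % and the next call resumes where this one stopped.
    c = textscan(fid, lineformat, blocksize, 'CollectOutput', 1);
    if isempty(c{1})
        break;        % nothing more to read
    end
    chunks{end+1} = c{1};   % filter or downsample each block here if desired
end
fclose(fid);
delete(fname);

data = vertcat(chunks{:});
disp(size(data))
```

For a genuinely large file, processing each block inside the loop and keeping only the reduced result would avoid holding every block in memory at once.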


More Answers (1)

Ora Zyto on 3 Nov 2011


Depending on the format of your data, TEXTSCAN may be a good option. It is a very flexible function that lets you specify the format of your data programmatically.
You could read lines one by one and parse the data directly. If your data is rectangular, you could also try reading it all at once.
Ora
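The line-by-line approach mentioned in this answer could look roughly like the sketch below. It assumes three samples per channel and uses fgetl with sscanf, which is one of several ways to parse a line; fname and the temporary file are hypothetical stand-ins for the real data file.

```matlab
% Write the two sample lines from the question to a temporary file.
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3 1 4 2 1 56 2\n1 4 1 56 2 3 4 2 3\n');
fclose(fid);

samplesperchan = 3;   % assumed samples per channel on each line
channelnum = 2;       % channel to extract
cols = (channelnum-1)*samplesperchan + (1:samplesperchan);   % columns 4:6

rows = {};
fid = fopen(fname, 'r');
tline = fgetl(fid);               % returns -1 (not char) at end of file
while ischar(tline)
    vals = sscanf(tline, '%f').'; % every number on this line, as a row
    rows{end+1} = vals(cols);     % keep only the chosen channel
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

data = vertcat(rows{:});
disp(data)
```

Only one line is held in memory at a time, at the cost of parsing every number on the line rather than skipping the unwanted ones.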
