importing tab delimited text file

Danielle Leblanc on 27 Jul 2011
Hi,
I am importing a text file "A" using textscan. I know that the file is tab delimited, with an unknown number of columns and 34000 rows. Some columns are numbers, others are strings. I used textscan('A.txt','%s') and what I get is a 1x1 cell called data. This cell is 34000x1, where each row contains all the data of the corresponding row in the original file (i.e., not separated into columns, and there isn't any space that separates the rows). Any help on how to import the file is appreciated.
  1 Comment
Walter Roberson on 30 Jul 2011
textscan('A.txt','%s') is going to give you the cell array {'A.txt'} -- if the first argument to textscan() is a string, then the string itself is considered to be the input to be scanned.
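The difference can be seen in a small sketch. The sample file and its two lines of content are made up purely for illustration; only the two textscan call forms come from the discussion above.

```matlab
% Hypothetical two-line sample file, created only for this illustration.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'alpha 1\nbeta 2\n');
fclose(fid);

% Character vector as first argument: textscan scans those characters
% themselves. No file named A.txt is opened (none needs to exist).
fromName = textscan('A.txt', '%s');

% File identifier as first argument: textscan reads the file contents.
fid = fopen(fname, 'rt');
fromFile = textscan(fid, '%s %f');
fclose(fid);
delete(fname);

disp(fromName{1})   % the literal text that was passed in
disp(fromFile{1})   % first column of the sample file
disp(fromFile{2})   % second column of the sample file
```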


Answers (5)

Fangjun Jiang on 28 Jul 2011
Try a = importdata('A.txt') and see what you get. It often returns well-formatted data.
  4 Comments
Zeenat Islam on 9 Aug 2017
importdata ALL THE WAY! Worked like a charm for me
Walter Roberson on 9 Aug 2017
Note that in more recent versions, importdata by default now returns text as string objects instead of as cell arrays of character vectors. We are seeing people getting caught by that.
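One defensive pattern, whatever type a given release hands back, is to normalize text before using it. In this sketch `txt` is only a stand-in for a text field returned by an import; it assumes R2016b or newer, where string arrays exist and `cellstr` accepts them.

```matlab
% Stand-in for imported text (hypothetical values, string array form).
txt = string({'ABN.GG'; 'ABN'});

% Convert to a cell array of character vectors only when needed, so the
% rest of the code can rely on a single representation.
if ~iscellstr(txt)
    txt = cellstr(txt);
end

disp(class(txt))
```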



Walter Roberson on 30 Jul 2011
What is the delimiting character? Can any of the strings contain the delimiting character, and if so then how is it indicated that that delimiter is a part of the string rather than marking the end of the string?
  2 Comments
Danielle Leblanc on 30 Jul 2011
The delimiting character is ' ', i.e., columns are separated by a space.
Danielle Leblanc on 30 Jul 2011
Also, the columns are a mix of strings and numerics. I tried textscan('A.txt','%q') and textscan('A.txt','%c'), but I obtained the same output as with textscan('A.txt','%s').



Danielle Leblanc on 30 Jul 2011
Hi again,
Sorry to bother you with this problem, but I am a MATLAB beginner. I opened one of the tmt text files with Excel. The data has 33 columns, and I am putting the transpose of the columns below, as each has a different format (col1, col2, etc. is the column number, and what follows it is the data format):
col1 5
col2 6154
col3 T
col4 ABN.GG
col5 ABN
col6 00077T
col7 AA2
col8 N
col9 N
col10 4000
col11 A
col12 104.61
col13 +
col14 7.226596
col15 A
col16 20090109
col17 10:21:00
col18 0
col19 @
col20 A
col21 Y
col22 7
col23 104.61
col24 +
col25 7.226596
col26 104.61
col27 +
col28 7.226596
col29 104.61
col30 +
col31 7.226596
col32 11636
col33 20090109
Excel recognized the file correctly. I have 400 of these text files, so importing them from text to Excel and then to MATLAB would take a lot of time. How can I import this text file into MATLAB directly?
  1 Comment
Fangjun Jiang on 30 Jul 2011
It's hard to read your data due to the formatting. Can you paste 3 or 4 lines of your text file here and apply the code format to it?



Walter Roberson on 30 Jul 2011
You continue to have the same problem that I warned about earlier: textscan() with a string as its first argument reads the string, not a file denoted by the string. The older textread() routine expected a filename as the first argument, but textscan() never does.
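% Open the file first: textscan needs a file identifier, not a file name.
fid = fopen('A.txt','rt');
% One conversion specifier per column, 33 in total.
data = textscan(fid, '%f %f %s %s %s %s %s %s %s %f %s %f %s %f %s %f %s %f %s %s %s %d %f %s %f %f %s %f %f %s %f %f %f');
fclose(fid);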
The spaces within the format string are not important and can be left in or removed as desired.
I coded this in such a way that the dates such as 20090109 are read as numbers, but the time such as 10:21:00 is read as a string. textscan() is not able to directly read formatted times as times.
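If real timestamps are needed afterwards, the numeric date and the time string can be combined after reading. This is a minimal sketch, assuming a release with `datetime` (R2014b or newer); `dates` and `times` are stand-ins for data{16} and data{17}, and the second time value is invented for illustration.

```matlab
% Stand-in values for data{16} (numeric dates) and data{17} (time strings).
dates = [20090109; 20090109];
times = {'10:21:00'; '10:22:30'};

% Turn each numeric date into 8-character text, then join it with its time.
dateText = cellstr(num2str(dates, '%08d'));
stamps = strcat(dateText, {' '}, times);

% Parse the combined text; the format letters follow datetime conventions.
t = datetime(stamps, 'InputFormat', 'yyyyMMdd HH:mm:ss');
disp(t)
```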
The output, data, will be a cell array, containing one column vector per column of input, so for example data{2} would be a column vector of floating point numbers corresponding to column 2, one entry per line of input.
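For the 400 files mentioned in the question, the same call can be wrapped in a loop. The following is a sketch under stated assumptions: the scratch folder, the `sample*.txt` names, and the `allData` variable are hypothetical, and the sample row is the one quoted elsewhere in this thread. In practice `folder` would point at the directory holding the real files and the setup part would be dropped.

```matlab
% Format string from the answer above (33 columns), split for readability.
fmt = ['%f %f %s %s %s %s %s %s %s %f %s %f %s %f %s %f %s ' ...
       '%f %s %s %s %d %f %s %f %f %s %f %f %s %f %f %f'];

% Hypothetical setup: two small sample files in a scratch folder.
folder = tempname;
mkdir(folder);
row = ['5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A ' ...
       '20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 ' ...
       '104.61 + 7.226596 11636 20090109'];
for k = 1:2
    fid = fopen(fullfile(folder, sprintf('sample%d.txt', k)), 'wt');
    fprintf(fid, '%s\n%s\n', row, row);
    fclose(fid);
end

% Read every .txt file in the folder with the same format.
files = dir(fullfile(folder, '*.txt'));
allData = cell(numel(files), 1);
for k = 1:numel(files)
    fid = fopen(fullfile(folder, files(k).name), 'rt');
    if fid == -1
        warning('Could not open %s', files(k).name);
        continue
    end
    allData{k} = textscan(fid, fmt);   % 1x33 cell, one entry per column
    fclose(fid);
end
rmdir(folder, 's');   % remove the scratch folder again
```

Each `allData{k}` then holds the 1x33 cell array for one file, so `allData{k}{2}` is column 2 of the k-th file.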
  3 Comments
Danielle Leblanc on 30 Jul 2011
I tried it on other files as well. The output is different, but each cell of the cell array never contains more than 2 rows of data, although I have 35000 rows per file. For example, I am getting:
[95;2722]
[102.680000000000;9730]
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
<2x1 cell>
[NaN;9000]
<2x1 cell>
[NaN;102.050000000000]
<2x1 cell>
[NaN;8.80793700000000]
<2x1 cell>
[NaN;11]
<2x1 cell>
[NaN;0]
<2x1 cell>
<2x1 cell>
<2x1 cell>
0
NaN
''
NaN
NaN
''
NaN
NaN
''
NaN
NaN
NaN
Fangjun Jiang on 30 Jul 2011
You need to check the consistency of your text file. As long as the data format is consistent, the code Walter provided should return the correct result. I constructed three lines of text using your data. Here is the result:
A.txt
5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A 20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 104.61 + 7.226596 11636 20090109
5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A 20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 104.61 + 7.226596 11636 20090109
5 6154 T ABN.GG ABN 00077T AA2 N N 4000 A 104.61 + 7.226596 A 20090109 10:21:00 0 @ A Y 7 104.61 + 7.226596 104.61 + 7.226596 104.61 + 7.226596 11636 20090109
data 1x33 5640 cell
>> data{1}
ans =
5
5
5
>> data{33}
ans =
20090109
20090109
20090109
>> data{18}
ans =
0
0
0
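One way to run the consistency check suggested above is to count the whitespace-separated fields on every line and report the lines that deviate. This is only a sketch: the three-line sample file is made up, and `expected` would be 33 for the files discussed in this thread.

```matlab
% Hypothetical sample file: the second line is missing a field.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'a 1 x\nb 2\nc 3 z\n');
fclose(fid);

expected = 3;   % number of fields every line should have (33 in the thread)
bad = [];       % line numbers whose field count differs
lineNo = 0;

fid = fopen(fname, 'rt');
txtLine = fgetl(fid);
while ischar(txtLine)           % fgetl returns -1 at end of file
    lineNo = lineNo + 1;
    fields = regexp(strtrim(txtLine), '\s+', 'split');
    if numel(fields) ~= expected
        bad(end+1) = lineNo;    %#ok<AGROW>
    end
    txtLine = fgetl(fid);
end
fclose(fid);
delete(fname);

disp(bad)   % line numbers to inspect; here only line 2
```

Lines reported this way are good candidates for the point where textscan stops filling its output.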



Stephan Koehler on 7 Sep 2011
I wrote a routine for importing TSV files generated by Excel. Look at http://www.mathworks.com/matlabcentral/fileexchange/32782
