Extracting column from text file

b
b on 29 May 2020
Commented: b on 30 May 2020
The NOAA atmospheric data file with the entries defined by the header row comes in the following format:
USAF WBAN STATION NAME CTRY ST CALL LAT LON ELEV(M) BEGIN END
007018 99999 WXPOD 7018 +00.000 +000.000 +7018.0 20110309 20130730
007026 99999 WXPOD 7026 AF +00.000 +000.000 +7026.0 20120713 20170822
007070 99999 WXPOD 7070 AF +00.000 +000.000 +7070.0 20140923 20150926
008260 99999 WXPOD8270 +00.000 +000.000 +0000.0 20050101 20100920
I am trying to extract a given column, which can be 'Elev' or 'USAF' or 'STATION NAME' etc. Which column to extract is known a priori, for example column #1 (USAF). I am running into problems because 'STATION NAME' sometimes contains a blank inside its alphanumeric code and sometimes is a single code without any blanks. Other fields, for example CTRY, can also be blank. In the four data lines of the shortened input file above, 'ST' and 'CALL' are empty, but they can be filled (usually with alphabetic codes).
Also,
(1) How to extract the USAF entries for only those rows where CTRY == AF?
(2) How to extract all rows from rowNumber = 10000 to rowNumber = 20000 (say)?
Thanks.

Accepted Answer

per isakson
per isakson on 29 May 2020
Edited: per isakson on 30 May 2020
This is a fixed-width text file. The documentation includes a good description of how to read fixed-width text files.
Be careful to get the column widths right.
Also,
  1. How to extract the USAF entries for only those rows where CTRY == AF?
  2. How to extract all rows from rowNumber = 10000 to rowNumber = 20000 (say)?
Use readtable() and read all rows (if that doesn't cause memory problems). The tools you need for both tasks come with the table data type.
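As a minimal sketch of those two table operations: the small table below is a hypothetical stand-in for what readtable() would return, built from the sample values in the question. With the real data, the row limits would be 10000 and 20000 instead of the illustrative 2 and 3.

```matlab
% Hypothetical stand-in for the table returned by readtable()
USAF = [7018; 7026; 7070; 8260];
CTRY = {''; 'AF'; 'AF'; ''};
tbl  = table(USAF, CTRY);

% (1) USAF entries for the rows where CTRY equals 'AF'
isAF   = strcmp(tbl.CTRY, 'AF');   % logical mask, one element per row
usafAF = tbl.USAF(isAF);

% (2) A block of rows selected by row number
firstRow = 2;                          % illustrative; 10000 in the question
lastRow  = min(3, height(tbl));        % illustrative; 20000 in the question
rows     = tbl(firstRow:lastRow, :);   % parentheses return a table
```

Indexing with parentheses keeps the result a table, while dot indexing (tbl.USAF) returns the column's contents as an ordinary array.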
In response to comments
Since there are no delimiters in the data file, I find the message
Line 3 has 9 delimiters, while preceding lines have 8.
misleading. Even if each run of spaces (char(32)) is counted as one delimiter, the numbers 9 and 8 don't make sense.
I created the script below in three steps:
  1. Create the object, opts, with default values. Inspect opts.
  2. Type opts.<tab> in the Command Window (<tab> stands for tab completion). I identified four properties whose default values were not meaningful. I added statements to the script to assign values, which I took from the comments. (To save myself some trouble in the future, I modified the names to become legal MATLAB names.)
  3. Read the file with readtable().
%% Create import options with default values
ffs = fullfile('d:\m\cssm\noaa1lineHeaderFirst15lines.txt');
opts = fixedWidthImportOptions; % default values
%% Override the four properties whose defaults do not fit this file
opts.DataLines = [ 2, inf ]; % data starts on line 2, after the header
opts.VariableNames = { 'USAF','WBAN','STATION_NAME','CTRY','ST' ...
, 'CALL','LAT','LON','ELEV_M_','BEGIN','END' }; % legal MATLAB names
opts.VariableTypes = { 'double','double','char','char','char','char' ...
, 'double','double','double','double','double' };
opts.VariableWidths = [ 7, 6, 30, 5, 3, 6, 8, 9, 8, 9, 9 ]; % characters per field
%% Read the file
tbl = readtable( ffs, opts );
No error messages so far.
>> tbl
tbl =
  14×11 table
USAF    WBAN     STATION_NAME    CTRY    ST    CALL    LAT      LON       ELEV_M_    BEGIN         END
____    _____    ____________    ____    __    ____    _____    ______    _______    __________    __________
7018    99999    'WXPOD 7018'    ''      ''    ''      0        0         7018       2.011e+07     2.0131e+07
7026    99999    'WXPOD 7026'    'AF'    ''    ''      0        0         7026       2.0121e+07    2.0171e+07
7070    99999    'WXPOD 7070'    'AF'    ''    ''      0        0         7070       2.0141e+07    2.0151e+07
8260    99999    'WXPOD8270'     ''      ''    ''      0        0         0          2.005e+07     2.0101e+07
8268    99999    'WXPOD8278'     'AF'    ''    ''      32.95    65.567    1156.7     2.0101e+07    2.012e+07
8307    99999    'WXPOD 8318'    'AF'    ''    ''      0        0         8318       2.01e+07      2.01e+07
8411    99999    'XM20'          ''      ''    ''      NaN      NaN       NaN        2.016e+07     2.016e+07
...
Looks ok
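To illustrate how the values in opts.VariableWidths map to character positions, here is a small sketch. The test line is hypothetical: it is built by padding sample field values to the widths used in the script, so it is not a copy of a line from the actual NOAA file. The variable names (fields, txt, parsed) are also only for illustration.

```matlab
% Column widths taken from the script above
widths   = [7, 6, 30, 5, 3, 6, 8, 9, 8, 9, 9];
lastCol  = cumsum(widths);          % position of the last character of each field
firstCol = lastCol - widths + 1;    % position of the first character of each field

% Hypothetical line: sample values padded with blanks to the widths above
fields = {'007026','99999','WXPOD 7026','AF','','', ...
          '+00.000','+000.000','+7026.0','20120713','20170822'};
txt = '';
for k = 1:numel(fields)
    txt = [txt, fields{k}, blanks(widths(k) - numel(fields{k}))]; %#ok<AGROW>
end

% Slice the line back into fields and trim the padding
parsed = cell(size(fields));
for k = 1:numel(widths)
    parsed{k} = strtrim(txt(firstCol(k):lastCol(k)));
end
disp(parsed{3})   % station name, including its embedded blank
```

Because the fields are cut by position rather than by whitespace, a blank inside STATION NAME or an empty CTRY, ST or CALL field does not shift the columns that follow. Printing firstCol and lastCol next to a few lines of the real file is one way to check that the widths are right.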
  10 Comments
per isakson
per isakson on 30 May 2020
See my answer, I've added a script that reads your sample data file.
b
b on 30 May 2020
Works perfectly.
Some people on this site (MATLAB Central Answers) are extremely helpful. How can their help ever be repaid?


More Answers (1)

b
b on 29 May 2020
Edited: b on 29 May 2020
Thanks for suggesting the paper clip icon.
I have attached the file (a short one), but it contains almost everything that the rest of the (very) big file contains.
The errors occur when using this short test file as input.
