How to import data from custom (.H00) text file

Hi,
I have a DAQ (Opto22 Snap Pac) that provides data in a custom text file .H00.
The data looks like this:
Date,Time,PAC SNAP R2:T1_Type_C.Value,PAC SNAP R2:T2_Type_C.Value,PAC SNAP R2:T3_Type_C.Value,PAC SNAP R2:T9_Type_K.Value,PAC SNAP R2:T10_Type_K.Value,PAC SNAP R2:T11_Type_K.Value,PAC SNAP R2:Xe_Flow_Measured_out,PAC SNAP R2:V_Discharge,PAC SNAP R2:V_Keeper,PAC SNAP R2:V_Heater,PAC SNAP R2:I_Discharge_Measured_out,PAC SNAP R2:I_Keeper_Measured_out,PAC SNAP R2:I_Heater_Measured_out,PAC SNAP R2:Chamber_Pressure
2016/05/26,09:44:08,20.9,20.7,21.2,23.5,22.5,22.8,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000
2016/05/26,09:44:09,20.9,20.7,21.2,23.5,22.5,22.8,0.112,0.004,0.031,0.001,-0.015,-0.005,0.002,5.762
2016/05/26,09:44:10,20.9,20.7,21.2,23.5,22.5,22.8,0.123,0.004,0.031,-0.001,-0.015,-0.005,0.002,5.762
2016/05/26,09:44:11,20.9,20.7,21.2,23.5,22.4,22.8,0.123,0.008,0.031,0.001,-0.015,-0.005,0.002,5.762
I am using textscan to import it into MATLAB. Ideally, I would like to get every column as a column vector of floats.
So far I can get the header as
A_text = textscan(fileID,'%s',16,'Delimiter',',');
And the data as:
A = textscan(fileID,'%s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s',4,'Delimiter',',');
For some reason, MATLAB imports a space between every pair of characters, even though the source file contains none. That makes the import very hard. For example:
- I would like to import the first column with %{yyyy/MM/dd}D, but because of the spaces it does not work.
- Column 12 cannot be converted with cell2mat, because the minus sign makes the strings inside the cell array inconsistent in length.
Any help will be welcome. Thank you.

Accepted Answer

Cedric on 27 May 2016
Edited: Cedric on 27 May 2016
Could you attach one of these files?
If you had no issue with encoding, the following would be an efficient way to do it:
fId = fopen( 'data.h00', 'r' ) ;
nCols = sum( fgetl( fId ) == ',' ) + 5 ;
data = reshape( fscanf( fId, '%f%*c' ), nCols, [] ).' ;
fclose( fId ) ;
This reads the whole block of numbers in one shot ('%f' reads a number, '%*c' skips the single separator character after it) and reshapes it to match the column structure, with the date and the time each split into their three components. The first line (the header) is read only to count columns: the number of fields is #commas + 1, and splitting the date and the time adds 2 columns each, hence the + 5.
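To make the column count concrete, here is a small sketch that applies the same '%f%*c' pattern to an in-memory sample instead of a file. The shortened header, the two sample rows, and the variable names (header, rows, t, values) are made up for illustration, and it assumes the text is already plain single-byte characters, that is, no encoding problem. The final conversion with datetime is an addition for convenience, not part of the answer above.

```matlab
% Shortened, hypothetical header: 4 fields, so 3 commas.
header = 'Date,Time,T1,I_Discharge';

% Two sample rows in the same layout as the file (assumed plain ASCII).
rows = sprintf(['2016/05/26,09:44:08,20.9,-0.015\n', ...
                '2016/05/26,09:44:09,21.0,0.002\n']);

% Fields = #commas + 1; date and time each split into 3 numbers (+2 each).
nCols = sum(header == ',') + 5;            % 3 + 5 = 8

% '%f' reads one number, '%*c' skips the single separator that follows
% it ('/', ':', ',' or the newline).
data = reshape(sscanf(rows, '%f%*c'), nCols, []).';

% Columns 1:6 are year, month, day, hour, minute, second.
t = datetime(data(:, 1:6));
values = data(:, 7:end);                   % one column per measured channel
```

Each row of data then holds six date/time numbers followed by the measurements, so negative values such as -0.015 need no special handling.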

1 Comment

Cedric on 28 May 2016
Edited: Cedric on 28 May 2016
You can open the file as Walter suggests above, and disable the warning if it bothers you:
warning( 'off', 'MATLAB:iofun:UnsupportedEncoding' ) ;
and use TEXTSCAN/FSCANF, or pass every other character to TEXTSCAN/FSCANF after reading the file as UTF-8:
fId = fopen( 'RD160527.H00', 'r' ) ;
nCols = sum( fgetl( fId ) == ',' ) + 5 ;
buffer = fread( fId, Inf, '*char' ) ;
data = reshape( sscanf( buffer(2:2:end), '%f%*c' ), nCols, [] ).' ;
fclose( fId ) ;
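The reason every other character can be dropped is easier to see on a small sample. The sketch below assumes the logger writes UTF-16LE, as suggested elsewhere in this thread; the sample string and variable names are made up for illustration.

```matlab
% Encode a sample row as UTF-16LE bytes, as the logger is assumed to do.
sample = '2016/05/26,09:44:08,-0.015';
bytes  = unicode2native(sample, 'UTF-16LE');   % uint8, 2 bytes per character

% Reading those bytes as single-byte text interleaves char(0) with the
% data; these null characters are what show up as "spaces".
asBytes = char(bytes);

% Keeping every other character recovers the text. Here the data bytes
% sit at odd positions. In the code above they presumably sit at even
% positions because fgetl stops after the first byte of the two-byte
% newline, leaving its null byte at the start of the buffer.
clean = asBytes(1:2:end);

% Equivalent, more general route: decode the bytes properly.
decoded = native2unicode(bytes, 'UTF-16LE');
```

Dropping every other byte only works while all characters are in the ASCII range, which should hold for a numeric log like this one. Decoding with native2unicode does not rely on that.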


More Answers (3)

Walter Roberson on 27 May 2016
I predict that the file is UTF-16 encoded and that you did not give the encoding when you fopen()'d the file.

1 Comment

The file is UTF-16LE with no BOM (byte order mark). Open it with
fid = fopen('RD160527.H00', 'r', 'n', 'UTF16-LE');
and ignore the warning about UTF16-LE not being supported. textscan() should be able to handle the file fine once it is opened as above.
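Once the text decodes cleanly, a textscan call along these lines should give one column vector per channel. This is only a sketch run on an in-memory sample rather than the real file: the two sample rows, the reduced number of value columns (nValues) and the variable names are assumptions for illustration. For the actual file, pass fid instead of txt, set nValues to 14, and add 'HeaderLines', 1 to skip the header.

```matlab
% In-memory stand-in for the decoded file contents (hypothetical sample).
txt = sprintf(['2016/05/26,09:44:08,20.9,-0.015\n', ...
               '2016/05/26,09:44:09,20.9,0.002\n']);

nValues = 2;                                   % 14 in the real file
fmt = ['%s %s', repmat(' %f', 1, nValues)];    % date, time, then floats

% Read date and time as text and every remaining column as double.
C = textscan(txt, fmt, 'Delimiter', ',');

% Join date and time, then convert them to datetime in a single step.
t = datetime(strcat(C{1}, {' '}, C{2}), ...
             'InputFormat', 'yyyy/MM/dd HH:mm:ss');

% One column per channel; values(:, k) is a float column vector.
values = [C{3:end}];
```

Reading the date and the time as text and converting them together keeps the format string simple. Parsing the date directly with %{yyyy/MM/dd}D in textscan is another option.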


Pablo Guerrero on 27 May 2016
Thank you Walter.
Your code tells me the encoding is "windows-1252". I hope textscan can deal with it.

1 Comment

Even when I specify the encoding, the spaces are still there:
fileID = fopen([PathName,FileName],'r','l','windows-1252');
