Read binary with ascii header

David on 24 Nov 2015
Commented: Guillaume on 24 Nov 2015
Hi,
I have a .vtu file which contains numerical data (velocity values at certain x, y, z points) in binary format. The format and other information about the file are given in a short (ASCII!) header at the beginning of the file:
<VTKFile type="UnstructuredGrid" version="0.1" byte_order="LittleEndian" compressor="vtkZLibDataCompressor">
<UnstructuredGrid>
<Piece NumberOfPoints=" 318105" NumberOfCells=" 1779159">
<Points>
<DataArray type="Float64" NumberOfComponents="3" format="binary">
NumberOfPoints specifies the length of the file, and there should be 7 columns of data. As specified, the data array is given in Float64 format.
Anyway, the question is: how can I read in the ASCII header, extract information from it, and then proceed by reading the rest of the file as binary?
Any ideas?

Answers (2)

Guillaume on 24 Nov 2015
You can still use text-reading operations on a binary file, so use fgetl (or fgets) to read your text (assuming the header actually consists of lines ending with a newline), then normal binary reading (with fread).
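fid = fopen('yourfile.vtu');
% read the 5 header lines as text
header = arrayfun(@(~) fgetl(fid), 1:5, 'UniformOutput', false);
% the number of points is on line 3, padded with spaces after the opening quote
numpoints = str2double(regexp(header{3}, '(?<=NumberOfPoints=" +)\d+', 'match', 'once'));
% the data type is on line 5
datatype = regexp(header{5}, '(?<=type=")[^"]+(?=")', 'match', 'once');
% fread continues from the current file position, i.e. right after the header
points = fread(fid, [numpoints 7], lower(datatype)); % assumes datatype is a valid precision for fread; 'float64' is
fclose(fid);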
Note that I assumed the format of your header is constant: always 5 lines, with the number of points on the third and the data type on the fifth.
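If the line positions turn out not to be fixed, one option is to scan every header line for the attributes instead of indexing lines 3 and 5. This is only a sketch: the header lines are copied from the question into a cell array (in practice they would come from fgetl), and the patterns assume the attributes are written exactly as shown there.

```matlab
% Header lines as quoted in the question (normally collected with fgetl).
header = { ...
    '<VTKFile type="UnstructuredGrid" version="0.1" byte_order="LittleEndian" compressor="vtkZLibDataCompressor">', ...
    '<UnstructuredGrid>', ...
    '<Piece NumberOfPoints=" 318105" NumberOfCells=" 1779159">', ...
    '<Points>', ...
    '<DataArray type="Float64" NumberOfComponents="3" format="binary">'};

numpoints = NaN;
datatype = '';
for k = 1:numel(header)
    txt = header{k};
    % Capture the digits, allowing padding spaces after the opening quote.
    tok = regexp(txt, 'NumberOfPoints="\s*(\d+)"', 'tokens', 'once');
    if ~isempty(tok)
        numpoints = str2double(tok{1});
    end
    % Only accept a type attribute that belongs to a DataArray tag.
    tok = regexp(txt, '<DataArray[^>]*\stype="([^"]+)"', 'tokens', 'once');
    if ~isempty(tok)
        datatype = lower(tok{1});   % 'Float64' -> 'float64'
    end
end
fprintf('%d points of type %s\n', numpoints, datatype);
```

Anchoring the second pattern on `<DataArray` keeps it from picking up the `type="UnstructuredGrid"` attribute on the first line.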
3 Comments
Guillaume on 24 Nov 2015
float64 (in lowercase) is a valid type for fread. It is equivalent to double.
There can be several reasons for the binary part not to read properly:
- End-of-line issues. As you probably know, there are different conventions for marking the end of a text line. Windows text files typically use '\r\n', Unix text files use '\n'. Other exotic formats may use something different. When you read files in text mode, MATLAB automatically detects the correct line ending and strips it off. In binary mode, I'm not sure what happens (the doc is not clear on that), so possibly part of the end of the last line is left over and causes an offset error.
- Are you sure the data starts immediately after the text portion? If it does not, this would also cause an offset error.
- I assumed that 'float64' means IEEE 754 double precision float. It could mean something else.
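One way to check for the offset problems listed above is to compare the number of bytes left after the header with the number of bytes the data should occupy. The sketch below builds a small temporary file (a stand-in, not the real .vtu) with a one-line '\n'-terminated header; the names fname, dataStart, remaining and expected are only illustrative.

```matlab
% Build a small stand-in file: one text line, then 4x3 float64 values.
fname = [tempname '.bin'];
numpoints = 4;
ncols = 3;
fid = fopen(fname, 'w');
fprintf(fid, '<Header>\n');
fwrite(fid, zeros(numpoints, ncols), 'float64');
fclose(fid);

fid = fopen(fname, 'r');
fgetl(fid);                          % consume the header line
dataStart = ftell(fid);              % where fread would start reading
fseek(fid, 0, 'eof');
fileSize = ftell(fid);
fseek(fid, dataStart, 'bof');        % go back to the start of the data

remaining = fileSize - dataStart;    % bytes actually available
expected = numpoints * ncols * 8;    % 8 bytes per float64 value
fprintf('remaining: %d bytes, expected: %d bytes\n', remaining, expected);

% Peek at the first raw bytes, then rewind so the real read is unaffected.
firstBytes = fread(fid, 8, 'uint8')';
fseek(fid, dataStart, 'bof');
fclose(fid);
delete(fname);
```

On a real file, a remaining count that differs from the expected one would suggest that the data is compressed or encoded, that more text follows the data, or that the header ends somewhere other than where the code assumes.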
Guillaume on 24 Nov 2015
Another two things:
- Endianness seems to be encoded in the file. It may be safer to use it explicitly in the fread code, although if you are on Windows, you were already reading in little endian:
endianness = regexp(header{1}, '(?<=byte_order=")[^"]+', 'match', 'once');
points = fread(fid, [numpoints 7], lower(datatype), 0, lower(endianness(1))); % assumes endianness starts with either 'L' or 'B'
- More importantly, your header includes compressor="vtkZLibDataCompressor">, which would indicate that your data is compressed with zlib. If that is the case, then it is a lot more complicated to read. There are submissions on the File Exchange to decompress zlib.
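As an illustration of the decompression step only, the Java classes that ship with MATLAB can inflate a zlib stream. This is a sketch under assumptions: it needs MATLAB running with the JVM, it treats the input as one complete raw zlib stream, and it does not handle whatever framing the VTK writer puts around the compressed data (block headers, any text encoding). The sample bytes are made up so the round trip can be run on its own.

```matlab
% Made-up sample bytes, compressed first so there is a zlib stream to inflate.
original = uint8('velocity data 1 2 3 velocity data 1 2 3');

compressedBuffer = java.io.ByteArrayOutputStream();
deflater = java.util.zip.DeflaterOutputStream(compressedBuffer);
deflater.write(original, 0, numel(original));
deflater.close();
% toByteArray comes back as int8, so reinterpret the bytes as uint8.
compressed = typecast(compressedBuffer.toByteArray(), 'uint8');

% Inflate: bytes written to the InflaterOutputStream come out decompressed.
inflatedBuffer = java.io.ByteArrayOutputStream();
inflater = java.util.zip.InflaterOutputStream(inflatedBuffer);
inflater.write(compressed, 0, numel(compressed));
inflater.close();
restored = typecast(inflatedBuffer.toByteArray(), 'uint8')';

% The inflated bytes can then be reinterpreted, e.g. typecast(restored, 'double'),
% provided their count is a multiple of 8.
```

For the actual .vtu file, the bytes passed to the inflater would first have to be extracted according to the VTK file format description, which this sketch does not attempt.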



Thorsten on 24 Nov 2015
Edited: Thorsten on 24 Nov 2015
Use fgets to read the first ASCII lines and then use fread (with appropriate arguments) for the remaining binary stuff.
2 Comments
David on 24 Nov 2015
Hi,
thanks for the reply. I can easily read the first header lines using fgets, but I'm not sure what arguments to give fread (or whatever good function to use for the binary reading) in order for it to know how to skip the header lines.
Do you have any suggestion on that?
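On the question of skipping the header: fread does not need to be told. fgetl, fgets and fread share one file position, so a read simply continues where the previous one stopped. A minimal round trip shows this; the temporary file, its made-up two-line header and the variable names are only for illustration.

```matlab
fname = [tempname '.bin'];                 % temporary demo file, not the real .vtu
values = [1.5 2.5; 3.5 4.5; 5.5 6.5];      % 3 rows, 2 columns of doubles

% Write two text header lines followed directly by raw float64 data.
fid = fopen(fname, 'w');
fprintf(fid, '<Demo rows="3" cols="2">\n');
fprintf(fid, '<Data type="Float64">\n');
fwrite(fid, values, 'float64');
fclose(fid);

% Read it back: two fgetl calls, then fread from the current position.
fid = fopen(fname, 'r');
line1 = fgetl(fid);
line2 = fgetl(fid);
offset = ftell(fid);                       % byte offset where the binary part starts
readBack = fread(fid, [3 2], 'float64');   % no skip argument needed
fclose(fid);
delete(fname);

disp(isequal(readBack, values));
```

If the header length in bytes is known in advance, fseek(fid, offset, 'bof') would jump straight to the data without reading the text at all.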
Image Analyst on 24 Nov 2015
Here are two lines of code from one of my utilities that reads 2-D image slices from a raw 3-D image:
dataLengthString = '*uint16'; % You need the *, otherwise fread returns doubles.
oneFullSlice = fread(fileHandle, [rows, columns], dataLengthString);
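The effect of the leading * can be seen with a small made-up file; the file name and values below are only for the demonstration.

```matlab
% Write four uint16 values to a temporary raw file.
fname = [tempname '.raw'];
fid = fopen(fname, 'w');
fwrite(fid, uint16([10 20; 30 40]), 'uint16');
fclose(fid);

fid = fopen(fname, 'r');
asDouble = fread(fid, [2 2], 'uint16');    % values are converted to double
frewind(fid);                              % back to the start of the file
asUint16 = fread(fid, [2 2], '*uint16');   % values keep their uint16 class
fclose(fid);
delete(fname);

disp(class(asDouble));   % double
disp(class(asUint16));   % uint16
```

For Float64 data the distinction does not matter, since the default output class of fread is already double.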
