fwrite and MATLAB for a raid0 disk - Only one lane?

Hello everyone,
I have a RAID 0 NVMe volume (four NVMe drives combined through a PCIe adapter card).
The volume works great (up to 12 GB/s OUTSIDE MATLAB, PCIe 3.0), but I cannot reach that speed in MATLAB.
It looks like MATLAB is using a single bus lane (i.e. 3.5 GB/s) to write the data to the disk. A simple example:
data = randn(1024, 1024, 1024, 'double'); %8 GB
fid = fopen('test.bin', 'W');
tic;
fwrite(fid, data(:), 'double');
toc;
fclose(fid);
That takes about 2.3 seconds, which is about 3.5 GB/s, as if only one "lane" (a single drive's x4 link) were in use, whereas the RAID 0 spans four of them (4x4 PCIe).
I am running out of ideas. This is not related to the disk or the RAID 0 itself: I tested many RAID 0 configurations (BIOS, VROC, Windows RAID), and the issue only occurs in MATLAB. Using HDF5 files does not solve the issue; it seems to be related to MATLAB itself.
FYI: I need this speed. In my field/lab we generate about 1 TB of data every 5 minutes, and the bottleneck is always saving the data.
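For reference, the figures quoted above can be turned into rates with a couple of lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, 1 TB is assumed to mean 1e12 bytes, and the data is assumed to arrive at a steady rate.

```c
#include <assert.h>

/* Hypothetical helpers, not part of MATLAB or any library. */

/* Average rate in GB/s, with 1 GB = 1e9 bytes. */
double rate_gb_per_s(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / 1e9;
}

/* Average rate in GiB/s, with 1 GiB = 1024^3 bytes. */
double rate_gib_per_s(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / (1024.0 * 1024.0 * 1024.0);
}

/*
 * Values from the post:
 *   1024^3 doubles * 8 bytes = 8589934592 bytes, written in 2.3 s
 *     rate_gb_per_s(8589934592.0, 2.3)   is about 3.73 (GB/s)
 *     rate_gib_per_s(8589934592.0, 2.3)  is about 3.48 (GiB/s)
 *   1 TB per 5 minutes (assuming 1e12 bytes in 300 s)
 *     rate_gb_per_s(1e12, 300.0)         is about 3.33 (GB/s)
 */
```

Under those assumptions the measured rate sits only slightly above the average rate at which the data is produced, which leaves very little headroom.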
EDIT 1: Removed "b" argument from "fopen"
EDIT 2: Added type "double" to "fwrite"
Thanks a lot.
  5 Comments
Walter Roberson on 30 Mar 2022
Getting high speed transfer to disk can require using special system calls. I do not have any information about how it is done in Windows; in Linux apparently there are methods that can avoid round-trips to user mode. It is unlikely that MATLAB implements those methods.
In Windows... I don't know. Is WriteFileEx still used in practice? https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefileex That does asynchronous writes, which historically has been an important step in performance improvement. Or perhaps WriteFileGather() https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefilegather ?
In a logging situation, you would like to be able to grab a buffer full of input, schedule it to be written, and continue on without waiting for the I/O to complete.
I suspect that MATLAB simply uses the C or C++ fwrite() (https://www.cplusplus.com/reference/cstdio/fwrite/), which waits for the I/O to complete.
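The "grab a buffer, schedule it to be written, and continue" pattern described above can be sketched with a background writer thread. This is a minimal illustration only, assuming POSIX threads are available; every name in it (bg_writer and its functions) is hypothetical, it uses plain fwrite() rather than WriteFileEx or any other Windows call, and nothing here is claimed to be how MATLAB works internally.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical double-buffering writer: one buffer may be in flight
 * while the caller fills the next one. */
typedef struct {
    FILE *fp;
    const void *pending;      /* buffer currently owned by the writer */
    size_t pending_bytes;
    int has_work, stop, error;
    pthread_mutex_t mu;
    pthread_cond_t cv;
    pthread_t thread;
} bg_writer;

static void *bg_writer_main(void *arg)
{
    bg_writer *w = arg;
    pthread_mutex_lock(&w->mu);
    for (;;) {
        while (!w->has_work && !w->stop)
            pthread_cond_wait(&w->cv, &w->mu);
        if (!w->has_work)
            break;                        /* stop requested, nothing pending */
        const void *buf = w->pending;
        size_t n = w->pending_bytes;
        pthread_mutex_unlock(&w->mu);     /* do the slow write unlocked */
        int failed = fwrite(buf, 1, n, w->fp) != n;
        pthread_mutex_lock(&w->mu);
        if (failed)
            w->error = 1;
        w->has_work = 0;                  /* buffer is free again */
        pthread_cond_broadcast(&w->cv);
    }
    pthread_mutex_unlock(&w->mu);
    return NULL;
}

int bg_writer_open(bg_writer *w, const char *path)
{
    memset(w, 0, sizeof *w);
    w->fp = fopen(path, "wb");
    if (w->fp == NULL)
        return -1;
    pthread_mutex_init(&w->mu, NULL);
    pthread_cond_init(&w->cv, NULL);
    if (pthread_create(&w->thread, NULL, bg_writer_main, w) != 0) {
        pthread_cond_destroy(&w->cv);
        pthread_mutex_destroy(&w->mu);
        fclose(w->fp);
        return -1;
    }
    return 0;
}

/* Hands buf to the writer and returns. Blocks only while the previously
 * submitted buffer is still being written. The caller must leave buf
 * untouched until the next submit (or close) has returned. */
void bg_writer_submit(bg_writer *w, const void *buf, size_t bytes)
{
    pthread_mutex_lock(&w->mu);
    assert(!w->stop);                     /* submitting after close is a bug */
    while (w->has_work)
        pthread_cond_wait(&w->cv, &w->mu);
    w->pending = buf;
    w->pending_bytes = bytes;
    w->has_work = 1;
    pthread_cond_broadcast(&w->cv);
    pthread_mutex_unlock(&w->mu);
}

/* Waits for the last buffer, closes the file; 0 on success, -1 on error. */
int bg_writer_close(bg_writer *w)
{
    pthread_mutex_lock(&w->mu);
    w->stop = 1;
    pthread_cond_broadcast(&w->cv);
    pthread_mutex_unlock(&w->mu);
    pthread_join(w->thread, NULL);
    int rc = (fclose(w->fp) != 0 || w->error) ? -1 : 0;
    pthread_cond_destroy(&w->cv);
    pthread_mutex_destroy(&w->mu);
    return rc;
}
```

A caller would alternate between two buffers: fill A, submit A, fill B while A is being written, submit B, and so on. Whether this helps with the throughput problem in this thread is untested; it only overlaps acquisition with writing.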
Vincent Perrot on 30 Mar 2022
@Walter Roberson I wrote a MEX file using WriteFile, without success. I will try asynchronous writes with WriteFileEx and also try WriteFileGather.
I have contacted support to get some answers about this.
I tried fwrite/ofstream/WriteFile (in MEX files), even in chunks, without any success.
Thanks for taking the time, I will read those links and try those approaches.
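When comparing chunked variants like the ones mentioned above, it helps to time them all the same way. The following is a rough sketch of such a measurement in standard C11; the function name timed_chunked_write is hypothetical, and it uses wall-clock time from timespec_get. The timing includes fclose() but does not force data to the physical disk, so the number can partly reflect the operating system's cache rather than the drives.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: writes `bytes` bytes from `data` to `path` in pieces
 * of at most `chunk` bytes. Returns the average rate in GB/s (1e9 bytes),
 * or -1.0 if the file could not be opened or a write failed. */
double timed_chunked_write(const char *path, const void *data,
                           size_t bytes, size_t chunk)
{
    assert(chunk > 0);

    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);          /* wall-clock start */

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1.0;
    /* Disable stdio buffering so each chunk is passed on directly. */
    setvbuf(fp, NULL, _IONBF, 0);

    const char *p = data;
    size_t left = bytes;
    int ok = 1;
    while (left > 0) {
        size_t n = left < chunk ? left : chunk;
        if (fwrite(p, 1, n, fp) != n) {
            ok = 0;
            break;
        }
        p += n;
        left -= n;
    }
    if (fclose(fp) != 0)
        ok = 0;

    timespec_get(&t1, TIME_UTC);          /* wall-clock end */
    if (!ok)
        return -1.0;

    double seconds = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return seconds > 0.0 ? (double)bytes / seconds / 1e9 : 0.0;
}
```

Calling it with several chunk sizes (for example 1 MiB, 64 MiB, and the whole array) on the same data gives comparable numbers for each variant.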


Answers (2)

Jan on 29 Mar 2022
Edited: Jan on 29 Mar 2022
What about trying it as C-Mex?
data = randn(1024, 1024, 1024, 'double'); %8 GB
tic
uglyCWrite(data);
toc
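// Short hack, UNTESTED!!!
// uglyCWrite.c
#include "mex.h"
#include <stdio.h>
#include <stdlib.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *data;
    size_t n, w;
    FILE *fid;
    if (nrhs < 1) {
        mexErrMsgTxt("One input array is required.");
    }
    data = (double *) mxGetData(prhs[0]);
    n = mxGetNumberOfElements(prhs[0]);
    w = mxGetElementSize(prhs[0]);
    // "wb": binary mode, so Windows does not translate newline bytes
    fid = fopen("test.bin", "wb");
    if (fid == NULL) {
        mexErrMsgTxt("Cannot open test.bin for writing.");
    }
    // fwrite takes the element size first, then the element count
    if (fwrite(data, w, n, fid) != n) {
        fclose(fid);
        mexErrMsgTxt("Short write to test.bin.");
    }
    fclose(fid);
}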
  2 Comments
Vincent Perrot on 29 Mar 2022
Edited: Vincent Perrot on 29 Mar 2022
Thank you for taking the time to put that piece of code together.
This morning I tested several MEX implementations from this post: https://stackoverflow.com/questions/70126690/write-binary-file-to-disk-super-fast-in-mex
Those are not faster than fwrite in MATLAB:
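#include <stdint.h>
#include <stdio.h>
void writeBinFile(int16_t *data, size_t size)
{
    FILE *fID;
    // C's fopen has no "W" mode (that one is MATLAB's); "wb" writes binary
    fID = fopen("file_fopen.bin", "wb");
    if (fID == NULL)
        return;
    fwrite(data, sizeof(int16_t), size, fID);
    fclose(fID);
}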
#include <cstdint>
#include <fstream>
void writeBinFileFast(int16_t *data, size_t size)
{
    std::ofstream file("file_ostream.bin", std::ios::out | std::ios::binary);
    file.write(reinterpret_cast<const char *>(data), size * sizeof(int16_t));
    file.close();
}
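#include <windows.h>
void writeBinFilePartByPart(int16_t *int_data, size_t size)
{
    const size_t part = 64 * 1024 * 1024;  // 64 MiB per WriteFile call
    size = size * sizeof(int16_t);         // from here on, size is in bytes
    char *data = reinterpret_cast<char *> (int_data);
    HANDLE file = CreateFileA (
        "windows_test.bin",
        GENERIC_WRITE,
        0,
        NULL,
        CREATE_ALWAYS,
        FILE_FLAG_SEQUENTIAL_SCAN,
        NULL);
    if (file == INVALID_HANDLE_VALUE)
        return;
    // Expand file size. SetFilePointerEx takes a 64-bit offset; plain
    // SetFilePointer with a NULL high part only accepts a 32-bit LONG,
    // which is too small for multi-gigabyte files.
    LARGE_INTEGER pos;
    pos.QuadPart = static_cast<LONGLONG> (size);
    SetFilePointerEx (file, pos, NULL, FILE_BEGIN);
    SetEndOfFile (file);
    pos.QuadPart = 0;
    SetFilePointerEx (file, pos, NULL, FILE_BEGIN);
    DWORD written;
    if (size < part)
    {
        WriteFile (file, data, static_cast<DWORD> (size), &written, NULL);
        CloseHandle (file);
        return;
    }
    size_t rem = size % part;
    for (size_t i = 0; i < size-rem; i += part)
    {
        WriteFile (file, data+i, static_cast<DWORD> (part), &written, NULL);
    }
    if (rem)
        WriteFile (file, data+size-rem, static_cast<DWORD> (rem), &written, NULL);
    CloseHandle (file);
}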



Jeremy Hughes on 29 Mar 2022
I was playing around with this and found that this is much faster (by a factor of 3 on my machine):
fwrite(fid,data(:),"double");
  1 Comment
Vincent Perrot on 29 Mar 2022
Edited: Vincent Perrot on 29 Mar 2022
Thank you.
Sadly, we already tried that; it is how I got the 3.5 GB/s I mentioned in my first message.
I had been playing around with the code and forgot to restore it in my question, sorry about that.
I have edited my question; we are still at 3.5 GB/s instead of roughly 12 GB/s.


Release

R2021a
