Is there a faster way of splitting a cell array into a numeric array while preserving NaN?

I am trying to split a set of data into rows and columns of numeric data while preserving the position of empty fields (as NaN or something similar).
The input data is a cell array in which each row is a string. The columns are delimited by a semicolon ' ; '. The first 8 columns are filled with garbage data, and there are many trailing columns with no data at all. Some rows contain no data at all. The attached data sample is only 4,000 rows long, but my actual datasets have between 50,000 and 300,000 rows.
I have been using the code below, but the str2double step is incredibly slow. Can anyone suggest an alternative approach that cuts down the processing time?
% split data by the ' ; ' separator
data = cellfun(@(x) split(x,';'),data,'UniformOutput',false);
% get rid of preceding garbage data in columns 1 to 8
data = cellfun(@(x) x(9:end),data,'UniformOutput',false);
% convert data into double. This step is incredibly slow
data = cellfun(@str2double,data,'UniformOutput',false);
% example of next operations I wish to perform on this data
data_a = cellfun(@(x) x(1:2:end),data,'UniformOutput',false);
data_b = cellfun(@(x) x(2:2:end),data,'UniformOutput',false);
Thank you in advance for any help.
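For reference, here is what the steps above do to a single row. This is only a minimal sketch using a made-up row (the real garbage fields and values will differ), and the names `row`, `fields`, `values`, `values_a` and `values_b` are just for illustration.

```matlab
% Hypothetical sample row: 8 garbage fields, then data with one empty
% field in the middle and one trailing empty field.
row = 'g1;g2;g3;g4;g5;g6;g7;g8;1.5;20;;40;';

% Split on the semicolon; empty fields come back as empty char vectors.
fields = split(row, ';');

% Drop the 8 leading garbage fields.
fields = fields(9:end);

% str2double returns NaN for every empty (or non-numeric) field,
% so the positions of missing values are preserved.
values = str2double(fields);

% De-interleave into odd and even positions, as in the question.
values_a = values(1:2:end);
values_b = values(2:2:end);
```

Calling str2double once on the whole cell of fields, as here, at least avoids one anonymous-function call per field. Whether that is enough of a speed-up on 300,000 rows would have to be measured.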


Accepted Answer

TADA on 22 Aug 2019
Edited: TADA on 22 Aug 2019
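Try this. It parses each row with textscan instead of str2double, drops the first 8 values, and appends a NaN to every row that ends with a semicolon: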
endsWithSemicolon = cellfun(@(s) endsWith(s, ';'), data);
x = cellfun(@(s) textscan(s, '%f', 'Delimiter', ';', 'EmptyValue', nan(), 'Whitespace', ' *\n\t\r\b'), data);
x = cellfun(@(a) a(9:end), x, 'UniformOutput', false);
x(endsWithSemicolon) = cellfun(@(a) [a; nan], x(endsWithSemicolon), 'UniformOutput', false);
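To see what the textscan call returns, it may help to run it on one row first. This is a minimal sketch, assuming a made-up row whose eight leading fields happen to be numeric (with `%f`, fields that do not parse as numbers would likely stop the scan early). The names `row`, `c` and `v` are illustrative, and the 'Whitespace' option from the code above is left out for brevity.

```matlab
% Hypothetical sample row: 8 numeric "garbage" fields, then the data,
% with one empty field between 20 and 40.
row = '1;2;3;4;5;6;7;8;1.5;20;;40';

% textscan returns a 1x1 cell holding a column vector of doubles;
% empty fields are filled with the value given by 'EmptyValue'.
c = textscan(row, '%f', 'Delimiter', ';', 'EmptyValue', NaN);
v = c{1};

% Drop the 8 leading garbage values.
v = v(9:end);

% Same fix-up as above: a row ending in ';' gets one extra NaN
% for the trailing empty field.
if endsWith(row, ';')
    v = [v; NaN];
end
```

Because textscan wraps its result in a cell, the cellfun call in the answer ends up with a cell array containing one numeric column vector per row, which is why the following lines can index into it with a(9:end).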
