How to Read and plot CSV file and delete infinity values from CSV file

Hi
I want to read a CSV file and then plot it using a formula.
I tried csvread, but it fails because the file contains text (column titles) and infinity values, which MATLAB displays as ∞.
readtable can read the file (c = readtable('sensor.csv');), but I still can't remove the infinity values. When I try c(isnan(c)) = 0; or c(~any(isinf(c))) = 0; I always get the error
“Undefined function 'isinf' or 'isnan' for input arguments of type 'table'”
Can anyone explain how to read a large CSV file (xlsread doesn't work; the file is larger than 1 GB) and remove the infinity values that MATLAB displays as ∞?
A small section of the large file is attached as sensor.csv.
I am using MATLAB R2017b.

Accepted Answer

per isakson on 1 Mar 2020
Edited: per isakson on 1 Mar 2020
MATLAB provides many ways to read CSV files. Here are two variants based on textscan (Read formatted data from text file or string).
sensor.csv contains "???", which I assume is your "infinity value". AFAIK, there is no way to make textscan() or any other reading function convert "???" directly to the numerical value Inf.
In script R1, textscan() converts "???" to NaN, and a second step replaces each NaN (by Inf in the commented-out line, by zero in the active one). That is fine only if the file has no genuinely missing values, since those are read as NaN too and would be replaced as well.
In script R2, the entire file is read into a character array, chr. Next, every '???' is replaced (by 'inf' or by '0'), and finally textscan() parses chr. R2 requires more memory because it holds the whole file in chr.
%% R1
fid = fopen( 'sensor.csv', 'r' );
cac = textscan( fid, '%f%f%f%f%f' ...
, 'HeaderLines',3, 'CollectOutput',true ...
, 'Delimiter',',', 'TreatAsEmpty','???' );
[~] = fclose( fid );
num = cac{1};
% num(isnan(num)) = inf; % why not just keep the NaNs
num(isnan(num)) = 0; % or replace NaNs by zero
display( num )
%% R2
chr = fileread( 'sensor.csv' );
% chr = strrep( chr, '???', 'inf' );
chr = strrep( chr, '???', '0' ); % or replace '???' by '0'
cac = textscan( chr, '%f%f%f%f%f' ...
, 'HeaderLines',3, 'CollectOutput',true ...
, 'Delimiter',',' );
num = cac{1};
display( num )
Both scripts give the same output:
num =
0 Inf 1.8947 0.19107 -0.015871
4e-06 1.9911 1.9105 0.19107 0
8e-06 1.9911 1.8947 0.19107 0
1.2e-05 1.9751 1.9105 0.19107 0
1.6e-05 1.9911 1.9421 0.17514 0
2e-05 1.9751 1.9262 0.17514 0
2.4e-05 1.9911 1.8947 0.17514 0
2.8e-05 1.9911 1.8947 0.17514 0
3.2e-05 Inf 1.9105 0.19107 0.015871
3.6e-05 1.9911 1.9262 0.19107 0
4e-05 1.9911 1.9105 0.19107 0
4.4e-05 1.9911 1.9262 0.19107 0
4.8e-05 1.9911 1.8947 0.19107 0
>>
That output is from the Inf variant, before I realised that you don't want Inf in your matrix.
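Since the real file is larger than 1 GB, it may help to read it in blocks rather than all at once. The following is only a sketch of that idea, assuming the same layout as in R1 (three header lines, five numeric columns, '???' for bad samples). The names demoFile, blockRows and kept are made up for illustration, and the tiny demo file stands in for sensor.csv.

```matlab
% Build a tiny stand-in for sensor.csv (layout assumed from script R1).
demoFile = [tempname '.csv'];
fid = fopen(demoFile, 'w');
fprintf(fid, 'header line 1\nheader line 2\nTime,A,B,C,D\n');
fprintf(fid, '0,???,1.8947,0.19107,-0.015871\n');
fprintf(fid, '4e-06,1.9911,1.9105,0.19107,0\n');
fprintf(fid, '8e-06,1.9911,1.8947,0.19107,0\n');
fclose(fid);

blockRows = 2;                      % rows per block; far larger for a real file
fid = fopen(demoFile, 'r');
for k = 1:3, fgetl(fid); end        % skip the three header lines once
kept = {};
while ~feof(fid)
    % Read at most blockRows rows; '???' becomes NaN.
    cac = textscan(fid, '%f%f%f%f%f', blockRows, ...
        'Delimiter', ',', 'TreatAsEmpty', '???', 'CollectOutput', true);
    blk = cac{1};
    if isempty(blk), break; end
    blk(any(isnan(blk), 2), :) = [];   % drop rows that contained '???'
    kept{end+1} = blk; %#ok<AGROW>
end
fclose(fid);
delete(demoFile);
num = vertcat(kept{:});             % all clean rows in one matrix
```

If even the cleaned data is too large to hold in memory, each block could be processed or plotted inside the loop instead of being collected in kept.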

More Answers (3)

Bhaskar R on 20 Feb 2020
opt = detectImportOptions('sensor.csv', 'NumHeaderLines', 2);
opt.MissingRule = 'fill'; % set as a property; R2017b rejects it as a name-value argument
T = readtable('sensor.csv', opt);
header = {'Time', 'Sensor_A', 'Sensor_B', 'sensor_C', 'Sensor_D'};
T.Properties.VariableNames = header;
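The error in the question comes from calling isnan and isinf on the table itself; they accept numeric arrays, not tables. A minimal sketch of one way around it, assuming every table variable is numeric: extract the contents with T{:,:}, test them with isfinite, and then either drop the bad rows or overwrite the bad entries. The small table below is a stand-in built with array2table, and Tclean, Tzero and badRows are illustrative names.

```matlab
% Small stand-in table; values are for illustration only.
T = array2table([0 NaN 1.8947; 4e-06 1.9911 Inf; 8e-06 1.9911 1.8947], ...
    'VariableNames', {'Time', 'Sensor_A', 'Sensor_B'});

A = T{:,:};                        % numeric contents as an ordinary matrix
badRows = any(~isfinite(A), 2);    % true where a row holds NaN or +/-Inf

Tclean = T(~badRows, :);           % option 1: drop the affected rows

A(~isfinite(A)) = 0;               % option 2: overwrite bad entries with zero
Tzero = array2table(A, 'VariableNames', T.Properties.VariableNames);
```

Dropping rows keeps only trustworthy samples, while zero-filling keeps the time axis complete but puts artificial values into the data.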
  1 Comment
Khalil
Khalil on 20 Feb 2020
The first line gives me this error message:
"Error using detectImportOptions
'MissingRule' is not a recognized parameter. For a list of valid name-value pair arguments, see the documentation for
detectImportOptions."



the cyclist
the cyclist on 20 Feb 2020
Edited: the cyclist on 20 Feb 2020
I find that sometimes with these finicky imports it can be helpful to use the Import Data Tool.
I used that as a basis for the following import code. It is the body of a function with inputs filename and dataLines, so it needs a function header to run:
% If dataLines is not specified, define defaults
if nargin < 2
dataLines = [4, Inf];
end
% Set up the Import Options and import the data
opts = delimitedTextImportOptions("NumVariables", 5);
% Specify range and delimiter
opts.DataLines = dataLines;
opts.Delimiter = ",";
% Specify column names and types
opts.PreserveVariableNames = true;
opts.VariableNames = ["Time", "Sensor A", "Sensor B", "sensor C", "Sensor D"];
opts.VariableTypes = ["double", "double", "double", "double", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Import the data
sensorDataTable = readtable(filename, opts);
% Convert to numeric
sensorDataArray = table2array(sensorDataTable);
  2 Comments
the cyclist
the cyclist on 20 Feb 2020
That script will put NaN where the input file has ???.
You could then do
sensorDataArray(isnan(sensorDataArray)) = Inf;
to convert to Infinity.
Walter Roberson
Walter Roberson on 25 Feb 2020
Putting an explicit function header on it:
function sensorDataArray = ReadSensorTable(filename, dataLines)
% If dataLines is not specified, define defaults
if nargin < 2
dataLines = [4, Inf];
end
% Set up the Import Options and import the data
opts = delimitedTextImportOptions("NumVariables", 5);
% Specify range and delimiter
opts.DataLines = dataLines;
opts.Delimiter = ",";
% Specify column names and types
opts.PreserveVariableNames = true;
opts.VariableNames = ["Time", "Sensor A", "Sensor B", "sensor C", "Sensor D"];
opts.VariableTypes = ["double", "double", "double", "double", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Import the data
sensorDataTable = readtable(filename, opts);
% Convert to numeric
sensorDataArray = table2array(sensorDataTable);
end
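Once the function returns a numeric matrix, plotting is straightforward. The sketch below assumes column 1 is time and columns 2 to 5 are the sensor readings. The matrix data is a stand-in for the result of ReadSensorTable('sensor.csv'), using a few rows from the output shown earlier in this thread with NaN where the file has ???.

```matlab
% Stand-in for: data = ReadSensorTable('sensor.csv');
data = [0       NaN    1.8947 0.19107 -0.015871
        4e-06   1.9911 1.9105 0.19107  0
        8e-06   1.9911 1.8947 0.19107  0
        1.2e-05 1.9751 1.9105 0.19107  0];

fig = figure('Visible', 'off');          % off-screen figure; drop the option to see it
h = plot(data(:,1), data(:,2:end));      % one line per sensor column
xlabel('Time');
ylabel('Sensor value');
legend('Sensor A', 'Sensor B', 'sensor C', 'Sensor D');
```

plot leaves a gap wherever a sample is NaN, so the NaNs do not have to be removed just for plotting.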



Walter Roberson on 21 Feb 2020
rmmissing (https://www.mathworks.com/help/matlab/ref/rmmissing.html) has been available since R2016b. That is, use readtable() and let the ??? be replaced by NaN; rmmissing will then remove every row that contains a NaN, which seems to be what you are asking for.
There is also fillmissing(), which replaces each NaN with a fill value, for example a constant or a value derived from neighbouring entries.
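As a small illustration of the two functions on a numeric matrix (the values are stand-ins, and the variable names are made up):

```matlab
A = [0       NaN    1.8947
     4e-06   1.9911 1.9105
     8e-06   NaN    1.8947
     1.2e-05 1.9751 1.9105];

rowsKept   = rmmissing(A);                    % removes rows 1 and 3
zeroFilled = fillmissing(A, 'constant', 0);   % every NaN becomes 0
prevFilled = fillmissing(A, 'previous');      % NaN takes the previous value in its column
```

With 'previous', a NaN in the first row has nothing before it and stays NaN. Both functions treat NaN as missing but not Inf, so any Inf values would need to be converted to NaN first.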
  4 Comments
Khalil
Khalil on 25 Feb 2020
It gives this error: "You can only call nargin/nargout from within a MATLAB function."
Walter Roberson
Walter Roberson on 25 Feb 2020
I posted a version of his code, above, made into a function.

