Csv file: reading values with comma

Hi dear friends,
I need to work with csv file from this website: "https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv"
Matlab is not able to read data with comma, and so the result is NaN.
I did the following code to import the data from the web, and to fix the semicolon separator:
%% Get the file from the web and save the file in the path
url = 'https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv';
filename = 'WP_50hz.csv';
options = weboptions('Timeout',60);
outfilename = websave(filename,url,options);
% Function to fix the file
[Wind_data] = adjustment_file(filename);
%% Adjustment_file
function [T] = adjustment_file(FILE)
% SECTION 1: Clean up and rewrite the file
% Read in the entire file as text
file = FILE;
txt = fileread(file);
% Remove all null characters
txtClean = regexprep(txt,char(0),'');
% Re-write the file with a new name
% * If "file" contains the full path, you'll need to change
% this line below to rename the file.
fid = fopen(['clean_',file],'w');
fprintf(fid, '%s',txtClean);
fclose(fid);
newfile = ['clean_',file]; % Use the full path whenever possible!
opts = delimitedTextImportOptions("NumVariables", 4);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW"];
opts.VariableTypes = ["datetime", "datetime", "datetime", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, "von", "InputFormat", "HH:mm");
opts = setvaropts(opts, "bis", "InputFormat", "HH:mm");
% Import the data
T = readtable(newfile, opts);
end
How can I handle the comma in the data to be able to read the values?
Thank you very much ; )

Risposte (1)

Jeremy Hughes
Jeremy Hughes il 18 Set 2020
Modificato: Jeremy Hughes il 21 Set 2020
I think you need to set the decimal separator and thousands separator. Also, duration may be the better type for two of the variables.
Also, you can just supply UTF-16 I think instead of needing to do the file cleaning.
opts = delimitedTextImportOptions("NumVariables", 6);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
opts.Encoding = "UTF-16";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW", "OnshoreMW", "OffshoreMW"];
opts.VariableTypes = ["datetime", "duration", "duration", "double", "double", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, ["von", "bis"], "InputFormat", "hh:mm");
opts = setvaropts(opts, 4:6, "DecimalSeparator",",","ThousandsSeparator",".");
% Import the data
T = readtable(file, opts);

4 Commenti

Hi Jeremy,
Thank you for your answer. However when I run this code I get an error:
"Warning: The encoding 'UTF-16' is not supported.
See the documentation for FOPEN.
> In matlab.io.internal.shared.EncodingInput/set.Encoding (line 19)
Error using matlab.io.ImportOptions/set.VariableNames (line 166)
Expected a cell array of valid variable names."
Rik
Rik il 21 Set 2020
Which release are you using?
That is a warning about UTF-16, not an error message. In the release you were using, officially UTF-16 was not supported, but in practice it worked fine.
As of R2020a, MATLAB will automatically detect the UTF-16LE encoding.
Also, the spaces in the variable names won't work in some earlier releases. Editing the steps to make sure those will work.

Accedi per commentare.

Categorie

Scopri di più su Holidays / Seasons in Centro assistenza e File Exchange

Richiesto:

il 18 Set 2020

Modificato:

il 21 Set 2020

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by