Reading data from a website

Salma fathi on 19 Oct 2022
Edited: Salma fathi on 24 Oct 2022
Hello, I am trying to read some data from the following website. I tried to use webread for this, but it returns a 1x1534400 char array, whereas we would like the data to be read into a table. I used the following lines:
url="https://lgdc.uml.edu/common/DIDBGetValues?ursiCode=JI91J&charName=foF2,foF1,foE,foEs,hmF2,hmF1,hmE&DMUF=3000&fromDate=1997%2F01%2F01+00%3A00%3A00&toDate=1997%2F12%2F31+11%3A59%3A00";
%options = weboptions("ContentType", "text");
data = webread(url);
The website looks like the image below. We would like to ignore the first lines of text and start reading from the heading of the table.
If anyone can help with this, thanks in advance.

Answers (1)

Walter Roberson on 19 Oct 2022
After you read the characters, you can pass them to textscan as the first parameter. You should pass either HeaderLines or CommentStyle to skip the header. Use a %{}D format to describe the datetime and %s for the text fields.
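The suggestion above could be sketched roughly as follows. The `sample` text is a made-up stand-in for the char array returned by webread, and the column names `Time`, `CS`, and `foF2` are assumptions for illustration only; the real page layout may differ.

```matlab
% Hypothetical stand-in for the char array returned by webread.
% The real page may have different header lines and more columns.
sample = sprintf([ ...
    '# first header line (skipped)\n' ...
    '# second header line (skipped)\n' ...
    '1997-01-01T00:00:00.000Z 70 5.600\n' ...
    '1997-01-01T00:15:00.000Z 80 5.725\n']);

% %{...}D describes the timestamp; the literals T and Z are quoted.
% CommentStyle '#' drops every line that starts with '#'.
fmt = '%{uuuu-MM-dd''T''HH:mm:ss.SSS''Z''}D %f %f';
C = textscan(sample, fmt, 'CommentStyle', '#');

% Assemble the columns into a table (variable names are assumptions).
T = table(C{1}, C{2}, C{3}, 'VariableNames', {'Time', 'CS', 'foF2'});
disp(T)
```

Because textscan accepts a char array as its first argument, the output of webread can be passed in directly without writing it to a file first.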
  1 Comment
Salma fathi on 24 Oct 2022
Edited: Salma fathi on 24 Oct 2022
Thank you for the help. I tried what you suggested and it worked fine, except that I am facing one issue:
  1. Only the first three columns are read. I think the problem is that the fourth column contains the characters "/_", so I tried to skip those columns using '*', but the fields after that point still come back as empty cells. Attached is an image of the cell array I get.
This is the line that I am using:
C = textscan(data, '%{uuuu-MM-dd''T''HH:mm:ss.SSS''Z}D%f%f%*s%f%*s%f%*s%f%*s%f%*s%f%*s%f%*s' , 'whitespace', ' ', 'CommentStyle' , '#');
Any advice?
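One possible cause, not confirmed in this thread, is that a later numeric field holds text that %f cannot convert, in which case textscan stops and leaves the remaining columns empty. A minimal diagnostic sketch under that assumption reads every field as text and converts afterwards, so unparseable entries become NaN instead of halting the scan. The `sample` rows, the `---` placeholder, and the six-column layout are hypothetical.

```matlab
% Hypothetical rows: time, CS, foF2, flag, foF1, flag.
% '---' is an assumed placeholder for a missing value.
sample = sprintf([ ...
    '# header line (skipped)\n' ...
    '1997-01-01T00:00:00.000Z 70 5.600 /_ 4.100 //\n' ...
    '1997-01-01T00:15:00.000Z 80 5.725 // --- __\n']);

% Read all fields as text so no conversion failure can stop the scan.
nCols = 6;
C = textscan(sample, repmat('%s', 1, nCols), 'CommentStyle', '#');

% Convert afterwards; str2double returns NaN for non-numeric text.
time = datetime(C{1}, 'InputFormat', 'uuuu-MM-dd''T''HH:mm:ss.SSS''Z''');
foF2 = str2double(C{3});
foF1 = str2double(C{5});

T = table(time, foF2, foF1);
disp(T)
```

Inspecting the text columns in `C` also shows exactly which entries fail to parse, which can guide a stricter format string later. Note that this sketch closes the quote around the literal Z in the datetime format.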

Release

R2021b