How to read a table from a URL?

tom3w on 2 Dec 2016
Answered: Toshiaki Takeuchi on 24 Oct 2023
Hi, I need some help. How can I read a table from a URL?
The following sequence allows constructing a URL object, opening a URL connection, setting up a buffered stream reader, and reading lines (line by line):
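url = java.net.URL('http://www.mathworks.com'); % construct the URL object
is = openStream(url); % open the connection and get its input stream
isr = java.io.InputStreamReader(is); % decode the byte stream into characters
br = java.io.BufferedReader(isr); % buffer the reader for line-based access
s = char(readLine(br)); % read one line; can be repeated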
I think BufferedReader is only suited to reading content line by line. When the web page contains a table, this code works but does not return all the elements of the table, namely the rows inside tbody.
Example (page contents as read through the Java reader)
<div class="table-responsive no-padding-top"> : start of table, displayed in MATLAB (e.g. command window)
<table width=... > : table formatting, displayed in MATLAB
<thead>: start of table header, displayed in MATLAB
<tr>: entire row related to table header, displayed in MATLAB
<th> ... </th>: 1st element of header, displayed in MATLAB
<th> ... </th>: 2nd element of header, displayed in MATLAB
...
</tr>, displayed in MATLAB
</thead>: end of header/description of column names, displayed in MATLAB
<tbody>: full table with its contents, "<tbody>" displayed in MATLAB
<tr>: 1st row of table, *not displayed* in MATLAB
<td>...</td>: 1st cell of 1st row, *not displayed* in MATLAB
<td>...</td>: 2nd cell of 1st row, *not displayed* in MATLAB
</tr>: end of 1st row, *not displayed* in MATLAB
<tr>: 2nd row of table, *not displayed* in MATLAB
<td>...</td>: 1st cell of 2nd row, *not displayed* in MATLAB
<td>...</td>: 2nd cell of 2nd row, *not displayed* in MATLAB
</tr>: end of 2nd row, *not displayed* in MATLAB
</tbody>: end of table contents, "</tbody>" displayed in MATLAB
</table>: end of table object, displayed in MATLAB
How can we read the details behind a table body (tbody)?
Many thanks for your support!
Thomas
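
The "can be repeated" step in the question can be sketched as a loop that calls readLine until the stream is exhausted. This is only a minimal sketch: the HTML snippet and the variable names `html` and `lines` are assumptions made so the example runs offline, and a `java.io.StringReader` stands in for the `InputStreamReader` wrapped around `openStream(url)`. It shows how to collect every line the reader delivers. It cannot recover rows that the server does not include in the response.

```matlab
% Hypothetical HTML snippet standing in for a downloaded page (assumption).
html = sprintf('<table>\n<tbody>\n<tr><td>a</td><td>b</td></tr>\n</tbody>\n</table>');

% For a live page, wrap openStream(url) in an InputStreamReader as in the question.
br = java.io.BufferedReader(java.io.StringReader(html));

lines = {};                       % collected lines, one cell per line
line = readLine(br);              % Java null is returned to MATLAB as []
while ~isnumeric(line)            % [] is numeric, so the loop stops at end of stream
    lines{end+1} = char(line);    %#ok<AGROW> keep blank lines as well
    line = readLine(br);
end
close(br);                        % release the reader

disp(numel(lines))                % number of lines read
```

Testing with `~isnumeric(line)` rather than `~isempty(line)` keeps the loop from stopping early at a blank line inside the page.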

Accepted Answer

Sid Jhaveri on 6 Dec 2016
Edited: KSSV on 22 Jun 2023
1 Comment
tom3w on 7 Dec 2016
The webread function works pretty well. Thank you Sid for your suggestions. Kr, Thomas
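
The text of the accepted answer is not shown here, so the following is not Sid's method. It is a minimal sketch of one way the tbody rows might be extracted once webread has returned the page text. The HTML snippet, the column names `Name` and `Value`, and the variable names are assumptions so the example runs offline. For a live page, `html` would come from `html = webread(url)`.

```matlab
% Hypothetical page text; in practice: html = webread(url);
html = ['<table><thead><tr><th>Name</th><th>Value</th></tr></thead>' ...
        '<tbody><tr><td>alpha</td><td>1</td></tr>' ...
        '<tr><td>beta</td><td>2</td></tr></tbody></table>'];

% Isolate the table body, then split it into rows.
body = regexp(html, '<tbody[^>]*>(.*?)</tbody>', 'tokens', 'once');
rows = regexp(body{1}, '<tr[^>]*>(.*?)</tr>', 'tokens');

% Collect the text of each <td> cell, row by row.
cells = cell(numel(rows), 0);
for r = 1:numel(rows)
    td = regexp(rows{r}{1}, '<td[^>]*>(.*?)</td>', 'tokens');
    td = [td{:}];                       % flatten the nested token cells
    cells(r, 1:numel(td)) = td;
end

% Column names are assumed here rather than parsed from <thead>.
T = cell2table(cells, 'VariableNames', {'Name', 'Value'});
disp(T)
```

Regular expressions are fragile on real-world HTML (nested tables, attributes containing `>`), so this suits only simple, regular pages.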

More Answers (1)

Toshiaki Takeuchi on 24 Oct 2023
url = "https://www.mathworks.com/help/matlab/text-files.html";
T = readtable(url,TableSelector="//TABLE[contains(.,'readtable')]", ...
ReadVariableNames=false)
T = 4×2 table
          Var1                              Var2
    ________________    ___________________________________________
    "readtable"         "Create table from file"
    "writetable"        "Write table to file"
    "readtimetable"     "Create timetable from file (Since R2019a)"
    "writetimetable"    "Write timetable to file (Since R2019a)"
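
To illustrate how TableSelector picks one table out of several, here is a small offline sketch under stated assumptions: the two-table HTML document, the temporary file name, and the search text 'beta' are all hypothetical, and the example assumes a MATLAB release in which readtable accepts HTML input (R2021b or later, as far as I know). The XPath expression follows the same pattern as the answer above.

```matlab
% Hypothetical HTML document with two tables (assumption, for illustration).
html = ['<html><body>' ...
        '<table><tr><td>x</td><td>10</td></tr></table>' ...
        '<table><tr><td>alpha</td><td>1</td></tr>' ...
        '<tr><td>beta</td><td>2</td></tr></table>' ...
        '</body></html>'];

% Write it to a temporary .html file so readtable detects the file type.
f = [tempname '.html'];
fid = fopen(f, 'w');
fprintf(fid, '%s', html);
fclose(fid);

% Select the table whose text contains 'beta'; no header row is assumed.
T = readtable(f, TableSelector="//TABLE[contains(.,'beta')]", ...
    ReadVariableNames=false);
delete(f);                          % remove the temporary file
disp(T)
```

Matching on text that appears only in the wanted table avoids depending on the table's position in the page, which may change when the page is edited.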
