HTML file scraping for Fields in a Table

v k on 15 Jun 2020

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/548622-html-file-scraping-for-fields-in-a-table

Commented: v k on 16 Jun 2020

clientData.txt

I am working with a long HTML file whose contents look like the attached text file. Although the structure is simple and repetitive, the large number of characters between the data fields makes it hard to scrape the required data. The goal is a two-column Excel spreadsheet with Name in the first column and Email in the second. How can I get these fields into an xlsx file? Thanks.

Answers (1)

Sean de Wolski on 15 Jun 2020

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/548622-html-file-scraping-for-fields-in-a-table#answer_451512

Start by experimenting with htmlTree in the Text Analytics Toolbox.

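% Parse the attached file into an HTML tree (requires Text Analytics Toolbox)
t = htmlTree(fileread('clientData.txt'))
% Find every table cell (TD element) and extract its text
t.findElement('TD').extractHTMLText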
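
To see what these two calls return, here is a minimal sketch on a small inline sample. The sample HTML and the variable names are hypothetical, since the layout of the attached file is not shown in the thread. It assumes email cells can be recognized by the "@" character, which may not hold for the real data.

```matlab
% Hypothetical sample standing in for fileread('clientData.txt').
html = "<table>" + ...
    "<tr><td>Name</td><td>Alice Example</td></tr>" + ...
    "<tr><td>Email</td><td>alice@example.com</td></tr>" + ...
    "</table>";

tree = htmlTree(html);                      % parse the HTML text
cells = findElement(tree, "td");            % htmlTree array, one element per cell
cellText = strtrim(extractHTMLText(cells)); % string array with the text of each cell

% Assumption: any cell containing "@" is an email address.
isEmail = contains(cellText, "@");
emails = cellText(isEmail);
disp(emails)
```

The result of extractHTMLText is a flat string array in document order, so any pairing of names with emails has to come from the structure of the table itself.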

1 Comment

v k on 16 Jun 2020

How can I extract the "Name" and "Email" fields after this?
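
One possible way to get from the cell text to the spreadsheet, as a sketch only. The thread does not show how the attached file is laid out, so this assumes (hypothetically) that each record is one table row (TR) whose first TD holds the name and whose second TD holds the email. The sample HTML, the variable names, and the output file name clients.xlsx are illustrative and not taken from the original post.

```matlab
% Hypothetical sample standing in for fileread('clientData.txt').
html = "<table>" + ...
    "<tr><td>Alice Example</td><td>alice@example.com</td></tr>" + ...
    "<tr><td>Bob Example</td><td>bob@example.com</td></tr>" + ...
    "</table>";

tree = htmlTree(html);
rows = findElement(tree, "tr");            % one htmlTree element per table row

names  = strings(numel(rows), 1);
emails = strings(numel(rows), 1);
for k = 1:numel(rows)
    cells = findElement(rows(k), "td");    % cells of this row only
    if numel(cells) >= 2                   % skip header or malformed rows
        txt = strtrim(extractHTMLText(cells));
        names(k)  = txt(1);                % assumed: first cell is the name
        emails(k) = txt(2);                % assumed: second cell is the email
    end
end

% Drop skipped rows, then write the two-column spreadsheet.
keep = names ~= "";
T = table(names(keep), emails(keep), 'VariableNames', {'Name', 'Email'});
writetable(T, 'clients.xlsx');
```

If the real file stores the label and the value in the same row (for example a "Name" cell followed by its value), the indexing inside the loop would need to change accordingly.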
