Extracting data from pdf files
Mostra commenti meno recenti
Hi,
I have around 300 pdf files with 19 pages each. I want to extract from each of them a fraction of a table on page 4 in order to build a research data set. Is i possible to do so using matlab? if so,which toolboxes and functions I need. I have matlab 2013a.
Risposta accettata
Più risposte (1)
Christopher Creutzig
il 27 Apr 2021
0 voti
JFTR, since R2017b, extractFileText('filename.pdf','Pages',4) from Text Analytics Toolbox gives you the text on ("physical") page 4 of the PDF, from which you can then extract the parts you need with string operations (extractBetween, regexp, etc.).
Categorie
Scopri di più su Characters and Strings in Centro assistenza e File Exchange
Prodotti
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!