Error while using readPDFFormData to extract data from online pdf files
Ana Egatz-Gomez
on 21 Dec 2023
Commented: Ana Egatz-Gomez
on 6 Mar 2024
Hi, I have two related questions.
First, when I use the command readPDFFormData with an online pdf, I get an error message. How should I write the URL so it works?
Second, is it possible to extract data from a whole list of links to pdf files on a website, rather than one by one? (This website: https://www.ncdoi.gov/consumers/medicare-and-seniors-health-insurance-information-program-shiip/medicare-advantage-medicare-health-plans-part-c#MedicareAdvantageLandscapesbyCounty2024-2398)
Any help will be greatly appreciated.
filename = "https://www.ncdoi.com/SHIIPCurrentYear/Documents/MAL%20by%20County/2024%20MAPD%20Pender%20County.pdf";
data = readPDFFormData(filename);
Accepted Answer
Anton Kogios
on 22 Dec 2023
I was not able to get the URL to work directly either (not sure why since I'm pretty sure reading online images such as PNG/JPG works...), but here is a workaround (it just downloads the PDF first):
filenameOnline = "https://www.ncdoi.com/SHIIPCurrentYear/Documents/MAL%20by%20County/2024%20MAPD%20Pender%20County.pdf";
filenameLocal = "test.pdf"; % can set to custom directory
websave(filenameLocal,filenameOnline);
formData = readPDFFormData(filenameLocal)
fileText = extractFileText(filenameLocal); % since readPDFFormData returns an empty struct, this is just to make sure we can read the PDF
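If you'd rather not leave downloaded copies lying around, the same workaround can be sketched with a temporary file. This is only a rough sketch: the variable names (source, isUrl, localFile) are made up for illustration, and the check for an empty result assumes readPDFFormData gives back an empty struct or a struct with no fields when the PDF has no form fields, which is what I saw with this file.

```matlab
% Sketch: accept either a local path or an http(s) URL (variable names are illustrative).
source = "https://www.ncdoi.com/SHIIPCurrentYear/Documents/MAL%20by%20County/2024%20MAPD%20Pender%20County.pdf";
isUrl = startsWith(source, ["http://", "https://"]);

if isUrl
    localFile = string(tempname) + ".pdf";   % unique file name in the temp folder
else
    localFile = source;                      % already a local file, use it directly
end

try
    if isUrl
        websave(localFile, source);          % download first, as in the workaround above
    end
    formData = readPDFFormData(localFile);
    % Assumption: "no form fields" shows up as an empty struct or a struct without fields
    if isempty(formData) || isempty(fieldnames(formData))
        disp("No form fields found; extractFileText may be the better tool for this PDF.")
    end
catch err
    warning("Could not read %s: %s", source, err.message);
end

% Remove the temporary copy if one was created
if isUrl && isfile(localFile)
    delete(localFile);
end
```

The try/catch is there so that a failed download or an unreadable PDF produces a warning instead of stopping the script.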
As for your second question, I can't seem to access the PDFs on the website you mentioned (I think it is because I'm in a different country), but you should be able to just use a for loop. You can also look into using sprintf if the URLs have a repetitive naming system, which I've also demonstrated. Something like:
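filenamesOnline = ["url1.pdf";
"url2.pdf";
"url3.pdf"];
formData = cell(numel(filenamesOnline),1); % one result per PDF
for i = 1:numel(filenamesOnline)
    filenameLocal = sprintf('test%i.pdf',i); % test1.pdf, test2.pdf, ...
    websave(filenameLocal,filenamesOnline(i)); % download the i-th PDF
    formData{i} = readPDFFormData(filenameLocal); % keep each file's form data
end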
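If typing out the URLs by hand is the painful part, one option might be to pull the PDF links out of the page source with regexp. I couldn't open the page myself, so the snippet below runs on a small made-up piece of HTML (the example.com links are placeholders, not the real ones). It assumes the links appear as plain href="..." attributes with absolute URLs; if the page builds its links with scripts, or uses relative paths, this would need adjusting.

```matlab
% In practice you would read the page source instead, e.g.:
%   html = webread(pageUrl);   % pageUrl = address of the page listing the PDFs
% A hand-written sample (hypothetical markup) stands in for the page here.
html = ['<a href="https://example.com/docs/A%20County.pdf">A</a>' ...
        '<a href="/about.html">About</a>' ...
        '<a href="https://example.com/docs/B%20County.PDF">B</a>'];

% Capture whatever sits inside href="..." when it ends in .pdf (case-insensitive)
tokens = regexp(html, 'href="([^"]+\.pdf)"', 'tokens', 'ignorecase');

% Flatten the nested cells into a column string array and drop repeated links
filenamesOnline = string([tokens{:}]).';
filenamesOnline = unique(filenamesOnline, 'stable');

disp(filenamesOnline)
```

The resulting filenamesOnline can then go straight into the for loop above in place of the hand-typed list.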
I hope this helps and you are able to get it to work for you!