Extract defined ranges of pages from a PDF to create multiple PDF files

Let's say I have a 1000-page PDF file called ALL.pdf.
I want to create multiple PDF files from ALL.pdf, following the page ranges defined in an Excel file, e.g.:
File name   First page   Last page
John        1            10
Luke        11           15
Matt        16           22
...         ...          ...
Adam        996          1000
So MATLAB should extract the first 10 pages from ALL.pdf and generate JOHN.pdf, use pages 11 to 15 to generate LUKE.pdf, and so on up to ADAM.pdf with the last 5 pages of ALL.pdf.
What I don't know is which commands are useful for opening and manipulating PDF files (if I had to extract defined sheets from an .xlsx file, there would be no problem at all).
I've searched everywhere but did not find anything on how to do it.
Any advice?
Thanks

Accepted Answer

Rahul on 7 Nov 2024 at 7:48
To extract the text from particular page ranges of a large PDF file and save each range as a separate PDF file, you can use the 'extractFileText' function (Text Analytics Toolbox). Its 'Pages' name-value argument accepts the required page numbers as a vector. Here is an example:
pages = 1:10;
str = extractFileText("ALL.pdf", 'Pages', pages);
% This would store the text content of the first 10 pages in 'str'
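Then you can use the 'Document' and 'Paragraph' classes (MATLAB Report Generator, 'mlreportgen.dom') to write 'str' to a new PDF file such as 'JOHN.pdf', as mentioned in the question. Note that this writes only the extracted text, not the original page layout or images. Here is an example: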
import mlreportgen.dom.*;
doc = Document('JOHN', 'pdf');
% Adding the 'str' as a paragraph to the document
p = Paragraph(str);
append(doc, p);
close(doc);
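To cover every row of the Excel file rather than a single range, the two steps can be wrapped in a loop. The following is only a minimal sketch: the function name 'splitPdfText' is hypothetical, and it assumes the spreadsheet headers are imported by 'readtable' as the variables FileName, FirstPage and LastPage (adjust these to match your file). It works on extracted text only.

```matlab
function splitPdfText(pdfFile, rangesFile)
% splitPdfText  Write the text of each page range of a PDF to its own PDF.
% Hypothetical helper. Assumes rangesFile has one row per output file and
% that readtable imports its columns as FileName, FirstPage and LastPage.
import mlreportgen.dom.*          % MATLAB Report Generator

ranges = readtable(rangesFile);

for k = 1:height(ranges)
    % Page numbers for this output file, e.g. 1:10 for John
    pages = ranges.FirstPage(k):ranges.LastPage(k);

    % Text of the selected pages (Text Analytics Toolbox)
    str = extractFileText(pdfFile, 'Pages', pages);

    % Upper-case output name, e.g. John -> JOHN (written as JOHN.pdf)
    outName = upper(char(ranges.FileName(k)));

    % One paragraph holding the extracted text, as in the example above
    doc = Document(outName, 'pdf');
    append(doc, Paragraph(str));
    close(doc);
end
end
```

Saved as splitPdfText.m, it could be called as splitPdfText("ALL.pdf", "ranges.xlsx"), where 'ranges.xlsx' is a placeholder name for the Excel file described in the question.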
Refer to the MathWorks documentation for 'extractFileText', 'Document' and 'Paragraph' to learn more about these functions.
Hope this helps! Thanks.
