msd 5 Posted August 16, 2023 Hello Developers, I have a specific situation with PDF documents. I receive a single PDF file containing more than 2,500 invoices, one invoice per page, and I need to split it into 2,500 separate documents, one for each buyer. I also need to read two positions on each page to get information about the buyer. I found a lot of documentation, but no samples like this. If someone has an idea or experience with this kind of PDF problem, please post it here. Thanks in advance...
misc_bb 7 Posted August 17, 2023 I have been using PDFtk (https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) in our Delphi apps to merge documents. I haven't tried splitting, but PDFtk has a splitting function as well, so you may want to give it a try. In our case we use the command-line version and simply call ShellExecute from our app. Works just fine.
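To illustrate the splitting side, here is a minimal sketch of the command line involved. It is written in Python only to keep it short; the same arguments could be passed to ShellExecute from Delphi. `burst` is PDFtk's operation for splitting a document into single pages, and `output` accepts a printf-style file name pattern. The file names and the helper names `build_burst_command` and `burst` are made up for this example.

```python
import subprocess
from pathlib import Path


def build_burst_command(source_pdf, out_dir, pattern="page_%04d.pdf"):
    """Return the pdftk argument list that splits source_pdf into one file per page."""
    # pdftk fills the printf-style pattern with the page number (page_0001.pdf, ...).
    return ["pdftk", str(source_pdf), "burst", "output", str(Path(out_dir) / pattern)]


def burst(source_pdf, out_dir):
    """Run pdftk; assumes the pdftk executable is installed and on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_burst_command(source_pdf, out_dir), check=True)
```

Calling `burst("invoices.pdf", "out")` would then be expected to leave one PDF per invoice page in the `out` folder, which can be renamed once the buyer information has been read from each page.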
Alexander Sviridenkov 354 Posted August 17, 2023 HTML Office Library can extract the information from each page as HTML or plain text. The HTML from each page can also be saved back to a separate PDF.
Fr0sT.Brutal 899 Posted August 21, 2023 PDFium lib or an external console tool
Brian Evans 104 Posted August 22, 2023 I sometimes use pdftotext plus parsing with grep strings to extract specific data from PDFs with regular formatting. It has worked well in a variety of situations. See pdftotext on Wikipedia.
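As a rough sketch of that approach applied to the invoice file: pdftotext ends each page with a form feed character by default, so its output can be cut into pages and each page searched for the two values. The sketch is in Python for brevity. The labels `Buyer:` and `Invoice No:` are purely hypothetical stand-ins for whatever text actually surrounds the two positions on the real invoices, and all function names are made up for this example.

```python
import re
import subprocess


def split_pages(text):
    """Split pdftotext output into pages; each page ends with a form feed."""
    pages = text.split("\f")
    # Drop the empty piece left after the final form feed.
    if pages and not pages[-1].strip():
        pages.pop()
    return pages


def pdf_to_page_texts(pdf_path):
    """Run pdftotext and return one string per page (needs pdftotext on PATH)."""
    # "-layout" keeps the physical layout; "-" sends the text to stdout.
    result = subprocess.run(
        ["pdftotext", "-layout", str(pdf_path), "-"],
        check=True, capture_output=True, text=True,
    )
    return split_pages(result.stdout)


# Hypothetical labels: adjust both patterns to the real invoice layout.
BUYER_RE = re.compile(r"Buyer:\s*(.+)")
INVOICE_RE = re.compile(r"Invoice No:\s*(\S+)")


def extract_fields(page_text):
    """Return (buyer, invoice_no) for one page; None for a field not found."""
    buyer = BUYER_RE.search(page_text)
    invoice = INVOICE_RE.search(page_text)
    return (
        buyer.group(1).strip() if buyer else None,
        invoice.group(1) if invoice else None,
    )
```

Because the page index in the returned list matches the page order of the PDF, the extracted values can be paired with the single-page files produced by whichever splitting tool is used.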
Alexander Halser 21 Posted August 25, 2023 PDFium lib does all you need. We use it in an internal application that deals with PDF pages: it adds or removes pages, splits pages into several smaller ones, and re-combines smaller pages into one printer page. PDFium can extract text as well.