Author Topic: Reading pdf into a TMemo  (Read 534 times)

Nicole

Reading pdf into a TMemo
« on: August 17, 2022, 10:03:47 am »
The online package manager showed me this when I typed in "pdf":
https://wiki.lazarus.freepascal.org/PowerPDF

Unfortunately, the homepage says this:
"The output is a PDF version 1.2 document. PowerPDF is not made for reading existing PDF documents. "

So my question:
How can I load the text of a PDF file into a TMemo?

dje

Re: Reading pdf into a TMemo
« Reply #1 on: August 17, 2022, 10:13:31 am »
You can't. A TMemo only holds plain text.
You would need to do one of the following:
1) Convert the PDF to TXT via a command line tool.
2) Find a PDF reader component, e.g. https://github.com/dinmil/PDFPreview
3) Open the PDF in an installed reader, e.g. Adobe Reader or qpdfview (Linux).
« Last Edit: August 17, 2022, 10:17:27 am by derek.john.evans »

paweld

Re: Reading pdf into a TMemo
« Reply #2 on: August 17, 2022, 10:44:16 am »
Best regards
paweld

Nicole

Re: Reading pdf into a TMemo
« Reply #3 on: August 17, 2022, 11:10:14 am »
Thank you for your answers.

I checked the links and cannot find anything about READING a PDF, only about creating one.
Did I miss it?

Yes, I know that there are command line tools for converting. I want to avoid third-party software, as we are programmers.

Such a command line tool would be, e.g., the VERY OLD version of pdf-shaper (if there is no other solution).
I have my reasons for wanting my own code. One of them is that the newer versions of it give me trouble.
So if anybody has or knows of a unit for this, that would be great.


wp

Re: Reading pdf into a TMemo
« Reply #4 on: August 17, 2022, 11:18:03 am »
Download pdftotext from https://www.xpdfreader.com/download.html ("Download the Xpdf tools"). Run the downloaded exe; it simply extracts the contained files and does not "install" anything on Windows. In the created folder you will find pdftotext.exe, which you can use by itself, without the other files.

I have never used it before, but this syntax seems to create filename.txt from filename.pdf (maybe it can be optimized...):
Code:
pdftotext -simple filename.pdf

In your Lazarus program, execute pdftotext by means of the RunCommand function (or TProcess for more control, see https://wiki.freepascal.org/Executing_External_Programs#Reading_large_output).

Finally load the output file (filename.txt) into the memo.

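Code: Pascal
// RunCommand is declared in the "process" unit; add it to the uses clause.
procedure TForm1.Button1Click(Sender: TObject);
var
  s: String;
begin
  // pdftotext writes its result next to the PDF, as <name>.txt
  if RunCommand(Application.Location + 'pdftotext.exe', ['-simple', FilenameEdit1.FileName], s) then
    Memo1.Lines.LoadFromFile(ChangeFileExt(FilenameEdit1.FileName, '.txt'))
  else
    ShowMessage('pdftotext could not be run or reported an error.');
end;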
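
If the intermediate .txt file is not wanted, the same idea can probably be done without it. The following is only a sketch, not tested against every Xpdf version: it assumes that pdftotext accepts "-" as the output file name to write to stdout and "-enc UTF-8" to select the output encoding, as described in the Xpdf documentation. The names ExtractPdfText, Content and Extracted are made up for this example.

```pascal
program PdfTextDemo;
{$mode objfpc}{$H+}

uses
  SysUtils, Process;

// Hypothetical helper: runs pdftotext and captures its stdout in Content.
// Returns False if the PDF is missing, the tool cannot be started,
// or the tool exits with a non-zero status.
function ExtractPdfText(const ToolPath, PdfFile: String; out Content: String): Boolean;
begin
  Content := '';
  Result := FileExists(PdfFile) and
    // Assumption: '-' as output name sends the text to stdout,
    // '-enc UTF-8' makes the output match the LCL's UTF-8 strings.
    RunCommand(ToolPath, ['-simple', '-enc', 'UTF-8', PdfFile, '-'], Content);
end;

var
  Extracted: String;
begin
  if ParamCount < 2 then
  begin
    WriteLn('Usage: PdfTextDemo <path-to-pdftotext> <file.pdf>');
    Exit;
  end;
  if ExtractPdfText(ParamStr(1), ParamStr(2), Extracted) then
    WriteLn(Extracted)
  else
    WriteLn('Could not extract text from ', ParamStr(2));
end.
```

In a form, the call would replace the two lines in the button handler, followed by something like Memo1.Text := Extracted. For very large PDFs, TProcess with a read loop (see the wiki link above) is likely the safer choice.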
« Last Edit: August 17, 2022, 11:24:39 am by wp »

Zvoni

Re: Reading pdf into a TMemo
« Reply #5 on: August 17, 2022, 12:29:35 pm »
You can also use PDFium: https://forum.lazarus.freepascal.org/index.php/topic,58056.msg432524.html#msg432524
The demo project for PDFium actually READS a PDF into a Memo.
I tested it with one of my own PDFs.
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

paweld

Re: Reading pdf into a TMemo
« Reply #6 on: August 17, 2022, 12:33:07 pm »
I modified the project so that it does not show the PDF, but loads the contents of the selected page into a TMemo. You must download the library from here: https://github.com/pvginkel/PdfiumViewer/tree/releases/2.12.0.0/Libraries/Pdfium
Best regards
paweld

Zvoni

Re: Reading pdf into a TMemo
« Reply #7 on: August 17, 2022, 12:37:42 pm »
I forgot to mention: the PDFium approach obviously only works with documents that were converted to PDF and therefore contain real text.
A PDF from a scanner, on the other hand...

paweld

Re: Reading pdf into a TMemo
« Reply #8 on: August 17, 2022, 12:59:58 pm »
In the case of a PDF from a scanner, each page is just an embedded image, so you need to use OCR.
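
A rough way to notice such a file before reaching for OCR is to check whether the extraction returned anything besides whitespace. This is only a heuristic sketch: it assumes that a text extractor returns nothing but blanks, line breaks and page separators (form feeds) for image-only pages. HasExtractableText is a made-up name.

```pascal
program ScanCheckDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: True if the extracted text contains anything
// other than spaces and control characters (line breaks, form feeds).
// SysUtils.Trim strips both kinds from the ends of the string.
function HasExtractableText(const ExtractedText: String): Boolean;
begin
  Result := Trim(ExtractedText) <> '';
end;

begin
  // A normal text PDF yields real characters.
  WriteLn(HasExtractableText('Hello'#10'World'));
  // An image-only PDF typically yields only page breaks and newlines.
  WriteLn(HasExtractableText(#12#10#12));
end.
```

If the check fails, the memo would stay empty anyway, so this is the point where the program could tell the user that the PDF probably needs OCR.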
Best regards
paweld

Nicole

[solved] Re: Reading pdf into a TMemo
« Reply #9 on: August 17, 2022, 08:16:55 pm »
Thank you so much for the informative answers.  :-*
And my special thanks to WP, whose project just loaded my PDF into my TMemo.

A hint for whoever is lucky enough to find this thread in the future: please be patient, the process of downloading takes its time.

 
