Author Topic: Reading pdf into a TMemo  (Read 1172 times)

Nicole

  • Hero Member
  • *****
  • Posts: 970
Reading pdf into a TMemo
« on: August 17, 2022, 10:03:47 am »
The online package manager showed me this when I keyed in "pdf":
https://wiki.lazarus.freepascal.org/PowerPDF

Unfortunately, the homepage says this:
"The output is a PDF version 1.2 document. PowerPDF is not made for reading existing PDF documents. "

So my question:
How can I read a PDF file into a TMemo?

dje

  • Full Member
  • ***
  • Posts: 134
Re: Reading pdf into a TMemo
« Reply #1 on: August 17, 2022, 10:13:31 am »
You can't. TMemos are only for text.
You would need to do one of the following:
1) Convert the PDF to TXT via a command-line tool
2) Find a PDF reader component, e.g. https://github.com/dinmil/PDFPreview
3) Open the PDF in an installed reader, e.g. Adobe Reader or qpdfview (Linux)
« Last Edit: August 17, 2022, 10:17:27 am by derek.john.evans »

paweld

  • Hero Member
  • *****
  • Posts: 966
Re: Reading pdf into a TMemo
« Reply #2 on: August 17, 2022, 10:44:16 am »
Best regards / Pozdrawiam
paweld

Nicole

  • Hero Member
  • *****
  • Posts: 970
Re: Reading pdf into a TMemo
« Reply #3 on: August 17, 2022, 11:10:14 am »
Thank you for your answers.

I checked the links but could not find anything about READING PDFs, only creating them.
Did I miss it?

Yes, I know that there are command-line tools for converting. I want to avoid third-party software, as we are programmers.

Such a command-line tool would be, e.g., the VERY OLD version of pdf-shaper (if there is no other solution).
I have my reasons for wanting my own code. One of them is that the new versions of it give me trouble.
So if anybody has or knows of a unit for this, that would be great.


wp

  • Hero Member
  • *****
  • Posts: 11830
Re: Reading pdf into a TMemo
« Reply #4 on: August 17, 2022, 11:18:03 am »
Download pdftotext (from https://www.xpdfreader.com/download.html, "Download the Xpdf tools"). Run the downloaded exe, which simply extracts the contained files; it does not "install" anything in Windows. In the created folder you will find the tool pdftotext.exe, which you can use by itself, without the other files.

I have never used it before, but this syntax appears to create filename.txt from filename.pdf (maybe it can be optimized...):
Code:
pdftotext -simple filename.pdf

In your Lazarus program, execute pdftotext by means of the RunCommand function (or TProcess for more control - see https://wiki.freepascal.org/Executing_External_Programs#Reading_large_output).

Finally load the output file (filename.txt) into the memo.

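Code: Pascal
// Requires the Process unit in the uses clause (for RunCommand).
procedure TForm1.Button1Click(Sender: TObject);
var
  s: String;
begin
  // Convert the PDF; pdftotext writes <name>.txt next to the input file.
  if RunCommand(Application.Location + 'pdftotext.exe',
       ['-simple', FilenameEdit1.FileName], s) then
    Memo1.Lines.LoadFromFile(ChangeFileExt(FilenameEdit1.FileName, '.txt'))
  else
    ShowMessage('pdftotext failed: ' + s);
end;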
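
A possible variation on the above, as a minimal sketch only: the Xpdf documentation says that passing "-" as the output file name makes pdftotext write to stdout, so RunCommand can capture the text directly and no intermediate .txt file is needed. The helper name ExtractPdfText and the program name are hypothetical, and the "-enc UTF-8" option is an assumption made so that the output matches the UTF-8 strings Lazarus uses; check both against the pdftotext version you downloaded.

```pascal
program PdfTextSketch;
{$mode objfpc}{$H+}

uses
  SysUtils, Process;

// Hypothetical helper (not from the thread): runs pdftotext and returns the
// extracted text through Content instead of writing a .txt file.
function ExtractPdfText(const ToolPath, PdfFile: String;
  out Content: String): Boolean;
begin
  // '-' as the output name asks pdftotext to write to stdout;
  // '-enc UTF-8' is assumed to match the UTF-8 strings used by the LCL.
  Result := RunCommand(ToolPath,
    ['-simple', '-enc', 'UTF-8', PdfFile, '-'], Content);
end;

var
  Extracted: String;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: PdfTextSketch <file.pdf>');
    Halt(1);
  end;
  // Assumes pdftotext.exe sits next to this program, as in the post above.
  if ExtractPdfText(ExtractFilePath(ParamStr(0)) + 'pdftotext.exe',
       ParamStr(1), Extracted) then
    WriteLn(Extracted)
  else
  begin
    WriteLn('pdftotext failed or was not found.');
    Halt(2);
  end;
end.
```

In a form, the same helper could be called from a button handler and the result assigned with Memo1.Lines.Text := Extracted.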
« Last Edit: August 17, 2022, 11:24:39 am by wp »

Zvoni

  • Hero Member
  • *****
  • Posts: 2300
Re: Reading pdf into a TMemo
« Reply #5 on: August 17, 2022, 12:29:35 pm »
You can also use PDFium: https://forum.lazarus.freepascal.org/index.php/topic,58056.msg432524.html#msg432524
The demo project for PDFium actually READS a PDF into a Memo...
I tested it with one of my own PDFs.
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

paweld

  • Hero Member
  • *****
  • Posts: 966
Re: Reading pdf into a TMemo
« Reply #6 on: August 17, 2022, 12:33:07 pm »
I modified the project so that it does not show the PDF, but loads the contents of the selected page into a TMemo. You must download the library from here: https://github.com/pvginkel/PdfiumViewer/tree/releases/2.12.0.0/Libraries/Pdfium

Zvoni

  • Hero Member
  • *****
  • Posts: 2300
Re: Reading pdf into a TMemo
« Reply #7 on: August 17, 2022, 12:37:42 pm »
I forgot: this obviously only works with documents that were converted to PDF.
A PDF from a scanner, on the other hand...

paweld

  • Hero Member
  • *****
  • Posts: 966
Re: Reading pdf into a TMemo
« Reply #8 on: August 17, 2022, 12:59:58 pm »
In the case of a PDF from a scanner, each page contains an embedded image, so you need to use OCR.

Nicole

  • Hero Member
  • *****
  • Posts: 970
[solved] Re: Reading pdf into a TMemo
« Reply #9 on: August 17, 2022, 08:16:55 pm »
Thank you so much for the informative answers.  :-*
And my special thanks to wp, whose project just loaded my PDF into my TMemo.

A hint for whoever is lucky enough to find this thread in the future: please be patient, the download takes its time.
