Author Topic: [Solved] How to search for value in pdf?  (Read 3399 times)

loaded

  • Hero Member
  • *****
  • Posts: 569
[Solved] How to search for value in pdf?
« on: January 27, 2022, 05:19:56 pm »
Hi All,
I will need to work on this subject soon.
For example, can I read the Id value in the attached PDF using a stream?
I would be very pleased if friends with experience and knowledge in this area could share their ideas and suggestions. Respects.
« Last Edit: January 28, 2022, 02:07:17 pm by loaded »
If Ide=Lazarus 2.0.10 32 Bit and Os=Win 10 Home 64 Bit then Get up and do something useful! Because God is the helper of those who start again;

winni

  • Hero Member
  • *****
  • Posts: 3053
Re: How to search for value in pdf?
« Reply #1 on: January 27, 2022, 06:48:24 pm »
Hi!

There is the linux tool pdftotext with this simple syntax:

Code: Bash
pdftotext some.pdf some.txt

It comes from the Linux package Xpdf, which has also been ported to Windows.

The output for your sample is this:

Code: Text
Çalışma Sayfası1

Id

9960515081

Sayfa 1

L

Winni
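A minimal sketch of how the Id value could be pulled out of that output, assuming `pdftotext` is on the PATH and that the value always sits on the first non-empty line after a line reading `Id`, as in the sample above. The helper names `run_pdftotext` and `value_after_label` are made up for this illustration; passing `-` as the output file sends the text to stdout.

```python
import subprocess


def run_pdftotext(pdf_path: str) -> str:
    """Run pdftotext and return the extracted text ("-" writes to stdout)."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def value_after_label(text: str, label: str = "Id"):
    """Return the first non-empty line that follows a line equal to `label`."""
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        if line == label:
            for candidate in lines[index + 1:]:
                if candidate:
                    return candidate
    return None  # label not found, or nothing after it


# Text copied from the sample output above (assumed layout, not a general rule).
sample_output = "Çalışma Sayfası1\n\nId\n\n9960515081\n\nSayfa 1\n"
found = value_after_label(sample_output)
print(found)
```

With a real file the call would be `value_after_label(run_pdftotext("some.pdf"))`. From Lazarus the same tool could be started as an external process and its output parsed the same way.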

loaded

  • Hero Member
  • *****
  • Posts: 569
Re: How to search for value in pdf?
« Reply #2 on: January 27, 2022, 07:03:40 pm »
winni my esteemed brother,
Thank you very much for taking your precious time to reply.
I put your suggestion at the top of my to-try list.
If I can't solve it by reading binary directly or if I can't find another solution, I will definitely use it.

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1115
Re: How to search for value in pdf?
« Reply #3 on: January 28, 2022, 01:17:41 pm »
Hello,
It is also possible with Python4Lazarus, using the PyMuPDF module (which includes the MuPDF library).
Python script to execute from Lazarus:
Code: Python
import fitz  # PyMuPDF

fname = 'd:/temp/sample.pdf'  # path of the PDF document to read
doc = fitz.open(fname)  # open the document
for page in doc:  # iterate over the document pages
    print('==   Text  ===')
    text = page.get_text("text")  # plain text of the page
    print(text)
    print('==   HTML  ===')
    html = page.get_text("html")  # the same page as HTML
    print(html)
Python's print output is redirected to a TMemo.

Result in attachment.

Friendly, J.P
Jurassic computer : Sinclair ZX81 - Zilog Z80A à 3,25 MHz - RAM 1 Ko - ROM 8 Ko

loaded

  • Hero Member
  • *****
  • Posts: 569
Re: How to search for value in pdf?
« Reply #4 on: January 28, 2022, 01:35:36 pm »
Thank you very much, Jurassic Pork, for taking the time to reply.
Yet another research topic has come up for me (one starts before the other ends).
I will consider your solution.
In the meantime I did not sit idle: while researching,
I learned that the data in the PDF is compressed with the FlateDecode filter. I'll do some more research; it looks solvable.
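To illustrate what FlateDecode means: it is zlib/deflate compression, so a stream can be inflated with any zlib implementation. The sketch below is only an assumption-laden toy. It builds a hand-written fake stream object (not taken from the attached PDF), inflates it, and picks out literal strings shown with the `Tj` operator. Real PDFs often use `TJ` arrays, hex strings or subset font encodings, so this approach does not work in general; the helper names `inflate_streams` and `shown_strings` are made up for this illustration.

```python
import re
import zlib

# Hypothetical page content, written by hand for illustration only.
content = b"BT /F1 12 Tf 72 720 Td (Id) Tj 0 -20 Td (9960515081) Tj ET"
compressed = zlib.compress(content)

# A tiny fake stream object, laid out the way a PDF producer would store it.
fake_object = (
    b"5 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(compressed)
    + compressed
    + b"\nendstream\nendobj\n"
)


def inflate_streams(data: bytes):
    """Return the inflated bytes of every stream that zlib can decompress."""
    results = []
    # The lookbehind skips the "stream" inside the "endstream" keyword.
    for match in re.finditer(rb"(?<!end)stream\r?\n", data):
        inflater = zlib.decompressobj()  # tolerates trailing bytes after the data
        try:
            results.append(inflater.decompress(data[match.end():]))
        except zlib.error:
            continue  # not a Flate stream (or a false match), skip it
    return results


def shown_strings(content_stream: bytes):
    """Return literal strings that are drawn with the Tj operator."""
    found = re.findall(rb"\((.*?)\)\s*Tj", content_stream)
    return [item.decode("latin-1") for item in found]


strings = [s for stream in inflate_streams(fake_object) for s in shown_strings(stream)]
value = strings[strings.index("Id") + 1]
print(strings, value)
```

In Free Pascal the inflate step could presumably be done with the decompression stream classes that ship with FPC, but the caveats about text encoding remain, which is a good reason to prefer an existing extraction tool.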

Thank you very much, Jurassic Pork and winni; your suggestions gave me the idea of not reinventing the wheel.
I solved my problem with Ghostscript.
By the way, Ghostscript can also easily split, merge and rotate PDF files. Respects.
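The exact Ghostscript invocation is not given in this post. One plausible way, sketched here as an assumption, is Ghostscript's `txtwrite` output device, which writes the text of a PDF to a file. The function names and the file names `sample.pdf` / `sample.txt` are hypothetical; on Windows the console executable is usually `gswin64c` or `gswin32c` rather than `gs`.

```python
import shutil
import subprocess


def ghostscript_text_command(pdf_path, txt_path, executable="gs"):
    """Build the argument list for text extraction with the txtwrite device."""
    # -q silences the banner, -o sets the output file and runs in batch mode.
    return [executable, "-q", "-sDEVICE=txtwrite", "-o", txt_path, pdf_path]


def extract_text(pdf_path, txt_path):
    """Run Ghostscript if one of its usual executables is on the PATH."""
    executable = (
        shutil.which("gswin64c") or shutil.which("gswin32c") or shutil.which("gs")
    )
    if executable is None:
        raise FileNotFoundError("Ghostscript executable not found on PATH")
    subprocess.run(
        ghostscript_text_command(pdf_path, txt_path, executable), check=True
    )


command = ghostscript_text_command("sample.pdf", "sample.txt")
print(" ".join(command))
```

The resulting text file can then be searched for the Id value like any other text file.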


« Last Edit: January 28, 2022, 02:07:00 pm by loaded »

paweld

  • Sr. Member
  • ****
  • Posts: 369
Re: [Solved] How to search for value in pdf?
« Reply #5 on: January 28, 2022, 03:00:54 pm »
You can also use PDFium - example attached. Download libraries from: https://github.com/pvginkel/PdfiumViewer/tree/releases/2.12.0.0/Libraries/Pdfium
Best regards
paweld

loaded

  • Hero Member
  • *****
  • Posts: 569
Re: [Solved] How to search for value in pdf?
« Reply #6 on: January 28, 2022, 04:42:10 pm »
Dear paweld, thank you very much for the reply.
Yes, I will try your suggestion at a convenient time.

pcurtis

  • Hero Member
  • *****
  • Posts: 939
Re: [Solved] How to search for value in pdf?
« Reply #7 on: January 28, 2022, 05:54:07 pm »
@paweld Perfect

Adobe / Foxit / ... can kiss my shiny metal ...
A 7 MB reader, not 500 MB
« Last Edit: January 28, 2022, 06:08:55 pm by pcurtis »
Windows 10 20H2
Laz 2.2.0
FPC 3.2.2

alaa123456789

  • Full Member
  • ***
  • Posts: 204
Re: [Solved] How to search for value in pdf?
« Reply #8 on: February 15, 2022, 06:09:08 pm »
This PDFium viewer is not written in Lazarus; we need a sample in Lazarus, if possible.

thanks

pcurtis

  • Hero Member
  • *****
  • Posts: 939
Re: [Solved] How to search for value in pdf?
« Reply #9 on: February 15, 2022, 06:50:35 pm »
paweld supplied a sample project

alaa123456789

  • Full Member
  • ***
  • Posts: 204
Re: [Solved] How to search for value in pdf?
« Reply #10 on: February 15, 2022, 07:12:12 pm »
Well done, thank you :) :) :)
