
Author Topic: [Solved] How to search for value in pdf?  (Read 5333 times)

loaded

  • Hero Member
  • *****
  • Posts: 824
[Solved] How to search for value in pdf?
« on: January 27, 2022, 05:19:56 pm »
Hi all,
I will soon have to look into this subject.
For example, can I read the Id value in the attached PDF by using a stream?
I would be very grateful if anyone with experience and knowledge could share ideas and suggestions. Respects.
« Last Edit: January 28, 2022, 02:07:17 pm by loaded »
Check out  loaded on Strava
https://www.strava.com/athletes/109391137

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: How to search for value in pdf?
« Reply #1 on: January 27, 2022, 06:48:24 pm »
Hi!

There is the Linux tool pdftotext, which has this simple syntax:

Code: Bash
pdftotext some.pdf some.txt

It comes from the Linux package Xpdf, which has also been ported to Windows.

The output for your sample is this:

Code: Text
Çalışma Sayfası1

Id

9960515081

Sayfa 1

L
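As a rough sketch of how the Id value could be picked out of that text output: the helper names `pdf_to_text` and `find_value_after_label` are made up for illustration, and the code assumes the value sits on the first non-empty line after the `Id` label, as in the sample above. It also assumes `pdftotext` is on the PATH; passing `-` as the output file sends the text to stdout.

```python
import subprocess


def pdf_to_text(pdf_path):
    """Run pdftotext and return the extracted text (assumes pdftotext is installed)."""
    completed = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" writes to stdout
        capture_output=True,
        check=True,
    )
    return completed.stdout.decode("utf-8")


def find_value_after_label(text, label):
    """Return the first non-empty line that follows a line equal to `label`, or None."""
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        if line == label:
            for candidate in lines[index + 1:]:
                if candidate:
                    return candidate
    return None


# Text copied from the sample output above, so no PDF is needed to try the parser.
sample_output = "Çalışma Sayfası1\n\nId\n\n9960515081\n\nSayfa 1\n\n"
print(find_value_after_label(sample_output, "Id"))  # prints 9960515081
```

With a real file, `find_value_after_label(pdf_to_text("some.pdf"), "Id")` would be the equivalent call. A different page layout may need a different rule than "next non-empty line".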

Winni

loaded

  • Hero Member
  • *****
  • Posts: 824
Re: How to search for value in pdf?
« Reply #2 on: January 27, 2022, 07:03:40 pm »
winni my esteemed brother,
Thank you very much for taking your precious time to reply.
I put your suggestion at the top of my to-try list.
If I can't solve it by reading binary directly or if I can't find another solution, I will definitely use it.

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1228
Re: How to search for value in pdf?
« Reply #3 on: January 28, 2022, 01:17:41 pm »
Hello,
this is also possible with Python4Lazarus, using the pymupdf module (which includes the MuPDF library).
Python script to execute from Lazarus:
Code: Python
import fitz  # PyMuPDF

fname = 'd:/temp/sample.pdf'  # document filename
doc = fitz.open(fname)  # open the document
for page in doc:  # iterate over the document pages
    print('==   Text  ===')
    text = page.get_text("text")  # plain text of the page
    print(text)
    print('==   HTML  ===')
    html = page.get_text("html")  # the same page as HTML
    print(html)

Python's print output is redirected to a TMemo.

Result in attachment.

Friendly, J.P
Jurassic computer : Sinclair ZX81 - Zilog Z80A à 3,25 MHz - RAM 1 Ko - ROM 8 Ko

loaded

  • Hero Member
  • *****
  • Posts: 824
Re: How to search for value in pdf?
« Reply #4 on: January 28, 2022, 01:35:36 pm »
Thank you very much, Jurassic Pork, for taking the time to reply.
Another research topic (one starts before another ends) has emerged for me.
I will consider your solution.
In the meantime I did not sit idle: in my own research
I learned that the data in the PDF is compressed with the FlateDecode filter. I'll do some more research; it sounds solvable.
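For readers curious about the direct binary route: FlateDecode streams are zlib/deflate data, so Python's `zlib` module can inflate them. The sketch below is only an illustration under strong assumptions. It builds a tiny fake PDF object in memory (not the attached file), finds streams with a naive regular expression, and ignores predictors, other filters, encryption and font encodings, any of which can keep a value from showing up as plain text in a real PDF. The name `inflate_streams` is hypothetical.

```python
import re
import zlib

# Naive pattern: everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the inflated bytes of every stream that zlib can decompress."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj stops at the end of the zlib data and leaves the
            # trailing end-of-line bytes in unused_data instead of failing.
            results.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not zlib data (another filter, or an uncompressed stream)
    return results


# A hand-made page content stream, compressed the way FlateDecode expects.
content = b"BT /F1 12 Tf 72 720 Td (9960515081) Tj ET"
compressed = zlib.compress(content)
fake_pdf = (
    b"4 0 obj\n<< /Length " + str(len(compressed)).encode("ascii")
    + b" /Filter /FlateDecode >>\nstream\n" + compressed + b"\nendstream\nendobj\n"
)

for data in inflate_streams(fake_pdf):
    print(data.decode("latin-1"))
```

A more robust reader would use the `/Length` entry and the cross-reference table rather than a regular expression, which is part of why the existing tools suggested in this thread are usually the easier path.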

Thank you very much, Jurassic Pork and winni; your suggestions convinced me not to reinvent the wheel.
I solved my problem with Ghostscript.
By the way, Ghostscript can also easily split, merge and rotate PDF files. Respects.
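The post does not say which Ghostscript command was used, so the following is only a guess at what such a solution might look like. It assumes the `txtwrite` device for text extraction and the `pdfwrite` device for merging, and an executable named `gs` (on Windows it is typically `gswin64c`). The function names are invented for this sketch.

```python
import subprocess


def gs_text_command(pdf_path, gs_executable="gs"):
    """Build a Ghostscript command that writes the PDF's text to stdout."""
    return [
        gs_executable, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=txtwrite",  # text extraction device
        "-sOutputFile=-",     # "-" sends the output to stdout
        pdf_path,
    ]


def gs_merge_command(output_path, input_paths, gs_executable="gs"):
    """Build a Ghostscript command that merges several PDFs into one."""
    return [
        gs_executable, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output_path}",
        *input_paths,
    ]


def extract_text(pdf_path, gs_executable="gs"):
    """Run Ghostscript and return the extracted text (requires Ghostscript)."""
    completed = subprocess.run(
        gs_text_command(pdf_path, gs_executable),
        capture_output=True,
        check=True,
    )
    return completed.stdout.decode("utf-8", errors="replace")


print(" ".join(gs_text_command("sample.pdf")))
print(" ".join(gs_merge_command("merged.pdf", ["a.pdf", "b.pdf"])))
```

Adding `-dFirstPage=N` and `-dLastPage=M` to the `pdfwrite` command restricts the output to a page range, which is one way to split a file.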


« Last Edit: January 28, 2022, 02:07:00 pm by loaded »

paweld

  • Hero Member
  • *****
  • Posts: 970
Re: [Solved] How to search for value in pdf?
« Reply #5 on: January 28, 2022, 03:00:54 pm »
You can also use PDFium - example attached. Download libraries from: https://github.com/pvginkel/PdfiumViewer/tree/releases/2.12.0.0/Libraries/Pdfium
Best regards / Pozdrawiam
paweld

loaded

  • Hero Member
  • *****
  • Posts: 824
Re: [Solved] How to search for value in pdf?
« Reply #6 on: January 28, 2022, 04:42:10 pm »
Dear paweld, thank you very much for the reply.
Yes, I will try your suggestion at a convenient time.

pcurtis

  • Hero Member
  • *****
  • Posts: 951
Re: [Solved] How to search for value in pdf?
« Reply #7 on: January 28, 2022, 05:54:07 pm »
@paweld Perfect

Adobe / Foxit / ... can kiss my shiny metal ...
A 7 MB reader, not 500 MB.
« Last Edit: January 28, 2022, 06:08:55 pm by pcurtis »
Windows 10 20H2
Laz 2.2.0
FPC 3.2.2

alaa123456789

  • Sr. Member
  • ****
  • Posts: 260
  • Try your Best to learn & help others
Re: [Solved] How to search for value in pdf?
« Reply #8 on: February 15, 2022, 06:09:08 pm »
This PDFium viewer is not written in Lazarus; we need a sample in Lazarus, if possible.

thanks

pcurtis

  • Hero Member
  • *****
  • Posts: 951
Re: [Solved] How to search for value in pdf?
« Reply #9 on: February 15, 2022, 06:50:35 pm »
paweld supplied a sample project
Windows 10 20H2
Laz 2.2.0
FPC 3.2.2

alaa123456789

  • Sr. Member
  • ****
  • Posts: 260
  • Try your Best to learn & help others
Re: [Solved] How to search for value in pdf?
« Reply #10 on: February 15, 2022, 07:12:12 pm »
Well done, thank you :) :) :)

ginoo

  • New Member
  • *
  • Posts: 37
Re: [Solved] How to search for value in pdf?
« Reply #11 on: January 09, 2024, 11:08:18 am »
Quote from: paweld
You can also use PDFium - example attached. Download libraries from: https://github.com/pvginkel/PdfiumViewer/tree/releases/2.12.0.0/Libraries/Pdfium

Hello,
do you also happen to have an example that works on Linux?

paweld

  • Hero Member
  • *****
  • Posts: 970
Re: [Solved] How to search for value in pdf?
« Reply #12 on: January 10, 2024, 07:52:56 am »
@ginoo: unfortunately no
Best regards / Pozdrawiam
paweld

madref

  • Hero Member
  • *****
  • Posts: 949
  • ..... A day not Laughed is a day wasted !!
    • Nursing With Humour
Re: [Solved] How to search for value in pdf?
« Reply #13 on: January 10, 2024, 10:58:53 am »
Read this post and see if it can help you.

You treat a disease, you win, you lose.
You treat a person and I guarantee you, you win, no matter the outcome.

Lazarus 3.99 (rev main_3_99-649-ge13451a5ab) FPC 3.3.1 x86_64-darwin-cocoa
Mac OS X Monterey

ginoo

  • New Member
  • *
  • Posts: 37
Re: [Solved] How to search for value in pdf?
« Reply #14 on: January 10, 2024, 06:05:26 pm »
Quote from: madref
Read this post and see if it can help you.

Thank you for responding. I took a look but I don't think that's what I need. I wanted to be able to put text into an existing pdf. I had also tried "pdfvectorialreader" but I can't even compile it.

 
