Author Topic: [Solved] How to decode streams of type FlateDecode?  (Read 830 times)

loaded

  • Hero Member
  • *****
  • Posts: 825
[Solved] How to decode streams of type FlateDecode?
« on: September 27, 2022, 09:30:41 am »
Hi all,
A PDF file I have contains streams encoded with FlateDecode. Is there a simple way to decode these streams?
I would be glad if you could share your experience and suggestions. Respects.
« Last Edit: September 27, 2022, 03:32:25 pm by loaded »
Check out  loaded on Strava
https://www.strava.com/athletes/109391137

KodeZwerg

  • Hero Member
  • *****
  • Posts: 2064
  • Fifty shades of code.
    • Delphi & FreePascal
Re: How to decode streams of type FlateDecode?
« Reply #1 on: September 27, 2022, 09:52:38 am »
Using any PDF library would be the fastest solution. With zlib you should also be able to "unpack" such a stream.

loaded

Re: How to decode streams of type FlateDecode?
« Reply #2 on: September 27, 2022, 10:44:08 am »
Thank you so much, KodeZwerg, for the answer.
I use Ghostscript for my PDF processing. It works great.
But I wanted to do something myself.
Yes, zlib might be the solution; I'll try that.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11452
  • FPC developer.
Re: How to decode streams of type FlateDecode?
« Reply #3 on: September 27, 2022, 11:26:15 am »
It seems so: https://gist.github.com/averagesecurityguy/ba8d9ed3c59c1deffbd1390dafa5a3c2

Note the line:

Code: Python
  print(zlib.decompress(s))
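
For reference, FlateDecode data is zlib/deflate data, so the call above is essentially the whole decoding step. Below is a minimal Python sketch (Python because that is what the linked gist uses), assuming the bytes between `stream` and `endstream` have already been cut out. The helper name `inflate` is made up for illustration. If the stream dictionary has a /DecodeParms entry with a /Predictor, an extra un-prediction step is needed that this sketch does not cover.

```python
import zlib


def inflate(stream_bytes: bytes) -> bytes:
    """Decompress the raw payload of a FlateDecode stream.

    Assumption: stream_bytes starts exactly at the first byte after the
    end-of-line that follows the 'stream' keyword.
    """
    # decompressobj() stops at the end of the zlib data; anything after it
    # (for example the EOL before 'endstream') lands in unused_data.
    decoder = zlib.decompressobj()
    return decoder.decompress(stream_bytes) + decoder.flush()


# Round trip with made-up content, including a trailing EOL as found in PDFs.
sample = b"BT /F1 12 Tf (Hello) Tj ET"
packed = zlib.compress(sample) + b"\r\n"
restored = inflate(packed)
```

Using `zlib.decompressobj()` rather than a bare `zlib.decompress()` is only a convenience here: it makes the handling of trailing bytes explicit.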

loaded

Re: How to decode streams of type FlateDecode?
« Reply #4 on: September 27, 2022, 03:32:13 pm »
Thank you so much, marcov, for the answer.
Quote from: marcov
seems so: https://gist.github.com/averagesecurityguy/ba8d9ed3c59c1deffbd1390dafa5a3c2

https://forum.lazarus.freepascal.org/index.php/topic,33009.msg213157.html#msg213157
With the help of your suggestion and the code in these links, my problem was solved.

But now a new problem has arisen.
PDF files contain both text and stream objects.
What is the easiest way to extract the stream objects from the surrounding text?
The Python code in the link you posted does this very simply.
Is this possible in Lazarus?
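
To make the question concrete, here is a rough Python sketch of the "scan for the keywords" approach: treat the whole file as bytes and pull out everything between `stream` and `endstream` with a regular expression. This is not the gist's exact code; the function name `extract_streams` and the pattern are illustrative assumptions. It can be fooled if the compressed data happens to contain the bytes `endstream`, and it ignores /Length entirely.

```python
import re
import zlib

# 'stream', one end-of-line, then the shortest run of bytes up to 'endstream'.
# DOTALL lets '.' match newline bytes inside the binary payload.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def extract_streams(pdf_bytes: bytes) -> list:
    """Return the payload of every stream, inflated where possible."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            decoder = zlib.decompressobj()
            # Bytes after the zlib data (the EOL before 'endstream') are
            # collected in unused_data and do not cause an error.
            results.append(decoder.decompress(raw) + decoder.flush())
        except zlib.error:
            # Not zlib data (another filter, or no filter): keep it raw.
            results.append(raw)
    return results


# Tiny made-up document with one compressed stream.
content = b"BT (Hello) Tj ET"
demo_pdf = (
    b"%PDF-1.7\n4 0 obj\n(Identity)\nendobj\n"
    b"8 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(content)
    + b"\nendstream\nendobj\n"
)
found = extract_streams(demo_pdf)
```

The same steps port to Lazarus as a byte search over a buffer loaded from a TStream, followed by decompression with a zlib binding.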


KodeZwerg

Re: [Solved] How to decode streams of type FlateDecode?
« Reply #5 on: September 27, 2022, 03:41:17 pm »
I do not know how other languages handle this.
I rely more on the technical side, that is, the file structure of a format like PDF.
https://resources.infosecinstitute.com/topic/pdf-file-format-basic-structure/
The above is just one of millions of web pages that describe the structure.
Once you have read that, you should know the tricks  O:-)

loaded

Re: [Solved] How to decode streams of type FlateDecode?
« Reply #6 on: September 27, 2022, 05:51:01 pm »
What I am after in my current situation is less about learning the PDF file structure and more about a method for reading a file that contains both binary data and plain text. This is the first time I have encountered such a situation. Normally a file is either text or purely binary. Here the two are mixed.

Code: Text
  %PDF-1.7
  4 0 obj
  (Identity)
  endobj
  5 0 obj
  (Adobe)
  endobj
  8 0 obj
  <<
  /Filter /FlateDecode
  /Length 45603
  /Length1 103528
  /Type /Stream
  >>
  stream
  xœì½|”Uö7~îó<S2%™ôI’Iïô–À„$tHL$š6@HB@l`WDEE,°Šem€†¢PAe]ö¶êZ±¬ŠmÑuWɼß{ŸgB&€ëoß÷÷û¼ïÿŸ¹ÜsÎíçž{ι…IBŒˆÂ r—TNülÁ¶EÄVì&ŠÛ=¹¤tÒÂıg¾M´u
  ...
  endstream

like this  ;D

Still, thank you very much for your help. It was very important for me to even come this far. Respects.
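
One way to handle such a mixed file, sketched in Python under stated assumptions: read everything as bytes (binary mode), decode only the dictionary part as text, and use /Length to take exactly that many payload bytes. The function name `read_first_stream` is hypothetical, and the sketch assumes /Length is a direct integer (not an indirect reference such as `12 0 R`) and that the first `stream` keyword in the data is the real one.

```python
import re
import zlib


def read_first_stream(data: bytes) -> bytes:
    """Return the raw payload of the first stream found in data."""
    keyword = data.index(b"stream")
    # latin-1 maps every byte to one character, so decoding cannot fail.
    header = data[:keyword].decode("latin-1")
    # '\s+' after '/Length' keeps '/Length1' from matching; the lookahead
    # rejects an indirect reference of the form '/Length 12 0 R'.
    found = re.search(r"/Length\s+(\d+)\b(?!\s+\d+\s+R\b)", header)
    if found is None:
        raise ValueError("no direct /Length entry before the stream")
    length = int(found.group(1))
    pos = keyword + len(b"stream")
    # The keyword is followed by CRLF or LF, which is not part of the data.
    if data[pos:pos + 2] == b"\r\n":
        pos += 2
    elif data[pos:pos + 1] == b"\n":
        pos += 1
    return data[pos:pos + length]


# Made-up object shaped like the one above, with a small payload.
text = b"hello from a mixed file"
payload = zlib.compress(text)
obj = (
    b"8 0 obj\n<<\n/Filter /FlateDecode\n/Length "
    + str(len(payload)).encode("ascii")
    + b"\n/Length1 103528\n/Type /Stream\n>>\nstream\r\n"
    + payload
    + b"\r\nendstream\nendobj\n"
)
decoded = zlib.decompress(read_first_stream(obj))
```

For a real file the bytes would come from something like `open(path, "rb").read()`. In Lazarus the equivalent idea would be to load the file through a TStream rather than a text-file reader, so that no line-ending or encoding translation touches the binary part.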

KodeZwerg

Re: [Solved] How to decode streams of type FlateDecode?
« Reply #7 on: September 27, 2022, 06:25:42 pm »
Your approach might work, whereas doing it the official way works all the time.
Read the end of the file first to get the position of the xref table; from there you have easy access to every object in the PDF.
I do not mind you trying the straightforward variant, but it is a lot of wasted energy.
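
The "official way" described here can be sketched roughly as follows, again in Python and only as an illustration. The function names are made up. The sketch handles only a classic `xref` table with a `trailer` keyword; newer files (PDF 1.5 and later) may store the cross-reference data in a compressed stream instead, in which case it raises an error. It also ignores /Prev chains from incremental updates.

```python
import re


def find_xref_offset(pdf_bytes: bytes) -> int:
    """Read the byte offset that follows the last 'startxref' keyword."""
    tail = pdf_bytes[-1024:]
    idx = tail.rfind(b"startxref")
    if idx < 0:
        raise ValueError("startxref not found near the end of the file")
    found = re.match(rb"startxref\s+(\d+)", tail[idx:])
    if found is None:
        raise ValueError("startxref is not followed by an offset")
    return int(found.group(1))


def parse_xref_table(pdf_bytes: bytes, offset: int) -> dict:
    """Map object number -> byte offset for in-use entries."""
    if not pdf_bytes.startswith(b"xref", offset):
        raise ValueError("no classic xref table here (maybe an xref stream)")
    start = offset + len(b"xref")
    end = pdf_bytes.index(b"trailer", start)
    tokens = pdf_bytes[start:end].split()
    offsets = {}
    i = 0
    while i < len(tokens):
        # Subsection header: first object number and entry count.
        first, count = int(tokens[i]), int(tokens[i + 1])
        i += 2
        for n in range(count):
            # Entry: 10-digit offset, 5-digit generation, 'n' or 'f'.
            entry_offset, _generation, kind = tokens[i:i + 3]
            i += 3
            if kind == b"n":  # 'f' marks a free (unused) entry
                offsets[first + n] = int(entry_offset)
    return offsets


# Hand-built miniature file: one object at byte offset 9.
body = b"%PDF-1.7\n1 0 obj\n(Adobe)\nendobj\n"
table = (
    b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    b"trailer\n<< /Size 2 >>\n"
)
mini_pdf = body + table + b"startxref\n" + str(len(body)).encode("ascii") + b"\n%%EOF\n"
xref_at = find_xref_offset(mini_pdf)
objects = parse_xref_table(mini_pdf, xref_at)
```

With the offsets in hand, each object can be read by seeking directly to its position instead of scanning the file for keywords.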

 
