
Author Topic: Use FPC or the command line?  (Read 3431 times)

brian_m

  • Newbie
  • Posts: 5
« on: March 12, 2018, 12:56:50 pm »
Hi all,

A question for those with experience of using the PDF routines in FreePascal. I have something like 150,000 page scans as JPEG images, taken from a CD-ROM collection of magazine back numbers. Unfortunately the viewer software supplied with the images isn't compatible with my Linux box. I want to take the images and reconstitute the magazines as PDFs (the filenames are coded, so that part's not difficult), and I have a choice of two ways to do it: either I run img2pdf on each file and then use Ghostscript to concatenate the individual PDFs, or I try to write something in FreePascal to do the job.

I have absolutely zero experience with the PDF routines in FreePascal, but the output from img2pdf doesn't seem too great as far as quality is concerned. The output looks better to me if I use OpenOffice, insert the JPEG and then generate the PDF, but obviously the time it would take to do them all manually is prohibitive.

Any comments on the viability of going the FreePascal route would be appreciated.

Thanks!

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
« Reply #1 on: March 12, 2018, 05:04:16 pm »
It is a batch job anyway. It does not matter which converter you use: there are no "instant" solutions for 150,000 pages.
Specialize a type, not a var.

rvk

  • Hero Member
  • *****
  • Posts: 6111
« Reply #2 on: March 12, 2018, 05:16:01 pm »
Depending on your img2pdf:
img2pdf Scan00*.jpg newpdf.pdf
I think scripting in Linux (with the right tools) would be much faster than creating an FPC program.

(I do wonder how Linux is going to react to expanding 150,000 image names on the command line :D)
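
On the question of expanding that many names: the kernel caps the combined size of a command's arguments and environment, and `getconf ARG_MAX` reports that cap. A rough sketch of the check follows. The function name and the average name length are made up for illustration, and the real accounting also includes the environment and some per-argument overhead, so treat the result as an estimate only.

```sh
# Rough check: would COUNT file names of AVG_LEN bytes each fit on one command line?
# fits_on_command_line is a hypothetical helper, not part of any tool.
fits_on_command_line() {
    count=$1                          # number of file names
    avg_len=$2                        # average bytes per name, including the terminator
    limit=${3:-$(getconf ARG_MAX)}    # system limit unless one is passed in
    # Succeeds (exit status 0) only if the estimated total is below the limit.
    [ $(( count * avg_len )) -lt "$limit" ]
}
```

For example, `fits_on_command_line 150000 20 && echo fits || echo "too long"` tests 150,000 names of about 20 bytes each against the local limit. When the limit is exceeded the command fails with "Argument list too long", and splitting the work into smaller batches sidesteps that.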

You do know that 150,000 images of about 2 MB each will result in a PDF file of about 300 gigabytes?
I also wonder which reader you are going to use to read that :D
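
The estimate above is plain multiplication, and it scales linearly with the per-image size. A small sketch to redo it for other sizes; `total_gb` is a hypothetical helper and it uses decimal units (1 GB = 1,000,000 KB).

```sh
# total_gb COUNT KB_PER_IMAGE: combined size in whole gigabytes (decimal units).
# Hypothetical helper for illustration only.
total_gb() {
    echo $(( $1 * $2 / 1000000 ))
}
```

`total_gb 150000 2000` prints 300, matching the figure above, while `total_gb 150000 200` prints 30 for images a tenth of that size.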
« Last Edit: March 12, 2018, 05:18:57 pm by rvk »

brian_m

  • Newbie
  • Posts: 5
« Reply #3 on: March 12, 2018, 06:15:01 pm »
I think you're making a couple of completely unwarranted assumptions here...

Firstly, the image sizes are all less than a tenth of your estimated 2MB each.

Secondly, nowhere did I say that I was going to combine them all into a single PDF. All I intend to combine is the pages from each monthly issue, so as to reproduce the individual magazines in PDF form.

FYI, they are in one subdirectory per month.
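
With one subdirectory per month, the command-line route reduces to a loop that hands each directory's JPEGs to img2pdf. A minimal sketch, assuming the scans sit in `<root>/<month>/*.jpg`, that the file names sort into page order, and that the installed img2pdf takes its output file via `-o`. The function names and the `IMG2PDF` override variable are made up for illustration.

```sh
# Build one PDF from all JPEGs in a single issue directory.
# build_issue and build_all are hypothetical helpers; IMG2PDF is a
# hypothetical override so a different converter (or a dry run) can be used.
build_issue() {
    issue_dir=$1
    issue_pdf=$2
    # The glob expands in sorted order, so page order follows the file names.
    set -- "$issue_dir"/*.jpg
    [ -e "$1" ] || return 0           # nothing to do if there are no JPEGs
    ${IMG2PDF:-img2pdf} "$@" -o "$issue_pdf"
}

# Walk every subdirectory of ROOT and write OUTDIR/<subdirectory name>.pdf.
build_all() {
    root=$1
    outdir=$2
    mkdir -p "$outdir"
    for month in "$root"/*/; do
        [ -d "$month" ] || continue   # the glob matched nothing
        name=$(basename "$month")
        build_issue "${month%/}" "$outdir/$name.pdf"
    done
}
```

Something like `build_all ./scans ./pdf` (both paths are placeholders) would then produce one PDF per month. Each call only sees one issue's worth of names, so the argument list stays short.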

I don't anticipate okular having any trouble whatsoever with the resulting PDFs.

The reason I asked the question is that, at least to my eye, generating a PDF via the scripting solution produces somewhat inferior quality compared to producing the PDF by inserting the images into OpenOffice and generating the PDF that way. That's why I wondered about the viability of using the FreePascal tools.



marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
« Reply #4 on: March 12, 2018, 06:19:31 pm »
Ever heard of LaTeX? Generate a LaTeX document that is nothing but a series of includes of JPGs. You can even use FPC to scan the JPGs for their dimensions and adjust the scaling parameters dynamically, split the output into multiple files, etc.
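
The LaTeX route can be scripted in the same way. A minimal sketch that writes such a document for one directory of JPEGs to standard output; `make_tex` is a hypothetical name, the page layout (zero margins, one image scaled to fit each page) is just one plausible choice, and file names containing spaces or characters special to LaTeX would need extra care.

```sh
# Print a LaTeX document that places every JPEG in DIR on its own page.
# make_tex is a hypothetical helper; the layout settings are assumptions.
make_tex() {
    tex_dir=$1
    # Preamble: graphicx for \includegraphics, geometry to drop the margins.
    printf '%s\n' \
        '\documentclass{article}' \
        '\usepackage{graphicx}' \
        '\usepackage[margin=0pt]{geometry}' \
        '\pagestyle{empty}' \
        '\begin{document}'
    for img in "$tex_dir"/*.jpg; do
        [ -e "$img" ] || continue     # the glob matched nothing
        # One image per page, scaled to fit while keeping its aspect ratio.
        printf '\\noindent\\includegraphics[width=\\textwidth,height=\\textheight,keepaspectratio]{%s}\\clearpage\n' "$img"
    done
    printf '%s\n' '\end{document}'
}
```

For a directory such as `scans/1997-01` (a placeholder), `make_tex scans/1997-01 > 1997-01.tex` followed by `pdflatex 1997-01.tex` should give one PDF for that issue. Per-image scaling based on the actual dimensions, as suggested above, would replace the fixed `width`/`height` options.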


Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
« Reply #5 on: March 12, 2018, 06:26:53 pm »
Still, you need patience...
You know I know LaTeX...  :P

rvk

  • Hero Member
  • *****
  • Posts: 6111
« Reply #6 on: March 12, 2018, 06:27:22 pm »
Generating a PDF with a command-line utility will result in just as good a PDF (if not better) as doing it via OpenOffice. The source material is the same. It all depends on the options you use (and on whether OpenOffice somehow enhances the image). Some commands for inspiration.

I have no experience with the FreePascal tools but you can try.

dinmil

  • New Member
  • *
  • Posts: 45
« Reply #7 on: March 12, 2018, 08:59:34 pm »
Check this project. Maybe it can help.
https://github.com/dinmil/PDFPreview/blob/master/README.md

brian_m

  • Newbie
  • Posts: 5
« Reply #8 on: March 12, 2018, 09:11:45 pm »
I can only repeat that, with what I have tried so far, the output appeared to be better quality going via OpenOffice. There are certainly some alternatives in the links you gave, so thanks for those. I will give each of them a try on the same month's worth of scans and see what comes out the far end.

brian_m

  • Newbie
  • Posts: 5
« Reply #9 on: March 12, 2018, 09:13:44 pm »
Aha! My wife used to spend large parts of her working day with LaTeX. I just wonder whether she can remember any of it (to be fair, that was almost 20 years ago). Thanks for the hint!
