
Topic: lcl-pdf is not UTF8?  (Read 1186 times)

trapanator

  • New member
  • *
  • Posts: 9
lcl-pdf is not UTF8?
« on: April 11, 2025, 09:09:03 am »
Hello, in this simple program I generate a PDF using lcl-pdf.
I wrote the line Page.WriteText(20, 30, 'è'); but in the output PDF it appears as shown in the image below.

Code: Pascal
program Project1;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}
  cthreads,
  {$ENDIF}
  Classes,
  fppdf, fpttf;

var
  FontID, FontBoldID: Integer;
  Document: TPDFDocument;
  Section: TPDFSection;
  Page: TPDFPage;

begin
  Document := TPDFDocument.Create(nil);
  Document.FontDirectory := 'C:\Windows\Fonts';
  Document.Options := Document.Options + [poPageOriginAtTop, poNoEmbeddedFonts];
  Document.StartDocument;
  FontID := Document.AddFont('arial.ttf', 'Arial');
  FontBoldID := Document.AddFont('arialbd.ttf', 'Arial Bold');

  Section := Document.Sections.AddSection;

  Page := Document.Pages.AddPage;
  Section.AddPage(Page);

  Page.SetFont(FontID, 11);
  Page.WriteText(20, 20, 'This is normal text');

  Page.SetFont(FontBoldID, 11);
  Page.WriteText(20, 30, 'è');

  Document.SaveToFile('output.pdf');

end.

This generates the attached image. Notice that the 'è' character is not rendered correctly.


trapanator

  • New member
  • *
  • Posts: 9
Re: lcl-pdf is not UTF8?
« Reply #1 on: April 11, 2025, 09:36:33 am »
I rewrote that line using UTF8Decode:

Code: Pascal
Page.WriteText(20, 30, UTF8Decode('è'));

So, do I need to do this?

Nimbus

  • New Member
  • *
  • Posts: 45
Re: lcl-pdf is not UTF8?
« Reply #2 on: April 11, 2025, 10:13:34 am »
I guess one thing to try is to tell the compiler that your code literals (and therefore the 'è') are in UTF8.

Code: Pascal
{$codepage UTF8}
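
A minimal sketch, not taken from the thread, for checking what the compiler actually stored for the literal. It assumes the source file is saved as UTF-8; the program name LiteralBytes is made up for illustration. StringCodePage and HexStr come from the FPC system unit.

```pascal
program LiteralBytes;

{$mode objfpc}{$H+}
{$codepage UTF8}  // tells the compiler this source file is encoded as UTF-8

var
  u: UTF8String;
  i: Integer;

begin
  u := 'è';

  // StringCodePage reports the code page tag stored with the string
  WriteLn('code page: ', StringCodePage(u));

  // Dump the raw bytes so the stored encoding can be inspected
  Write('bytes:');
  for i := 1 to Length(u) do
    Write(' ', HexStr(Ord(u[i]), 2));
  WriteLn;
end.
```

Compiling this once with and once without the {$codepage UTF8} line and comparing the two byte dumps should show whether the directive changes how the literal is stored.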

cdbc

  • Hero Member
  • *****
  • Posts: 2150
    • http://www.cdbc.dk
Re: lcl-pdf is not UTF8?
« Reply #3 on: April 11, 2025, 10:35:44 am »
Hi
...or try another font, some fonts cannot write e.g.: 'æ ø å ö ü ñ é ß'
I'm Danish and have to use these blasted 'æ ø å Æ Ø Å', that involves choosing the right font...!
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 3.6 up until Jan 2024 from then on it's both above &: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 4.99

wp

  • Hero Member
  • *****
  • Posts: 12800
Re: lcl-pdf is not UTF8?
« Reply #4 on: April 11, 2025, 11:04:07 am »
Literal strings still are one of the big FPC mysteries to me. Define a variable s: String, and assign the string to be printed to it, and it will work. But you also must activate the UTF8 widestringmanager by adding LazUTF8 to uses (which requires a dependency on the package LazUtils in the CLI program; in a GUI application this is done automatically since LCL is in the requirements which contains LazUtils).
Code: Pascal
program project1;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}
  cthreads,
  {$ENDIF}
  Classes,
  fppdf, fpttf, lazutf8;

var
  FontID, FontBoldID: Integer;
  Document: TPDFDocument;
  Section: TPDFSection;
  Page: TPDFPage;
  s: String;

begin
  Document := TPDFDocument.Create(nil);
  try
    Document.FontDirectory := 'C:\Windows\Fonts';
    Document.Options := Document.Options + [poPageOriginAtTop, poNoEmbeddedFonts];
    Document.StartDocument;
    FontID := Document.AddFont('arial.ttf', 'Arial');
    FontBoldID := Document.AddFont('arialbd.ttf', 'Arial Bold');

    Section := Document.Sections.AddSection;

    Page := Document.Pages.AddPage;
    Section.AddPage(Page);

    Page.SetFont(FontID, 11);
    Page.WriteText(20, 20, 'This is normal text');

    Page.SetFont(FontBoldID, 11);
    // Assign the literal to a String variable first, then pass the variable
    s := 'äöü';
    Page.WriteText(20, 30, s);

    Document.SaveToFile('output.pdf');
  finally
    Document.Free;
  end;
end.

cdbc

  • Hero Member
  • *****
  • Posts: 2150
    • http://www.cdbc.dk
Re: lcl-pdf is not UTF8?
« Reply #5 on: April 11, 2025, 11:30:56 am »
Hi
@Werner: Yup, that's annoying with the dependency on LAZutils in a console application...
So much so, that I've actually forked most of it and made it non-dependent on LAZ-stuff. I just include the forked, trimmed & slimmed directory in my uses...
That's just one way to go about it  :)
Regards Benny

trapanator

  • New member
  • *
  • Posts: 9
Re: lcl-pdf is not UTF8?
« Reply #6 on: April 11, 2025, 11:32:37 am »
Quote from: cdbc
Hi
...or try another font, some fonts cannot write e.g.: 'æ ø å ö ü ñ é ß'
I'm Danish and have to use these blasted 'æ ø å Æ Ø Å', that involves choosing the right font...!
Regards Benny

arial.ttf contains all normal latin and accented à è ì ò ù characters.

trapanator

  • New member
  • *
  • Posts: 9
Re: lcl-pdf is not UTF8?
« Reply #7 on: April 11, 2025, 11:35:57 am »
Quote from: wp
Literal strings still are one of the big FPC mysteries to me. Define a variable s: String, and assign the string to be ...

Thank you for the hint. This code works:

Code: Pascal
s := 'è';
Page.WriteText (10, 10, s);

but this does not:

Code: Pascal
Page.WriteText (10, 10, 'è');

LV

  • Sr. Member
  • ****
  • Posts: 272
Re: lcl-pdf is not UTF8?
« Reply #8 on: April 11, 2025, 04:04:18 pm »
In addition to using the {$codepage UTF8} directive, I fully agree with cdbc:

Quote from: cdbc
Hi
...or try another font, some fonts cannot write e.g.: 'æ ø å ö ü ñ é ß'
I'm Danish and have to use these blasted 'æ ø å Æ Ø Å', that involves choosing the right font...!
Regards Benny

I have attached a screenshot of the generated PDF using the {$codepage UTF8} directive and different fonts.

wp

  • Hero Member
  • *****
  • Posts: 12800
Re: lcl-pdf is not UTF8?
« Reply #9 on: April 11, 2025, 04:20:51 pm »
Quote from: cdbc
@Werner: Yup, that's annoying with the dependency on LAZutils in a console application...

Be careful: "LazUtils" is not "LCLUtils". In other words, it is only a prerequisite of the LCL!

cdbc

  • Hero Member
  • *****
  • Posts: 2150
    • http://www.cdbc.dk
Re: lcl-pdf is not UTF8?
« Reply #10 on: April 11, 2025, 04:44:26 pm »
Hi
Quote
it is only a prerequisite of the LCL!
If I include lazutf8, in my uses clause, in a console app, then that becomes dependent too...
Regards Benny

TRon

  • Hero Member
  • *****
  • Posts: 4371
Re: lcl-pdf is not UTF8?
« Reply #11 on: April 11, 2025, 04:51:06 pm »
Quote from: wp
Be careful: "LazUtils" is not "LCLUtils", in other word: it is only a prerequisite of the LCL!

Which of the two did you mean by that? For instance, fpSpreadsheet depends on LazUtils (so it automatically becomes a dependency). It is this kind of stuff that makes it very annoying to only use FPC (just like cdbc, I created my own FPC package for that).
Today is tomorrow's yesterday.

wp

  • Hero Member
  • *****
  • Posts: 12800
Re: lcl-pdf is not UTF8?
« Reply #12 on: April 11, 2025, 05:32:24 pm »
LazUtils is an ordinary Lazarus package which does not require the LCL. If you want to use it without Lazarus in pure FPC you must make the path to LazUtils available. Yes, create your own FPC package, or simpler in this particular case, just copy the files fpcadds.pas and lazutils_defines.inc from the LazUtils folder into the project folder.

BrunoK

  • Hero Member
  • *****
  • Posts: 697
  • Retired programmer
Re: lcl-pdf is not UTF8?
« Reply #13 on: April 11, 2025, 06:05:15 pm »
Quote from: Nimbus
I guess one thing to try is to tell the compiler that your code literals (and therefore the 'è') are in UTF8.

Code: Pascal
{$codepage UTF8}

Nimbus looks right here. For a pure Free Pascal application, adding {$codepage UTF8} does effectively what he describes.

The program:
Code: Pascal
program Project1;

{$mode objfpc}{$H+}
{$codepage UTF8}

uses
  {$IFDEF UNIX}
  cthreads,
  {$ENDIF}
  Classes,
  fppdf, fpttf;

var
  FontID, FontBoldID: Integer;
  Document: TPDFDocument;
  Section: TPDFSection;
  Page: TPDFPage;

begin
  Document := TPDFDocument.Create(nil);
  Document.FontDirectory := 'C:\Windows\Fonts';
  Document.Options := Document.Options + [poPageOriginAtTop, poNoEmbeddedFonts];
  Document.StartDocument;
  FontID := Document.AddFont('arial.ttf', 'Arial');
  FontBoldID := Document.AddFont('arialbd.ttf', 'Arial Bold');

  Section := Document.Sections.AddSection;

  Page := Document.Pages.AddPage;
  Section.AddPage(Page);

  Page.SetFont(FontID, 11);
  Page.WriteText(20, 20, 'This is normal text');

  Page.SetFont(FontBoldID, 11);
  Page.WriteText(20, 30, 'Hello world >>>> àéîöü <<<<');

  Document.SaveToFile('output.pdf');

end.
Outputs:

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12206
  • FPC developer.
Re: lcl-pdf is not UTF8?
« Reply #14 on: April 20, 2025, 01:15:19 pm »
Windows FPC programs are not utf-8 by default. Lazarus changes that as soon as you use the LCL.

Easiest way to make your console programs utf8 is to enable resources and tick the UTF8 default option.
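
To see which default code page a console program is actually running with, a small check like the one below may help. It is only a sketch and not necessarily what the IDE option does: DefaultSystemCodePage, SetMultiByteConversionCodePage and CP_UTF8 come from the FPC system unit (FPC 3.0 and later), the program name RuntimeCodePage is invented, and switching the code page at run time like this is an assumption about one possible approach.

```pascal
program RuntimeCodePage;

{$mode objfpc}{$H+}

begin
  // Code page the RTL currently assumes for plain String/AnsiString data
  WriteLn('default code page before: ', DefaultSystemCodePage);

  // Ask the RTL to treat plain strings as UTF-8 from here on
  SetMultiByteConversionCodePage(CP_UTF8);

  WriteLn('default code page after:  ', DefaultSystemCodePage);
end.
```

Note that this only affects run-time conversions. How the compiler interprets literals in the source file is still governed by {$codepage ...} or the -Fc compiler option.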


 
