Topic: ANSI to UTF-8 or #228 ---> 'ä'

Bernd82

« on: March 31, 2015, 09:13:14 pm »
I would like to display old Delphi ANSI strings (Windows) in a TMemo under Lazarus. The strings contain German and French special characters:
Code:
    var s: String;
    begin
      s := #228;                      // single byte for 'ä' in the Windows ANSI code page
      Memo.Lines.Add(AnsiToUtf8(s));  // converts from the system's default code page
    end;
This works with Lazarus on Windows. Will it also work if I compile it for Linux? Thanks for any hint!

Regards Bernd

Blaazen

« Reply #1 on: March 31, 2015, 09:28:20 pm »
No, it does not work on Linux. A "?" appears in the Memo.

But you can do this:
Code:
s:='ä';
Memo1.Lines.Add(s);
instead of using the character number; Lazarus is fully UTF-8.

typo

« Reply #2 on: March 31, 2015, 09:47:09 pm »
Code:
ShowMessage(#195 + #164      // 'ä' (small a with umlaut) as UTF-8 bytes
              + ' - ' +
              #195 + #132);  // 'Ä' (capital A with umlaut) as UTF-8 bytes

Bernd82

« Reply #3 on: March 31, 2015, 10:02:26 pm »
Thanks for your answers. So is there a way to convert a string containing single-byte ANSI characters to a normal Lazarus string (UTF-8)?

Michl

« Reply #4 on: March 31, 2015, 10:08:03 pm »
This is a crosspost (the second in two days; that's really not okay, as I explained in the German Lazarus forum):
http://www.lazarusforum.de/viewtopic.php?f=10&t=8663#p76727

typo

« Reply #5 on: March 31, 2015, 10:17:41 pm »
Code:
ShowMessage(#$C3#$A4          // 'ä' precomposed (U+00E4) as UTF-8 bytes
             + ' - ' +
             'A'#$CC#$84);    // 'A' followed by U+0304 (combining macron) as UTF-8 bytes

wp

« Reply #6 on: March 31, 2015, 10:33:07 pm »
This works under Linux:

Code:
uses
  lconvencoding;
...
var
  s: String;
begin
  s := CP1250ToUTF8(#228);
  // or: s := ISO_8859_1ToUTF8(#228);
  ShowMessage(s);
end;

typo

« Reply #7 on: March 31, 2015, 10:35:55 pm »

Bernd82

« Reply #8 on: March 31, 2015, 10:51:43 pm »
Hello wp,

ah, the unit LConvEncoding works with code pages. I understand. I think CP1252 might be the better choice in my case. It is close to ISO 8859-1 (Latin-1). I just found this on Wikipedia:

http://en.wikipedia.org/wiki/Windows-1252

Thanks a lot for your expert information!

Best regards, Bernd

Bernd82

« Reply #9 on: March 31, 2015, 10:53:53 pm »
Hello typo,

thanks for your answers. Yes, that is the right character list for my conversions.

Regards, Bernd

Bernd82

« Reply #10 on: March 31, 2015, 11:17:09 pm »
Hello Blaazen, thanks for testing! Does the following work properly?
Code:
    uses LConvEncoding;
    var s: String;
    begin
      s := #228;
      Memo.Lines.Add(CP1252ToUTF8(s));      // Windows-1252 to UTF-8
      Memo.Lines.Add(ISO_8859_1ToUTF8(s));  // ISO 8859-1 (Latin-1) to UTF-8
    end;
Thanks in advance for your help. Unfortunately, I have no Linux machine here at the moment...

Regards Bernd

Bernd82

« Reply #11 on: April 01, 2015, 12:05:51 am »
I had a closer look under Windows at all converted characters and found that

CP1252ToUTF8( )

seems to be the better alternative.

ISO_8859_1ToUTF8( )

has an undefined gap from character #128 to #159. For example, the € sign is missing there...

Bernd
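
The comparison described in this post can be reproduced with a small console program. This is only a sketch: it assumes a Lazarus installation where the LazUtils package (which provides LConvEncoding) is added to the project, and the program name and the HexBytes helper are made up for illustration. It prints the UTF-8 bytes that each function returns for the bytes #128 to #159, so the two columns can be compared without depending on the console's own encoding. For orientation, the € sign (U+20AC) is E2 82 AC in UTF-8.

```pascal
program CompareLegacyEncodings;

{$mode objfpc}{$H+}

uses
  SysUtils,
  LConvEncoding; // from Lazarus' LazUtils package (assumed to be available)

// Hypothetical helper: renders the raw bytes of a string as hex pairs.
function HexBytes(const S: String): String;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    Result := Result + IntToHex(Ord(S[I]), 2) + ' ';
end;

var
  B: Byte;
  Src: String;
begin
  // Walk the range where the two encodings differ.
  for B := 128 to 159 do
  begin
    Src := Chr(B); // one legacy byte as a one-character string
    WriteLn(Format('#%d  CP1252 -> %s  ISO 8859-1 -> %s',
      [B, HexBytes(CP1252ToUTF8(Src)), HexBytes(ISO_8859_1ToUTF8(Src))]));
  end;
end.
```

The exact bytes printed for positions that CP1252 leaves unassigned may vary between Lazarus versions, so no expected output is given here.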

marcov

« Reply #12 on: April 01, 2015, 11:34:43 am »
In threads like this it is important to remember that "ANSI" in Free Pascal/Delphi jargon means "default one-byte code page", not necessarily ANSI/ASCII.

The Windows code page for Latin with euro is cp28591:

28591   iso-8859-1   ISO 8859-1 Latin 1; Western European (ISO)

and following:

28592   iso-8859-2   ISO 8859-2 Central European; Central European (ISO)
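
To check which code page "ANSI" actually refers to on a given system, the runtime's default can be printed. This is a minimal sketch that assumes FPC 3.0 or later, where the System unit exposes the DefaultSystemCodePage variable; the program name is made up. The value differs between machines, and Lazarus applications may set it to UTF-8 at startup, which is why an explicit conversion such as CP1252ToUTF8 is more predictable than AnsiToUtf8 for data of known origin.

```pascal
program ShowDefaultCodePage;

{$mode objfpc}{$H+}

begin
  // DefaultSystemCodePage is the code page the RTL assumes for
  // AnsiString data that carries no explicit code page of its own.
  WriteLn('DefaultSystemCodePage = ', DefaultSystemCodePage);
end.
```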

Bernd82

« Reply #13 on: April 01, 2015, 03:49:28 pm »
Hello marcov,

cp28591, like ISO_8859_1, has that gap between #128 and #159 (#$80 to #$9F)! Look here:

https://msdn.microsoft.com/de-de/goglobal/cc305167.aspx

So the '€' and others are missing.

Regards, Bernd

 
