
Author Topic: Character Conversions  (Read 10540 times)

JLWest

  • Hero Member
  • *****
  • Posts: 1293
Re: Character Conversions
« Reply #45 on: September 29, 2019, 06:29:45 am »
Hi!

As I wrote, my code is for the UTF-8 characters of the European languages.

One reason is that I don't know anything about Chinese or Sanskrit.
The second reason is the nearly "endless" UTF-8 table. What is necessary?
Do we need Cherokee?

Winni

That's fine, Winni. I just hand-translated the file. I think it works great. Now I have all my data files translated.

Thanks, one and all.
FPC 3.2.0, Lazarus IDE v2.0.4
 Windows 10 Pro 32-GB
 Intel i7 770K CPU 4.2GHz 32702MB Ram
GeForce GTX 1080 Graphics - 8 Gig
4.1 TB

neuro

  • Jr. Member
  • **
  • Posts: 62
Re: Character Conversions
« Reply #46 on: March 19, 2022, 10:44:18 am »
If you run the same code on a text file, the ???? start to appear. (Try a Lithuanian encoding, windows-1257, which is still Western but with some twists in decoration, see Marco's remark, or KOI-8, with which many forum users are familiar.)
It is still a very useful function, but not perfect. It is actually short and pretty concise, which I like, except for Lithuanian... so I have to read my wife's letters by guessing the question marks... ::)
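Question marks like these typically appear when the bytes above $7F in a legacy single-byte file are displayed as if they were UTF-8. Below is a minimal sketch of one way to convert such text in Lazarus. It assumes the LazUtils package (shipped with Lazarus) is on the unit path and that the input really is Windows-1257. The program name and the sample word are only illustrations, not the code discussed earlier in this thread.

```pascal
program Cp1257Demo;

{$mode objfpc}{$H+}

uses
  LConvEncoding; // LazUtils unit with the single-byte code page converters

var
  Legacy, Utf8: string;
begin
  // "aciu" with diacritics ("thank you") as Windows-1257 bytes:
  // a, c-caron = $E8, i, u-macron = $FB
  Legacy := 'a'#$E8'i'#$FB;

  // Interpret the bytes as Windows-1257 and re-encode them as UTF-8.
  Utf8 := CP1257ToUTF8(Legacy);

  // The two accented letters need two bytes each in UTF-8,
  // so the converted string is longer than the legacy one.
  WriteLn(Length(Legacy), ' bytes in, ', Length(Utf8), ' bytes out');
end.
```

If the source encoding is not known in advance, the conversion has to be chosen per file; decoding with the wrong code page produces wrong letters rather than an error.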

Lithuanian CharSet Converter v.2.0
(free open-source cross-platform software)
        
Before UTF-8 was adopted, Lithuanians used several different character encodings that were incompatible with each other.

“Lithuanian charset converter” converts between legacy character encodings and modern UTF-8.

“Lithuanian charset converter” converts between:
• ASCII;
• 772 / Lithuanian Standard LST 1284:1993 (Lithuanian and Russian characters); 774 / Lithuanian Standard LST 1283:1993 (Lithuanian and English characters); 775 (Microsoft);
• 770 / IBM Baltic / Lithuanian Standard RST 1095-89;
• 771 / KBL / Baltic Amadeus (Lithuanian and Russian characters); 773 Lithuanian (mix of 771 and 775);
• Windows-1257 / IBM Baltic RIM; Latin-7 / ISO-8859-13;
• Latin-4 / ISO-8859-4; Latin-6 / ISO-8859-10;
• UTF-8 with BOM (byte order mark);
• UTF-8.
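
The last two entries in the list differ only by the three bytes EF BB BF at the start of the file. Here is a rough sketch of detecting and removing that marker in plain Free Pascal. The helper names HasUtf8Bom and StripUtf8Bom are hypothetical and are not taken from the converter's source code.

```pascal
program BomDemo;

{$mode objfpc}{$H+}

// Hypothetical helper: True when S starts with the UTF-8 byte order mark.
function HasUtf8Bom(const S: RawByteString): Boolean;
begin
  // Compare byte by byte so no code page conversion can interfere.
  Result := (Length(S) >= 3) and
            (S[1] = #$EF) and (S[2] = #$BB) and (S[3] = #$BF);
end;

// Hypothetical helper: returns S without a leading UTF-8 BOM, if present.
function StripUtf8Bom(const S: RawByteString): RawByteString;
begin
  if HasUtf8Bom(S) then
    Result := Copy(S, 4, Length(S) - 3)
  else
    Result := S;
end;

var
  WithBom, Plain: RawByteString;
begin
  WithBom := #$EF#$BB#$BF'Labas'; // BOM followed by plain ASCII text
  Plain := StripUtf8Bom(WithBom);
  WriteLn(HasUtf8Bom(WithBom), ' ', HasUtf8Bom(Plain), ' ', Length(Plain));
end.
```

Going the other way, from plain UTF-8 to UTF-8 with BOM, only requires prepending the same three bytes.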

LAMW source code for Android:
http://cognaxon.com/downloads/LithuanianCharSetConverter/SourceCode/LithuanianCharSetConverter_Lazarus_Android.zip

Lazarus source code for Linux, Windows, macOS:
http://cognaxon.com/downloads/LithuanianCharSetConverter/SourceCode/LithuanianCharSetConverter_Lazarus.tar.gz

Fred vS

  • Hero Member
  • *****
  • Posts: 3158
    • StrumPract is the musicians best friend
Re: Character Conversions
« Reply #47 on: March 19, 2022, 01:05:07 pm »
Hello.

Note that character rendering is font-dependent.

For example, to list all the fonts that support Chinese ideograms:

In Linux (via terminal):
Code: Shell
  /usr/bin/fc-list :lang=zh --format="%{family[0]}\n" | sort | uniq

In Windows (abridged, using EnumFontFamiliesEx from windows.pp):
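Code: Pascal
  lf.lfCharSet := 136; // CHINESEBIG5_CHARSET (Traditional Chinese); GB2312_CHARSET = 134 is Simplified
  ...
  // EnumFontsNoDups is the callback that receives each matching font; L is passed through to it
  EnumFontFamiliesEx(DC, @lf, @EnumFontsNoDups, PtrInt(L), 0);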
I use Lazarus 2.2.0 32/64 and FPC 3.2.2 32/64 on Debian 11 64 bit, Windows 10, Windows 7 32/64, Windows XP 32,  FreeBSD 64.
Widgetset: fpGUI, MSEgui, Win32, GTK2, Qt.

https://github.com/fredvs
https://gitlab.com/fredvs
https://codeberg.org/fredvs

 
