
Author Topic: [SOLVED] What is the meaning of TFont.CharSet? How to use it?  (Read 2417 times)

Hartmut

  • Hero Member
  • *****
  • Posts: 844
[SOLVED] What is the meaning of TFont.CharSet? How to use it?
« on: November 28, 2023, 07:07:26 pm »
I'm new to fonts and to the type 'TFont', and I'd like to know what TFont.CharSet means. The Help only says: "The Character Set to be used by the current Font". But what is a "Character Set" as part of a font? You can assign any value in 0..255 to it.

Question 1: What happens, or should happen, if I assign an arbitrary value to TFont.CharSet? (I tested some values and saw no difference.)

Question 2: Are all 256 values always allowed? How do I find out which numbers are "allowed" for a specific font?

Some explanations are very welcome. Thanks in advance.
« Last Edit: December 02, 2023, 11:02:08 am by Hartmut »

Josh

  • Hero Member
  • *****
  • Posts: 1344
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #1 on: November 28, 2023, 07:18:15 pm »
Have you looked at the examples in Lazarus?
lazarus\examples\fontenum
The mainunit.pas, around line 200 where it populates the list, may give you some idea.

Code:
  Add(ANSI_CHARSET);
  Add(DEFAULT_CHARSET);
  Add(SYMBOL_CHARSET);
  Add(MAC_CHARSET);
  Add(SHIFTJIS_CHARSET);
  Add(HANGEUL_CHARSET);
  Add(JOHAB_CHARSET);
  Add(GB2312_CHARSET);
  Add(CHINESEBIG5_CHARSET);
  Add(GREEK_CHARSET);
  Add(TURKISH_CHARSET);
  Add(VIETNAMESE_CHARSET);
  Add(HEBREW_CHARSET);
  Add(ARABIC_CHARSET);
  Add(BALTIC_CHARSET);
  Add(RUSSIAN_CHARSET);
  Add(THAI_CHARSET);
  Add(EASTEUROPE_CHARSET);
  Add(OEM_CHARSET);
  Add(FCS_ISO_10646_1);
  Add(FCS_ISO_8859_1);
  Add(FCS_ISO_8859_2);
  Add(FCS_ISO_8859_3);
  Add(FCS_ISO_8859_4);
  Add(FCS_ISO_8859_5);
  Add(FCS_ISO_8859_6);
  Add(FCS_ISO_8859_7);
  Add(FCS_ISO_8859_8);
  Add(FCS_ISO_8859_9);
  Add(FCS_ISO_8859_10);
  Add(FCS_ISO_8859_15);

There are no doubt others.

Hartmut

  • Hero Member
  • *****
  • Posts: 844
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #2 on: November 29, 2023, 11:17:32 am »
Thanks, Josh, for your reply. I had already seen this list in the Lazarus example "fontenum". I played with this example, which has a ComboBox and a ListBox with CharSets, but I see no difference: when I select something like "CHINESEBIG5_CHARSET" or "HEBREW_CHARSET" instead of "DEFAULT_CHARSET" or "ANSI_CHARSET" and press the "apply filter" button, the displayed sample text does not change!

So my questions are still open:
 - What happens, or should happen, if I assign an arbitrary value to TFont.CharSet? (I tested some values and saw no difference.)
 - Are all 256 values always allowed? How do I find out which numbers actually work (make a difference) for a specific font?
 - What is the "philosophy" behind TFont.CharSet?

kwyan

  • New Member
  • *
  • Posts: 25
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #3 on: November 29, 2023, 05:00:25 pm »
A font file may support multiple character sets, and this determines which languages it supports. You can use font tools to view which character sets a font supports.

Hartmut

  • Hero Member
  • *****
  • Posts: 844
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #4 on: November 29, 2023, 05:35:18 pm »
Thank you kwyan for your post.

Quote
A font file may support multiple character sets and hence determines which languages are supported.

Does this mean that each font file contains the ASCII codes $20..$7F and may optionally contain the UTF-8 characters for one or more specific languages (e.g. Japanese or Arabic), and that these additional languages are called charsets?

Is it possible with FPC/Lazarus/LCL to list those "additional charsets" of a certain font, when only the name of the font is given? How?

Quote
You may use font tools to view which character sets are supported.

Is "font tools" the name of a certain program? I used google to find it but failed...

Can you/someone recommend a certain program, which can list those "additional charsets" of a certain font?

Sorry for so many questions... Thanks for any help.

kwyan

  • New Member
  • *
  • Posts: 25
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #5 on: November 30, 2023, 07:04:14 am »
A font file contains many glyphs (a glyph is a single representation of how to draw a character). You can think of a charset as the mapping from character codes (such as ASCII codes) to glyphs.

I don't know whether anyone has written a Pascal program to read the charsets of a font.

I use FontForge (An Open Source Font Editor) to view or edit the font file.

Hartmut

  • Hero Member
  • *****
  • Posts: 844
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #6 on: November 30, 2023, 05:28:28 pm »
Thanks, kwyan, for that info. I installed FontForge and found the list of CharSets in the menu Element / Font Info / OS/2 / Charsets. But this leads to the next question:

When a font contains multiple CharSets (e.g. Chinese and Arabic), these different CharSets will have *different* UTF-8 codes, won't they? If they are different, I don't understand why I can/should/must assign a (different) number to 'TFont.CharSet'. When I want a Chinese character, I must use its UTF-8 code. When I want an Arabic character, I must use another UTF-8 code. Is this not enough? What is the purpose of assigning a (different) number to 'TFont.CharSet'?

kwyan

  • New Member
  • *
  • Posts: 25
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #7 on: November 30, 2023, 06:03:49 pm »
Let's say a font has a glyph for the character Ä:

Unicode: U+00C4 maps to Ä (UTF-8 bytes $C3 $84)
CP850: $8E maps to Ä
CP1252: $C4 maps to Ä

Chinese has BIG5 and GB as well as UTF-8. If I give you $A741 without telling you the encoding, you don't know which Chinese character it is. $A741 in BIG5 is 你, but $A741 in GB would be another character (if it exists).
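The mappings above can be checked mechanically. Here is a minimal sketch in Python, chosen only because its standard codecs make the point compactly; it is not LCL code, and the helper name `decode_bytes` is made up for illustration. It decodes the byte values mentioned above under several code pages and shows that a byte sequence only becomes a character once an encoding has been chosen.

```python
# Decode raw bytes under different code pages using Python's standard codecs.
# `decode_bytes` is a hypothetical helper name, not part of any library.

def decode_bytes(raw: bytes, encoding: str) -> str:
    """Interpret `raw` according to `encoding`; invalid bytes become U+FFFD."""
    return raw.decode(encoding, errors="replace")

# Different byte values, same character (Ä, U+00C4):
a_cp850 = decode_bytes(b"\x8e", "cp850")
a_cp1252 = decode_bytes(b"\xc4", "cp1252")
a_utf8 = decode_bytes(b"\xc3\x84", "utf-8")

# Same byte values, different characters depending on the encoding:
big5_char = decode_bytes(b"\xa7\x41", "big5")     # 你 (U+4F60)
cp1252_pair = decode_bytes(b"\xa7\x41", "cp1252")  # "§A", two characters
z_caron = decode_bytes(b"\x8e", "cp1252")          # Ž (U+017D), not Ä

if __name__ == "__main__":
    # Print code points rather than glyphs so the output is plain ASCII.
    for label, text in [("cp850 8E", a_cp850), ("cp1252 C4", a_cp1252),
                        ("utf-8 C3 84", a_utf8), ("big5 A7 41", big5_char),
                        ("cp1252 A7 41", cp1252_pair), ("cp1252 8E", z_caron)]:
        print(label, "->", " ".join("U+%04X" % ord(ch) for ch in text))
```

The byte $8E is a good illustration: it is Ä in CP850 but Ž in CP1252, so a legacy (non-Unicode) text is only readable if the code page travels with it.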

TRon

  • Hero Member
  • *****
  • Posts: 3626
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #8 on: November 30, 2023, 06:06:17 pm »

MarkMLl

  • Hero Member
  • *****
  • Posts: 8032
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #9 on: November 30, 2023, 06:32:25 pm »
I'd retroactively add another vote for FontForge. It's the fundamental editor for font files, in the same way that Inkscape is the fundamental (open-source) editor for vector illustrations.

I'm uncomfortable getting directly involved, but my suspicion is that Charset might mean slightly different things to different OSes or at least graphical environments.

In a unix (strictly, X11) context, a font might be named (e.g. by xlsfonts) something like

-misc-fixed-bold-r-normal--0-0-100-100-c-0-iso8859-1

sometimes described (e.g. by xfontsel) as having the fields

-fndry-fmly-wght-slant-sWdth-adstyl-pxSize-ptSz-resx-resy-spc-avgWdth-rgstry-encdng

X11 widget sets will generally try to "do the right thing" if given a font specification for which they don't have a file, or if a field contains * as a wildcard.

The last two of those fields are the registry and encoding, and I've seen those two together referred to as the character set. So the example file I gave assumed the iso8859 registry and the -1 encoding.
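As a rough illustration of the field layout described above, the sketch below splits the example font name into the 14 fields listed by xfontsel and joins the last two into a "character set" string. It is written in Python for brevity; `parse_xlfd` and `XLFD_FIELDS` are hypothetical names, and the sketch assumes a fully specified name with no wildcards and no hyphens inside a field.

```python
# Split an X logical font description (XLFD) into its 14 named fields.
# The field labels are the abbreviations quoted from xfontsel in the post;
# `parse_xlfd` is a hypothetical helper name.

XLFD_FIELDS = [
    "fndry", "fmly", "wght", "slant", "sWdth", "adstyl", "pxSize",
    "ptSz", "resx", "resy", "spc", "avgWdth", "rgstry", "encdng",
]

def parse_xlfd(name: str) -> dict:
    """Return a field-name -> value mapping for a full XLFD string."""
    parts = name.split("-")
    # A full XLFD starts with "-", so the first split element is empty.
    if len(parts) != len(XLFD_FIELDS) + 1 or parts[0] != "":
        raise ValueError("not a full 14-field XLFD name: %r" % name)
    return dict(zip(XLFD_FIELDS, parts[1:]))

fields = parse_xlfd("-misc-fixed-bold-r-normal--0-0-100-100-c-0-iso8859-1")

# Registry and encoding together are what is loosely called the character set.
charset = fields["rgstry"] + "-" + fields["encdng"]

if __name__ == "__main__":
    print(charset)
```

For the example name this yields the registry `iso8859` and the encoding `1`; the empty `adstyl` field comes from the doubled hyphen in the name.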

The implication there is that one of the registries is UTF, and one of its encodings is -8. But that would appear to imply that the number of distinct codepoints in a character set is a function of the registry, and what glyph is placed in each codepoint is a function of the encoding.

However as I've said, that could be OS-specific and the LCL's behaviour could be influenced by both the OS and by Delphi's historic behaviour running on Windows.

In the case of X11 then https://en.wikipedia.org/wiki/X_logical_font_description might help, plus the Flowers reference cited. However I can't find anything useful about assigning a numeric value to the character set: that might be a "Windows-ism" and might imply that the field is ignored on other OSes.

Ultimately, the "easiest" way of getting something definitive might be to trace into the LCL with a debugger and see what different character sets do when they hit the underlying API.

MarkMLl

Hartmut

  • Hero Member
  • *****
  • Posts: 844
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #10 on: December 01, 2023, 02:38:06 pm »
Thanks to all for the new information. Now I see some things more clearly.

Then I found this in the well-known Lazarus book by Michael van Canneyt, in chapter 9.1.4 "Fonts":

"CharSet: exists only for compatibility with Delphi and should not be used, because Lazarus has full Unicode support. With UTF-8 encoded strings there is no reason to select a particular code page." (translated from German)
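The reasoning in that quote can be illustrated with a small sketch (Python, used here only as a neutral notation; the three sample characters are my own choice, not from the book). In Unicode every character has its own code point, so a single UTF-8 string can mix Latin, Chinese and Arabic characters without any code page being named.

```python
# One Unicode string can mix scripts, because every character has its own
# code point; no per-font code page is needed to tell them apart.

text = "\u00c4\u4f60\u0634"  # Ä (Latin), 你 (Chinese), ش (Arabic)

# Code point of each character, e.g. "U+00C4".
code_points = ["U+%04X" % ord(ch) for ch in text]

# UTF-8 byte sequence of each character, as lowercase hex.
utf8_bytes = [ch.encode("utf-8").hex() for ch in text]

if __name__ == "__main__":
    for cp, raw in zip(code_points, utf8_bytes):
        print(cp, raw)
```

Each character maps to a distinct UTF-8 byte sequence (here 2, 3 and 2 bytes long), which is why no extra number is needed to say which script is meant.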

If nobody contradicts this, I will mark this topic as "solved".

kwyan

  • New Member
  • *
  • Posts: 25
Re: What is the meaning of TFont.CharSet? How to use it?
« Reply #11 on: December 01, 2023, 04:45:14 pm »
I agree that we should use UTF-8 if possible. Just for information, many thermal receipt printers still don't support UTF-8. I just bought one which only supports BIG5 for Chinese.
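For such a device the application text has to be converted from Unicode to the printer's code page before the bytes are sent. A minimal sketch of that conversion step, in Python and with a made-up helper name `to_big5`, might look like this. How the bytes reach the printer (serial port, USB, network) depends on the device and is left out, and substituting '?' for unsupported characters is just one possible policy.

```python
# Convert Unicode application text to Big5 bytes for a device that only
# understands Big5. `to_big5` is a hypothetical helper; sending the bytes
# to the printer is outside this sketch.

def to_big5(text: str) -> bytes:
    """Encode `text` as Big5, substituting '?' for unsupported characters."""
    return text.encode("big5", errors="replace")

payload = to_big5("\u4f60\u597d")   # 你好, two bytes per character in Big5
fallback = to_big5("\u0634")        # Arabic ش has no Big5 code, becomes "?"

if __name__ == "__main__":
    print(payload.hex(), fallback)
```

Decoding `payload` with the same code page gives the original text back, which is a cheap sanity check before sending anything to real hardware.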

You may set this Topic to "solved".
