Lazarus

Programming => Packages and Libraries => LazUtils => Topic started by: MNeto on July 12, 2016, 11:26:15 pm

Title: About fpttf and fpparsettf
Post by: MNeto on July 12, 2016, 11:26:15 pm
Hello.

In this topic:

http://forum.lazarus.freepascal.org/index.php/topic,33141.0.html

I learned about the units fppdf and fpparsettf (fcl-pdf), created by Graeme.

I have two questions:

1. Is it possible to get the Unicode character that corresponds to a given GlyphID?
2. (This question is not directly related to fpparsettf but to Free Pascal.) Some fonts have characters that are outside the Basic Multilingual Plane, such as #$1D435. fpparsettf can handle them normally, since it accesses the font directly. But I cannot display these characters using Free Pascal (I think it only supports the Basic Multilingual Plane). Is there any way around this limitation?

Thanks!
Title: Re: About fpttf and fpparsettf
Post by: engkin on July 13, 2016, 03:32:36 am
Quote
1. Is it possible to get the Unicode character that corresponds to a given GlyphID?

In general, IIRC, more than one Unicode character can map to the same GlyphID.

The data you are looking for is in a table called CMAP. Its function is to find GlyphIDs from character codes in a font file, which is the opposite of what you are trying to do.

TTFFileInfo in fpparsettf parses CMAP and stores the result in CMapH and the dynamic array Chars.
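
A minimal sketch of the reverse lookup (GlyphID to character codes), assuming the character-to-glyph mapping is already available as two parallel arrays. SampleCodes, SampleGlyphs and CodePointsForGlyph are hypothetical names, not part of fpparsettf, and the sample values are made up rather than read from a font. Because several code points can share one glyph, the function returns a list:

```pascal
program ReverseCmap;
{$mode objfpc}{$H+}

type
  TCodePointArray = array of Cardinal;

const
  // Hypothetical character-code -> GlyphID pairs, NOT taken from a real font.
  // U+0020 SPACE and U+00A0 NO-BREAK SPACE deliberately share glyph 3.
  SampleCodes:  array[0..3] of Cardinal = ($0020, $0041, $00A0, $0042);
  SampleGlyphs: array[0..3] of Word     = (3, 36, 3, 37);

// Collect every code point whose entry maps to the given GlyphID.
function CodePointsForGlyph(GlyphID: Word): TCodePointArray;
var
  i, n: Integer;
begin
  Result := nil;
  SetLength(Result, Length(SampleCodes));
  n := 0;
  for i := Low(SampleCodes) to High(SampleCodes) do
    if SampleGlyphs[i] = GlyphID then
    begin
      Result[n] := SampleCodes[i];
      Inc(n);
    end;
  SetLength(Result, n); // trim to the number of matches
end;

var
  cp: Cardinal;
begin
  for cp in CodePointsForGlyph(3) do
    WriteLn('U+', HexStr(Int64(cp), 4));
end.
```

With the sample data above, glyph 3 yields two code points, U+0020 and U+00A0, which is why a GlyphID cannot in general be turned back into a single character.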

Quote
2. (This question is not directly related to fpparsettf but to Free Pascal.) Some fonts have characters that are outside the Basic Multilingual Plane, such as #$1D435. fpparsettf can handle them normally, since it accesses the font directly.

It can, but it does not. The parser in the fpparsettf unit, last time I checked, supports format 4 with platform 3 and encoding 1, that is, Windows fonts with Unicode BMP encoding. To support non-BMP characters it needs at least format 12.

Quote
But I cannot display these characters using Free Pascal (I think it only supports the Basic Multilingual Plane). Is there any way around this limitation?
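
For context, a format 12 CMAP subtable stores sorted groups of (startCharCode, endCharCode, startGlyphID) with 32-bit character codes, and a glyph is found as startGlyphID + (code - startCharCode). The sketch below only illustrates that lookup rule. The group values are invented, and TMapGroup and GlyphForCodePoint are hypothetical names that do not exist in fpparsettf:

```pascal
program Cmap12Lookup;
{$mode objfpc}{$H+}

type
  // One sequential map group of a format 12 subtable.
  TMapGroup = record
    StartCharCode: Cardinal;
    EndCharCode: Cardinal;
    StartGlyphID: Cardinal;
  end;

const
  // Hypothetical groups for illustration; real values come from the font file.
  Groups: array[0..1] of TMapGroup = (
    (StartCharCode: $0041;  EndCharCode: $005A;  StartGlyphID: 36),
    (StartCharCode: $1D434; EndCharCode: $1D44D; StartGlyphID: 900)
  );

// Return the GlyphID for a code point, or 0 (the missing glyph) if unmapped.
function GlyphForCodePoint(CodePoint: Cardinal): Cardinal;
var
  iLow, iHigh, iMid: Integer;
begin
  Result := 0;
  iLow := Low(Groups);
  iHigh := High(Groups);
  // Groups are sorted by StartCharCode, so a binary search is enough.
  while iLow <= iHigh do
  begin
    iMid := (iLow + iHigh) div 2;
    if CodePoint < Groups[iMid].StartCharCode then
      iHigh := iMid - 1
    else if CodePoint > Groups[iMid].EndCharCode then
      iLow := iMid + 1
    else
      Exit(Groups[iMid].StartGlyphID + (CodePoint - Groups[iMid].StartCharCode));
  end;
end;

begin
  WriteLn(GlyphForCodePoint($1D435)); // inside the second group
  WriteLn(GlyphForCodePoint($0021));  // not covered by any group
end.
```

With these invented groups the first call gives 901 and the second gives 0. Format 4 cannot express the second group at all, because its character codes are limited to 16 bits.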

If the font you are using does not have that character, FPC can not change that fact. I don't seem to have a font that includes MATHEMATICAL ITALIC CAPITAL B (#$1D435)

Show your code.
Title: Re: About fpttf and fpparsettf
Post by: MNeto on July 13, 2016, 07:38:08 pm
Thanks for answering.

Quote
In general, IIRC, more than one unicode character could point to one GlyphID.

Curious. I assumed every GlyphID corresponded to a single Unicode value...

Quote
If the font you are using does not have that character, FPC can not change that fact. I don't seem to have a font that includes MATHEMATICAL ITALIC CAPITAL B (#$1D435)

In fact, some specialized OpenType fonts do have these characters, for example the XITS Math font. The range is 'Mathematical Alphanumeric Symbols'.

When I try to assign the escape sequence #$1D435, or its decimal equivalent #119861, I get the error 'illegal char constant', because the value is outside the allowed range.

Characters of this type would already be in the scope of UTF16...
The most I could do was find routines that convert some UTF16 values to UTF8, but that will not work for this range, because characters like #$1D435 do not have a counterpart in UTF8.
Title: Re: About fpttf and fpparsettf
Post by: MNeto on July 13, 2016, 08:14:16 pm
I solved the assignment problem using the following code, but the character is displayed incorrectly.

Code: Pascal
  var
    c : WideString;

  ...

  {c := #$1D435; // don't work}
  c := WideChar($1D4A5); //OK
  Label1.Caption := c;

The Label control uses the XITS Math font. The code compiles, but the displayed character is incorrect.
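
One likely reason for the wrong character: WideChar is a single 16-bit code unit, so it cannot hold a code point above $FFFF. In UTF-16 such code points are stored as a surrogate pair. The following is a minimal sketch of that standard calculation for U+1D435, using only the RTL. The variable names are illustrative, and it assumes the value is already known to be above $FFFF:

```pascal
program SurrogatePair;
{$mode objfpc}{$H+}

const
  CodePoint = $1D435; // MATHEMATICAL ITALIC CAPITAL B

var
  v: Cardinal;
  HighUnit, LowUnit: Word;
  ws: UnicodeString;
begin
  // Standard UTF-16 rule for code points in U+10000..U+10FFFF.
  v := CodePoint - $10000;
  HighUnit := Word($D800 + (v shr 10));   // top 10 bits
  LowUnit  := Word($DC00 + (v and $3FF)); // bottom 10 bits

  // The string needs two code units for this one character.
  SetLength(ws, 2);
  ws[1] := WideChar(HighUnit);
  ws[2] := WideChar(LowUnit);

  WriteLn(HexStr(HighUnit, 4), ' ', HexStr(LowUnit, 4));
  WriteLn(Length(ws));
end.
```

This prints D835 DC35 and a length of 2: one character, two UTF-16 code units.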
Title: Re: About fpttf and fpparsettf
Post by: engkin on July 13, 2016, 10:17:51 pm
Make sure the label is using XITSMath font:
Code: Pascal
  uses
    LazUTF8;
  ...
  var
    s: string;
  ...
    s := UnicodeToUTF8($1D435);
    Label1.Caption := s;

Lazarus uses UTF8 encoding.
Title: Re: About fpttf and fpparsettf
Post by: MNeto on July 13, 2016, 10:37:50 pm
Thanks!

Because the character is UTF16, I modified your code and it worked (I'm using Qt):

Code: Pascal
  uses
    LazUTF16;
  ...
  var
    s: string;
  ...
    s := UnicodeToUTF16($1D435);
    Label1.Caption := s;
Title: Re: About fpttf and fpparsettf
Post by: JuhaManninen on July 13, 2016, 11:22:31 pm
MNeto, "String" is UTF-8 in Lazarus by default. This:
Code: Pascal
  s := UnicodeToUTF16($1D435);
does not make much sense. It only triggers extra automatic conversions.
See:
 http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus

Quote
...because characters like #$1D435 do not have a counterpart in UTF8.

Not true. Data between any Unicode encodings can be converted losslessly.
Title: Re: About fpttf and fpparsettf
Post by: MNeto on July 14, 2016, 12:23:25 am
Really...
It works with UTF8 ...

Now I understand. I thought, incorrectly, that Unicode SMP characters could only be represented in UTF16.

Why do SMP Unicode characters behave differently from BMP Unicode characters in Lazarus?

Thank you.
Title: Re: About fpttf and fpparsettf
Post by: Graeme on July 14, 2016, 12:29:39 am
Quote
...but that will not work for this range, because characters like #$1D435 do not have a counterpart in UTF8.

Uh? All Unicode codepoints can be represented in both UTF-16 and UTF-8.

U+1D435 MATHEMATICAL ITALIC CAPITAL B
UTF-8: 0xF0 0x9D 0x90 0xB5
UTF-16: 0xD835 0xDC35
XML decimal entity: &#119861;
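
The UTF-8 bytes listed above can be checked by hand. This is a minimal sketch of the standard four-byte UTF-8 layout (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx) applied to U+1D435. It uses only the RTL and assumes the code point lies in U+10000..U+10FFFF, so the shorter UTF-8 forms are not handled:

```pascal
program Utf8Bytes;
{$mode objfpc}{$H+}

const
  CodePoint = $1D435; // MATHEMATICAL ITALIC CAPITAL B

var
  b: array[0..3] of Byte;
  i: Integer;
begin
  // Split the 21-bit value into 3 + 6 + 6 + 6 bits and add the marker bits.
  b[0] := $F0 or (CodePoint shr 18);
  b[1] := $80 or ((CodePoint shr 12) and $3F);
  b[2] := $80 or ((CodePoint shr 6) and $3F);
  b[3] := $80 or (CodePoint and $3F);

  for i := 0 to 3 do
    Write('0x', HexStr(b[i], 2), ' ');
  WriteLn;
end.
```

The output is 0xF0 0x9D 0x90 0xB5, matching the UTF-8 line above.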
Title: Re: About fpttf and fpparsettf
Post by: MNeto on July 14, 2016, 01:57:17 am
Quote
Uh? All Unicode codepoints can be represented in both UTF-16 and UTF-8.

Understood.

I had a wrong idea of the Unicode standard. Sorry, I am not an experienced programmer.

I thought the Unicode value of a character was just a hexadecimal number. I did not know how it is represented internally in each encoding.
Title: Re: About fpttf and fpparsettf
Post by: MNeto on July 14, 2016, 03:19:03 am
One more question:

If I modify or customize a Free Pascal unit, am I violating the Free Pascal or Lazarus license?

What is the correct way to proceed in this case?

I am intensely studying the fpparsettf unit and the OpenType font specification (and the Unicode standard). I would like to add support for some tables, format 12 of the CMAP table, etc.

Thank you.
Title: Re: About fpttf and fpparsettf
Post by: JuhaManninen on July 14, 2016, 10:41:13 am
Quote
If I modify or customize a Free Pascal unit, am I violating the Free Pascal or Lazarus license?

No. It is (L)GPL. The only limitations are that your derived work must also be (L)GPL, and you must publish the source code if you deliver your derived work.

Quote
What is the correct way to proceed in this case?

Just copy and modify all (L)GPL code you can find without shame.
The whole idea of this license is to allow it.