
Author Topic: About fpttf and fpparsettf  (Read 13451 times)

MNeto

  • New Member
  • *
  • Posts: 18
About fpttf and fpparsettf
« on: July 12, 2016, 11:26:15 pm »
Hello.

In this topic:

http://forum.lazarus.freepascal.org/index.php/topic,33141.0.html

I learned about the units fpttf and fpparsettf (fcl-pdf), created by Graeme.

I have two questions:

1. Is it possible to get the Unicode character that corresponds to a given GlyphID?
2. (This question is not directly related to fpparsettf, but to Free Pascal.) Some fonts have characters outside the Basic Multilingual Plane, such as #$1D435. fpparsettf can handle them normally, since it accesses the font directly. But I cannot display these characters using Free Pascal (I think it only supports the Basic Multilingual Plane). Is there any way around this limitation?

Thanks!

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: About fpttf and fpparsettf
« Reply #1 on: July 13, 2016, 03:32:36 am »
Quote
1. Is it possible to get the Unicode character that corresponds to a given GlyphID?

In general, IIRC, more than one Unicode character can point to the same GlyphID.

The data you are looking for is in a table called CMAP. Its function is to find GlyphIDs from character codes in a font file, which is the opposite of what you are trying to do.

TTFFileInfo in fpparsettf parses CMAP and stores the result in CMapH and the dynamic array Chars.
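
Since the table only maps character codes to glyphs, a reverse lookup has to scan it. The following is a minimal sketch of that idea, not the fpparsettf API: `CharToGlyph` is a hypothetical array (index = character code, value = GlyphID) standing in for whatever the parser produced, and the sample data is made up.

```pascal
program ReverseCmapSketch;
{$mode objfpc}{$H+}

type
  TCodePointArray = array of LongWord;

{ Collect every character code whose entry in CharToGlyph equals GlyphID.
  CharToGlyph is a hypothetical stand-in for parsed cmap data:
  index = character code, value = glyph ID. }
function CodePointsForGlyph(const CharToGlyph: array of Word;
  GlyphID: Word): TCodePointArray;
var
  Code: Integer;
begin
  Result := nil;
  for Code := Low(CharToGlyph) to High(CharToGlyph) do
    if CharToGlyph[Code] = GlyphID then
    begin
      SetLength(Result, Length(Result) + 1);
      Result[High(Result)] := LongWord(Code);
    end;
end;

var
  // Made-up data: character codes 1 and 3 share glyph 5.
  Map: array[0..3] of Word = (0, 5, 7, 5);
  Hits: TCodePointArray;
  i: Integer;
begin
  Hits := CodePointsForGlyph(Map, 5);
  for i := 0 to High(Hits) do
    WriteLn('U+', HexStr(LongInt(Hits[i]), 4));
end.
```

The result is an array rather than a single value because, as noted above, several character codes may share one glyph.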

Quote
2. (This question is not directly related to fpparsettf, but to Free Pascal.) Some fonts have characters outside the Basic Multilingual Plane, such as #$1D435. fpparsettf can handle them normally, since it accesses the font directly.

It could, but it does not. The parser in the fpparsettf unit, last time I checked, supports only the format 4 subtable for platform 3, encoding 1, that is, Windows fonts with Unicode BMP encoding. To support non-BMP characters it needs at least format 12.
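
For orientation, a format 12 subtable stores sorted groups of (startCharCode, endCharCode, startGlyphID) with 32-bit character codes, which is what lets it reach beyond the BMP. The sketch below shows only the lookup over groups that have already been read. It is not code from fpparsettf; the type and function names are hypothetical and the sample groups are invented.

```pascal
program CmapFormat12Sketch;
{$mode objfpc}{$H+}

type
  { One sequential map group of a format 12 cmap subtable. }
  TMapGroup = record
    StartCharCode: LongWord;
    EndCharCode: LongWord;
    StartGlyphID: LongWord;
  end;

{ Return the glyph ID for CodePoint, or 0 (the missing glyph) when no
  group covers it. Groups must be sorted by StartCharCode. }
function GlyphFromGroups(const Groups: array of TMapGroup;
  CodePoint: LongWord): LongWord;
var
  First, Last, Middle: Integer;
begin
  Result := 0;
  First := 0;
  Last := High(Groups);
  while First <= Last do
  begin
    Middle := (First + Last) div 2;
    if CodePoint < Groups[Middle].StartCharCode then
      Last := Middle - 1
    else if CodePoint > Groups[Middle].EndCharCode then
      First := Middle + 1
    else
    begin
      // Glyph IDs run consecutively inside a group.
      Result := Groups[Middle].StartGlyphID +
                (CodePoint - Groups[Middle].StartCharCode);
      Exit;
    end;
  end;
end;

var
  Groups: array[0..1] of TMapGroup;
begin
  // Invented groups for illustration, not taken from a real font.
  Groups[0].StartCharCode := $0041;
  Groups[0].EndCharCode := $005A;
  Groups[0].StartGlyphID := 10;
  Groups[1].StartCharCode := $1D434;
  Groups[1].EndCharCode := $1D467;
  Groups[1].StartGlyphID := 500;
  WriteLn(GlyphFromGroups(Groups, $1D435));  // prints 501
end.
```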

Quote
But I cannot display these characters using Free Pascal (I think it only supports the Basic Multilingual Plane). Is there any way around this limitation?

If the font you are using does not have that character, FPC cannot change that fact. I don't seem to have a font that includes MATHEMATICAL ITALIC CAPITAL B (#$1D435)

Show your code.

MNeto

  • New Member
  • *
  • Posts: 18
Re: About fpttf and fpparsettf
« Reply #2 on: July 13, 2016, 07:38:08 pm »
Thanks for answering.

Quote
In general, IIRC, more than one unicode character could point to one GlyphID.

Curious. I assumed every GlyphID corresponds to a single Unicode value ...

Quote
If the font you are using does not have that character, FPC can not change that fact. I don't seem to have a font that includes MATHEMATICAL ITALIC CAPITAL B (#$1D435)

In fact, some specialized OpenType fonts already have these characters, for example the XITSMath font. The range is 'Mathematical Alphanumeric Symbols'.

When I try to assign the escape sequence #$1D435, or its decimal equivalent #119861, I get the error 'illegal char constant', because the value is outside the allowed range.

Characters of this kind would fall within the scope of UTF-16...
The most I could do was find routines that convert some UTF-16 values to UTF-8, but for this range of characters that will not work, because characters like #$1D435 do not have a counterpart in UTF-8.

MNeto

  • New Member
  • *
  • Posts: 18
Re: About fpttf and fpparsettf
« Reply #3 on: July 13, 2016, 08:14:16 pm »
I solved the assignment problem with the following code, but the character is displayed incorrectly.

Code: Pascal
var
  c : WideString;

...

{c := #$1D435; // doesn't work}
c := WideChar($1D4A5); //OK
Label1.Caption := c;

The Label control uses the XITS Math font. The code compiles, but the displayed character is incorrect.
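
A likely explanation, assuming the cast keeps only the low 16 bits: a WideChar is a single 16-bit code unit, so a code point above $FFFF cannot fit in one. In UTF-16 such a code point is stored as two code units, a surrogate pair. The sketch below shows the arithmetic; the procedure and variable names are only illustrative.

```pascal
program SurrogateSketch;
{$mode objfpc}{$H+}

{ Split a code point above U+FFFF into its UTF-16 surrogate pair. }
procedure ToSurrogates(CodePoint: LongWord; out HighSur, LowSur: Word);
var
  V: LongWord;
begin
  V := CodePoint - $10000;        // 20 significant bits remain
  HighSur := $D800 + (V shr 10);  // upper 10 bits
  LowSur := $DC00 + (V and $3FF); // lower 10 bits
end;

var
  HighSur, LowSur: Word;
  W: WideString;
begin
  // Keeping only the low 16 bits gives an unrelated code unit.
  WriteLn(HexStr($1D435 and $FFFF, 4));               // prints D435
  ToSurrogates($1D435, HighSur, LowSur);
  WriteLn(HexStr(HighSur, 4), ' ', HexStr(LowSur, 4)); // prints D835 DC35
  // A WideString needs two elements to hold this one character.
  SetLength(W, 2);
  W[1] := WideChar(HighSur);
  W[2] := WideChar(LowSur);
  WriteLn(Length(W));                                 // prints 2
end.
```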

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: About fpttf and fpparsettf
« Reply #4 on: July 13, 2016, 10:17:51 pm »
Make sure the label is using the XITSMath font:
Code: Pascal
uses
  LazUTF8;
...
var
  s: string;
...
  s := UnicodeToUTF8($1D435);
  Label1.Caption := s;

Lazarus uses UTF-8 encoding.

MNeto

  • New Member
  • *
  • Posts: 18
Re: About fpttf and fpparsettf
« Reply #5 on: July 13, 2016, 10:37:50 pm »
Thanks!

Because the character is UTF-16, I modified your code and it worked (I'm using Qt):

Code: Pascal
uses
  LazUTF16;
...
var
  s: string;
...
  s := UnicodeToUTF16($1D435);
  Label1.Caption := s;
« Last Edit: July 13, 2016, 10:41:37 pm by MNeto »

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4459
  • I like bugs.
Re: About fpttf and fpparsettf
« Reply #6 on: July 13, 2016, 11:22:31 pm »
MNeto, "String" is UTF-8 in Lazarus by default. This:
Code: Pascal
s := UnicodeToUTF16($1D435);
does not make much sense. It only triggers extra automatic conversions.
See:
http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus

Quote
...because characters like #$1D435 do not have a counterpart in UTF-8.

Not true. Data can be converted losslessly between any of the Unicode encodings.
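
To make the lossless point concrete, here is a small sketch that decodes the same character from its UTF-8 bytes and from its UTF-16 surrogate pair; both yield the same code point. The function names are illustrative and the decoders do no validation, so treat it as arithmetic only, not as a replacement for the LazUTF8 routines.

```pascal
program LosslessSketch;
{$mode objfpc}{$H+}

{ Decode one 4-byte UTF-8 sequence into a code point (no validation). }
function FromUTF8(B1, B2, B3, B4: Byte): LongWord;
begin
  Result := (LongWord(B1 and $07) shl 18) or
            (LongWord(B2 and $3F) shl 12) or
            (LongWord(B3 and $3F) shl 6) or
             LongWord(B4 and $3F);
end;

{ Decode one UTF-16 surrogate pair into a code point (no validation). }
function FromUTF16(HighSur, LowSur: Word): LongWord;
begin
  Result := $10000 + (LongWord(HighSur - $D800) shl 10) +
            LongWord(LowSur - $DC00);
end;

begin
  // Both lines print 1D435.
  WriteLn(HexStr(LongInt(FromUTF8($F0, $9D, $90, $B5)), 5));
  WriteLn(HexStr(LongInt(FromUTF16($D835, $DC35)), 5));
end.
```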
Mostly Lazarus trunk and FPC 3.2 on Manjaro Linux 64-bit.

MNeto

  • New Member
  • *
  • Posts: 18
Re: About fpttf and fpparsettf
« Reply #7 on: July 14, 2016, 12:23:25 am »
Really...
It works with UTF-8 ...

Now I understand. I thought, incorrectly, that Unicode SMP characters could only be represented in UTF-16.

Why do SMP Unicode characters behave differently from BMP Unicode characters in Lazarus?

Thank you.

Graeme

  • Hero Member
  • *****
  • Posts: 1428
    • Graeme on the web
Re: About fpttf and fpparsettf
« Reply #8 on: July 14, 2016, 12:29:39 am »
Quote
...but for this range of characters that will not work, because characters like #$1D435 do not have a counterpart in UTF-8.

Uh? All Unicode codepoints can be represented in both UTF-16 and UTF-8.

U+1D435 MATHEMATICAL ITALIC CAPITAL B
UTF-8: 0xF0 0x9D 0x90 0xB5
UTF-16: 0xD835 0xDC35
XML decimal entity: &#119861;
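
The UTF-8 bytes listed above can be reproduced by hand. The following is a minimal sketch of the standard UTF-8 bit layout with hypothetical names; it does not reject surrogates or values above U+10FFFF.

```pascal
program UTF8EncodeSketch;
{$mode objfpc}{$H+}

type
  TUtf8Bytes = array of Byte;

{ Encode one code point as 1 to 4 UTF-8 bytes (no validation). }
function EncodeUTF8(CodePoint: LongWord): TUtf8Bytes;
begin
  Result := nil;
  if CodePoint < $80 then
  begin
    SetLength(Result, 1);
    Result[0] := CodePoint;
  end
  else if CodePoint < $800 then
  begin
    SetLength(Result, 2);
    Result[0] := $C0 or (CodePoint shr 6);
    Result[1] := $80 or (CodePoint and $3F);
  end
  else if CodePoint < $10000 then
  begin
    SetLength(Result, 3);
    Result[0] := $E0 or (CodePoint shr 12);
    Result[1] := $80 or ((CodePoint shr 6) and $3F);
    Result[2] := $80 or (CodePoint and $3F);
  end
  else
  begin
    // Characters outside the BMP take four bytes.
    SetLength(Result, 4);
    Result[0] := $F0 or (CodePoint shr 18);
    Result[1] := $80 or ((CodePoint shr 12) and $3F);
    Result[2] := $80 or ((CodePoint shr 6) and $3F);
    Result[3] := $80 or (CodePoint and $3F);
  end;
end;

var
  Bytes: TUtf8Bytes;
  i: Integer;
begin
  Bytes := EncodeUTF8($1D435);
  for i := 0 to High(Bytes) do
    Write(HexStr(Bytes[i], 2), ' ');  // prints F0 9D 90 B5
  WriteLn;
end.
```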
--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

MNeto

  • New Member
  • *
  • Posts: 18
Re: About fpttf and fpparsettf
« Reply #9 on: July 14, 2016, 01:57:17 am »
Quote
Uh? All Unicode codepoints can be represented in both UTF-16 and UTF-8.

Understood.

I had a wrong idea of the Unicode standard. Sorry, I am not an experienced programmer.

I thought the Unicode value of a character was just a hexadecimal number. I did not know about its internal representation in the different encodings.

MNeto

  • New Member
  • *
  • Posts: 18
Re: About fpttf and fpparsettf
« Reply #10 on: July 14, 2016, 03:19:03 am »
One more question:

If I modify or customize a Free Pascal unit, am I violating the Free Pascal or Lazarus license?

What is the correct way to proceed in this case?

I am intensively studying the fpparsettf unit and the OpenType font specification (and the Unicode standard). I would like to add support for some tables, format 12 of the CMAP table, etc.

Thank you.

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4459
  • I like bugs.
Re: About fpttf and fpparsettf
« Reply #11 on: July 14, 2016, 10:41:13 am »
Quote
If I modify or customize a Free Pascal unit, am I violating the Free Pascal or Lazarus license?

No. It is (L)GPL. The only limitations are that your derived work must also be (L)GPL, and you must publish the source code if you distribute your derived work.

Quote
What is the correct way to proceed in this case?

Just copy and modify all (L)GPL code you can find without shame.
The whole idea of this license is to allow it.
« Last Edit: July 14, 2016, 11:31:24 am by JuhaManninen »

 
