Lazarus

Free Pascal => General => Topic started by: zamtmn on February 02, 2023, 07:42:38 am

Title: Unicode resourcestring
Post by: zamtmn on February 02, 2023, 07:42:38 am
Would it be possible in the future to split resource strings into UTF-8 and UTF-16 variants? Of course, everything works now, but we have to fight with warnings:
Code: Pascal
program Project1;
resourcestring
  rstest='blbla-blbla';
var
  vtest:unicodestring;
begin
  vtest:=rstest; // line 7: this assignment triggers the warning below
end.
>>project1.lpr(7,10) Warning: Implicit string type conversion from "AnsiString" to "UnicodeString"
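With the current RTL, one way to quiet this warning is to spell the conversion out. The following is only a minimal sketch, assuming the resource string holds text that converts cleanly from its ansi code page (the sample string is plain ASCII). The program name is made up for illustration.

```pascal
program ExplicitCastSketch; // hypothetical name, not from the thread
{$mode objfpc}{$H+}

resourcestring
  rstest = 'blbla-blbla';

var
  vtest: UnicodeString;

begin
  // The AnsiString -> UnicodeString conversion still happens at run time,
  // but the explicit typecast marks it as intentional, so the
  // "Implicit string type conversion" warning should no longer appear.
  vtest := UnicodeString(rstest);
  WriteLn(Length(vtest)); // number of UTF-16 code units in vtest
end.
```

This does not change how the resource string is stored; it only moves the conversion from implicit to explicit at each use site.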
Title: Re: Unicode resourcestring
Post by: PascalDragon on February 02, 2023, 09:22:24 pm
We are currently working on a UnicodeString-based RTL. In that RTL (but not in the Ansi RTL) resource strings will be UnicodeString; otherwise they will remain AnsiString.
Title: Re: Unicode resourcestring
Post by: zamtmn on February 02, 2023, 09:29:42 pm
That is, for something like this:
Code: Pascal
program Project1;
resourcestring
  rstest='blbla-blbla';
  rstest1:unicodestring='blbla-blbla'; // wished-for typed resource string
  rstest2:ansistring='blbla-blbla';    // wished-for typed resource string
begin
end.
there is no hope :(
Title: Re: Unicode resourcestring
Post by: jcmontherock on February 02, 2023, 09:54:34 pm
@PascalDragon:

Which UnicodeString? UTF-8 or UTF-16?
Title: Re: Unicode resourcestring
Post by: PascalDragon on February 03, 2023, 03:55:37 pm
Quote from: zamtmn
That is, for something like this:
Code: Pascal
program Project1;
resourcestring
  rstest='blbla-blbla';
  rstest1:unicodestring='blbla-blbla';
  rstest2:ansistring='blbla-blbla';
begin
end.
there is no hope :(

Correct.

Quote from: jcmontherock
@PascalDragon:
Which UnicodeString? UTF-8 or UTF-16?

When I said UnicodeString I meant UnicodeString. With that information you should know which encoding is used.
Title: Re: Unicode resourcestring
Post by: jcmontherock on February 03, 2023, 05:36:02 pm
"Unicode string" generally means UTF-16. UTF-16 can be LE or BE.
Some people wrongly say "Unicode string" when they mean UTF-8. Please use the exact term...  ;D ;D
Title: Re: Unicode resourcestring
Post by: Thaddy on February 03, 2023, 05:45:49 pm
Quote from: jcmontherock
"Unicode string" generally means UTF-16. UTF-16 can be LE or BE.
Some people wrongly say "Unicode string" when they mean UTF-8. Please use the exact term...  ;D ;D

It is the other way around. MS uses 16-bit Unicode (UTF-16, derived from UCS-2) and the rest of the world uses UTF-8.
FPC/Lazarus is not MS-centric. Personally, I really dislike UTF-8, but I have to use it because I work on server platforms, and most of those are Unix-derived.
Title: Re: Unicode resourcestring
Post by: KodeZwerg on February 03, 2023, 06:33:23 pm
Quote from: jcmontherock
"Unicode string" generally means UTF-16. UTF-16 can be LE or BE.
Some people wrongly say "Unicode string" when they mean UTF-8. Please use the exact term...  ;D ;D

:o

Quote from: Thaddy
It is the other way around. MS uses 16-bit Unicode (UTF-16, derived from UCS-2) and the rest of the world uses UTF-8.
FPC/Lazarus is not MS-centric. Personally, I really dislike UTF-8, but I have to use it because I work on server platforms, and most of those are Unix-derived.

:o :o

...you both crush my world with your sentences... (https://en.wikipedia.org/wiki/Unicode)
UTF-8 (https://en.wikipedia.org/wiki/UTF-8) and UTF-16 (https://en.wikipedia.org/wiki/UTF-16) are both just encoding formats for Unicode that use different code unit widths.
Unicode is Unicode, nothing more, nothing less.
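To make the "same Unicode text, different encoding" point concrete, here is a small sketch that stores the same three characters once as UTF-16 and once as UTF-8 and prints their sizes. It uses only the FPC RTL (UTF8Encode); the program and variable names are made up for illustration.

```pascal
program EncodingWidthSketch; // hypothetical name, not from the thread
{$mode objfpc}{$H+}

var
  u16: UnicodeString; // UTF-16 code units (2 bytes each)
  u8: UTF8String;     // UTF-8 code units (1 byte each)

begin
  // Build 'AB' followed by U+20AC (euro sign) without relying on the
  // encoding of the source file.
  u16 := 'AB';
  u16 := u16 + WideChar($20AC);

  // UTF8Encode is an RTL function that re-encodes UTF-16 text as UTF-8.
  u8 := UTF8Encode(u16);

  WriteLn('UTF-16: ', Length(u16), ' code units, ',
          Length(u16) * SizeOf(WideChar), ' bytes');
  WriteLn('UTF-8:  ', Length(u8), ' code units (bytes)');
end.
```

The code points are identical in both variables; only the byte sequences differ (3 code units / 6 bytes in UTF-16 versus 5 bytes in UTF-8 for this sample), which is why a reader has to know which encoding was used.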
Title: Re: Unicode resourcestring
Post by: winni on February 03, 2023, 06:40:45 pm

Quote from: KodeZwerg
...you both crush my world with your sentences... (https://en.wikipedia.org/wiki/Unicode)
UTF-8 (https://en.wikipedia.org/wiki/UTF-8) and UTF-16 (https://en.wikipedia.org/wiki/UTF-16) are both just encoding formats for Unicode that use different code unit widths.
Unicode is Unicode, nothing more, nothing less.

Hi!

And why are Windows strings wrongly encoded when I read them on Linux?

Winni
Title: Re: Unicode resourcestring
Post by: KodeZwerg on February 03, 2023, 06:47:35 pm
Quote from: winni
Windows strings

As soon as I know the definition of "Windows strings" and how you decode them, I can answer that.
Title: Re: Unicode resourcestring
Post by: winni on February 03, 2023, 07:28:20 pm
Quote from: KodeZwerg
Windows strings
As soon as I know the definition of "Windows strings" and how you decode them, I can answer that.

UTF-16
Title: Re: Unicode resourcestring
Post by: KodeZwerg on February 03, 2023, 07:54:41 pm
The unit LazUTF8 (https://lazarus-ccr.sourceforge.io/docs/lazutils/lazutf8/index-5.html) contains some generic routines for dealing with UTF-16, such as ConvertUTF16ToUTF8 (https://lazarus-ccr.sourceforge.io/docs/lazutils/lazutf8/convertutf16toutf8.html), UTF16ToUTF8 (https://lazarus-ccr.sourceforge.io/docs/lazutils/lazutf8/utf16toutf8.html) and UnicodeToUTF8 (https://lazarus-ccr.sourceforge.io/docs/lazutils/lazutf8/unicodetoutf8.html), and plenty of other neat stuff to discover.

Does that help with your issue, Winni?
I am sorry, I only have Windows and cannot reproduce the problem you are having.
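As a rough illustration of the LazUTF8 routines mentioned above, the sketch below converts a UTF-16 string to UTF-8 and back. It assumes the project depends on the LazUtils package and that UTF16ToUTF8 and UTF8ToUTF16 take and return plain string values as described in the linked documentation; the program and variable names are made up.

```pascal
program Utf16ToUtf8Sketch; // hypothetical name, not from the thread
{$mode objfpc}{$H+}

uses
  LazUTF8; // part of LazUtils; the project must depend on that package

var
  wide: UnicodeString; // UTF-16 text, e.g. as received from Windows
  narrow: String;      // holds the UTF-8 bytes after conversion

begin
  wide := 'AB';
  wide := wide + WideChar($20AC); // append U+20AC (euro sign)

  narrow := UTF16ToUTF8(wide); // UTF-16 -> UTF-8
  WriteLn('UTF-8 bytes: ', Length(narrow));

  // Converting back should reproduce the original UTF-16 text.
  if UTF8ToUTF16(narrow) = wide then
    WriteLn('round trip ok')
  else
    WriteLn('round trip changed the text');
end.
```

For data read from a file, the bytes would first have to be loaded into a UnicodeString with the correct byte order; that step is not shown here.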
Title: Re: Unicode resourcestring
Post by: tetrastes on February 03, 2023, 08:52:10 pm
Quote from: jcmontherock
"Unicode string" generally means UTF-16. UTF-16 can be LE or BE.
Some people wrongly say "Unicode string" when they mean UTF-8. Please use the exact term...  ;D ;D

And UnicodeString (without a space) means https://www.freepascal.org/docs-html/current/ref/refsu10.html#x33-430003.2.5, since we are talking about FPC here.
Title: Re: Unicode resourcestring
Post by: winni on February 03, 2023, 10:54:00 pm
@KodeZwerg

I know how to solve the problem.

It was just a counterexample to your wrong statement:

"Unicode is Unicode, nothing more nothing less."

Winni


Title: Re: Unicode resourcestring
Post by: KodeZwerg on February 03, 2023, 11:23:10 pm
Quote from: winni
@KodeZwerg
I know how to solve the problem.
It was just a counterexample to your wrong statement:
"Unicode is Unicode, nothing more nothing less."
Winni

You are kidding me, right?
What else is Unicode if not Unicode?
Please share your wisdom with me.
Title: Re: Unicode resourcestring
Post by: winni on February 03, 2023, 11:40:17 pm
Quote from: KodeZwerg
Please share your wisdom with me.

UTF8 <> UTF16
Title: Re: Unicode resourcestring
Post by: KodeZwerg on February 04, 2023, 12:08:28 am
Quote from: winni
@KodeZwerg
It was just a counterexample to your wrong statement:
"Unicode is Unicode, nothing more nothing less."
Quote from: KodeZwerg
What else is Unicode if not Unicode?
Quote from: winni
UTF8 <> UTF16

Since you answer a normal question not with words but with the names of character encoding standards, I have absolutely no idea why you quoted me at all.
That answer makes no sense and my statement is still the same.
Unicode is Unicode, nothing more, nothing less. Period.
( but I am still open to being convinced otherwise  :-* )

Moderators, please excuse this somewhat off-topic argument.
Title: Re: Unicode resourcestring
Post by: PascalDragon on February 06, 2023, 11:01:55 pm
Quote from: jcmontherock
"Unicode string" generally means UTF-16. UTF-16 can be LE or BE.

And FPC always uses the endianness of the target, so it makes no sense to talk about UTF-16BE vs. UTF-16LE in general in the context of FPC.

Quote from: jcmontherock
Some people wrongly say "Unicode string" when they mean UTF-8. Please use the exact term...  ;D ;D

We are on an FPC/Lazarus forum here, so I can assume that users know that when I write “UnicodeString” I mean UnicodeString, which in turn means UTF-16 (no matter whether BE or LE), and not “Unicode string”, which might mean any string with a Unicode encoding.
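Both points (two-byte code units, stored in the target's byte order) can be checked with a short sketch like the following. It is only an illustration; the program and variable names are made up.

```pascal
program Utf16EndianSketch; // hypothetical name, not from the thread
{$mode objfpc}{$H+}

var
  s: UnicodeString;
  firstByte: Byte;

begin
  s := 'A'; // U+0041, a single UTF-16 code unit
  WriteLn('bytes per code unit: ', SizeOf(WideChar));

  // Look at the first byte of the first code unit as it sits in memory.
  firstByte := PByte(@s[1])^;
  if firstByte = $41 then
    WriteLn('low byte first: little-endian target')
  else
    WriteLn('high byte first: big-endian target');
end.
```

The byte order only becomes relevant when the raw bytes leave the program, for example when writing them to a file that another system will read.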

Quote from: Thaddy
It is the other way around. MS uses 16-bit Unicode (UTF-16, derived from UCS-2) and the rest of the world uses UTF-8.

Qt uses UTF-16, JavaScript uses UTF-16, Java uses UTF-16. And on non-Windows platforms wchar_t is even four bytes wide and thus uses UTF-32.