Lazarus

Topic started by: AlexTP on January 28, 2022, 09:23:03 pm

Title: TEdit.PasswordChar should be of type WideChar
Post by: AlexTP on January 28, 2022, 09:23:03 pm
I'd suggest changing the Char type to WideChar. This would let people use 'nice Unicode thick dot' characters or even 'square' characters.
The LCL change will be easy.
Widgetsets will need only small changes.
Can I prepare the patches?
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: 440bx on January 28, 2022, 11:15:16 pm
I'd suggest changing the Char type to WideChar.
Please no.  It's really nice that char is char, not something else.  This may not be important when writing normal applications but, when inspecting strings that are outside one's program, it is crucial that a char be a char and not anything else.

ETA:

I just realized that you are suggesting the change to be local to TEdit.PasswordChar, I don't have anything against that. :)
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: AlexTP on January 28, 2022, 11:18:38 pm
Why is it critical that this is Char? What will your application lose?
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: 440bx on January 29, 2022, 12:06:35 am
Why is it critical that this is Char? What will your application lose?
For instance, when trapping one or more APIs in some DLL (a system DLL or some application's DLL), all the exported (and imported) names are in char, not widechar. That's the most common instance but, in spite of what MS says, there are a fair number of strings/identifiers in the system that are char, not widechar.

Unicode is fine and useful for user interfaces and other things that may need to be translated to other languages, but at the system level it's crucial to be able to use char, most likely because the PE format is char based, not Unicode based.
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Zoran on January 29, 2022, 01:46:32 am
Any Unicode character should be allowed, so WideChar is not the solution.
The LCL uses UTF-8, so the value should be UTF-8 encoded (so, a string). The property setter should contain a check that only strings representing a single UTF-8 encoded character are accepted: Utf8Length (https://lazarus-ccr.sourceforge.io/docs/lazutils/lazutf8/utf8length.html) must return 1.
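
A minimal sketch of such a setter check, written in Python rather than Pascal so that it is self-contained. It is not LCL code: `is_single_utf8_char` and `PasswordField` are hypothetical names, and the strict decode additionally rejects malformed byte sequences, which goes beyond the "Utf8Length must return 1" condition proposed above.

```python
def is_single_utf8_char(raw: bytes) -> bool:
    """Return True when raw is valid UTF-8 that encodes exactly one code point."""
    try:
        text = raw.decode("utf-8")  # strict decoding: raises on malformed input
    except UnicodeDecodeError:
        return False
    return len(text) == 1  # len() of a Python str counts code points


class PasswordField:
    """Hypothetical stand-in for a control with a PasswordChar property."""

    def __init__(self) -> None:
        self._password_char = b"*"

    @property
    def password_char(self) -> bytes:
        return self._password_char

    @password_char.setter
    def password_char(self, value: bytes) -> None:
        # Assumption: invalid values are rejected; a real setter might
        # instead ignore them or fall back to a default.
        if not is_single_utf8_char(value):
            raise ValueError("PasswordChar must encode exactly one code point")
        self._password_char = value
```

With this check, the three-byte sequence for U+25CF (black circle) is accepted, while an empty string, two characters, or a truncated sequence is refused.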
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Thaddy on January 29, 2022, 08:22:27 am
Why is it critical that this is Char? What will your application lose?
For instance, when trapping one or more APIs in some DLL (a system DLL or some application's DLL), all the exported (and imported) names are in char, not widechar. That's the most common instance but, in spite of what MS says, there are a fair number of strings/identifiers in the system that are char, not widechar.

Unicode is fine and useful for user interfaces and other things that may need to be translated to other languages, but at the system level it's crucial to be able to use char, most likely because the PE format is char based, not Unicode based.

Misuse of char detected. What you mean is the byte-sized AnsiChar? Depending on the mode, you confuse people by insisting on "char". That is actually a C-ism.
(I know it is not ideal, but Free Pascal allows for decontamination of such folly)
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: engkin on January 29, 2022, 09:15:29 am
Any Unicode character should be allowed, so WideChar is not the solution.
The LCL uses UTF-8, so the value should be UTF-8 encoded (so, a string). The property setter should contain a check that only strings representing a single UTF-8 encoded character are accepted: Utf8Length (https://lazarus-ccr.sourceforge.io/docs/lazutils/lazutf8/utf8length.html) must return 1.

Why limit UTF8Length to 1?
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Thaddy on January 29, 2022, 09:57:53 am
Indeed, can be 4.
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Bart on January 29, 2022, 10:23:33 am
Does the API actually accept a PasswordChar that is (when translated to UTF-8) 4 code points wide?
(Or do the 4 code points then make up one grapheme?)

Bart
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: 440bx on January 29, 2022, 10:29:39 am
Misuse of char detected. What you mean is the byte-sized AnsiChar? Depending on the mode, you confuse people by insisting on "char". That is actually a C-ism.
(I know it is not ideal, but Free Pascal allows for decontamination of such folly)
No Thaddy, it's a history-ism.  char has been the single byte character type for I'm not sure how many decades but definitely a good bit more than the yuppie ansichar and widechar.

One of the things I like best about Freepascal is that when I declare a variable to be of type "char" it's actually of type "char" ...Wow!!  Amazing, they still make compilers that parse by the rules.

Thank God for breathable air, underwear, women and the single byte character type.



Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: ASerge on January 29, 2022, 04:31:27 pm
Does the API actually accept a PasswordChar that is (when translated to UTF-8) 4 code points wide?
4 code points is a lot. Maybe code units? For reference, in UTF-8 encoding the code unit is a byte, and a code point takes up to 4 bytes.

But WinAPI, as far as I know, accepts only one code unit.
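
To make the code unit versus code point distinction concrete, here is a small Python sketch (the helper names are made up for illustration). It counts both for four sample characters; each is one code point, but the UTF-8 encoding takes between one and four code units (bytes).

```python
def utf8_code_units(text: str) -> int:
    """Number of UTF-8 code units (bytes) needed to encode text."""
    return len(text.encode("utf-8"))


def code_points(text: str) -> int:
    """Number of Unicode code points in text (Python str is code-point based)."""
    return len(text)


# Sample characters chosen for illustration: asterisk, e-acute,
# black circle, and a character outside the BMP (U+1F512).
SAMPLES = ("*", "\u00e9", "\u25cf", "\U0001F512")

for ch in SAMPLES:
    print(f"U+{ord(ch):04X}: {code_points(ch)} code point, "
          f"{utf8_code_units(ch)} UTF-8 code unit(s)")
```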
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: AlexTP on January 29, 2022, 04:32:27 pm
In EM_SETPASSWORDCHAR, WinAPI accepts a parameter of type WPARAM, which is a LongInt.
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Zoran on January 29, 2022, 05:23:30 pm
Why limit UTF8Length to 1?

Indeed, can be 4.

I believe that you misunderstood. I said UTF8Length should be 1; the standard Length function can return up to 4 while UTF8Length still returns 1.

And the LCL normally translates UTF-8 to whatever the underlying widget's API expects, which on Windows would be a UTF-16 encoded character.
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: engkin on January 29, 2022, 08:19:10 pm
Why limit UTF8Length to 1?

Indeed, can be 4.

I believe that you misunderstood. I said UTF8Length should be 1; the standard Length function can return up to 4 while UTF8Length still returns 1.

And the LCL normally translates UTF-8 to whatever the underlying widget's API expects, which on Windows would be a UTF-16 encoded character.

Yes!! You are right. In this case the correct condition is:
UTF16Length is 1 to fit in WPARAM.

Because UTF8Length = 1 does not guarantee that UTF16Length is not 2. Put another way, it is limited to UCS-2.
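
The encoding fact behind this point can be checked with a short Python sketch (helper names are hypothetical). A single code point inside the BMP takes one UTF-16 code unit, while one outside the BMP takes two (a surrogate pair). The sketch says nothing about what WPARAM can hold; it only shows that one code point does not imply one UTF-16 code unit.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding of text."""
    # "utf-16-le" emits no byte order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2


def is_single_utf16_unit(ch: str) -> bool:
    """True when ch is one code point that also fits in one UTF-16 code unit."""
    return len(ch) == 1 and utf16_code_units(ch) == 1


print(utf16_code_units("\u25cf"))      # black circle, inside the BMP
print(utf16_code_units("\U0001F512"))  # U+1F512, outside the BMP
```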
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Martin_fr on January 29, 2022, 08:42:31 pm
The API (i.e. Windows) does not matter... It's for the LCL/WS to adapt the value so it can go to the API. Besides, the API is also Qt, GTK, Cocoa ....

The property is expected to hold a single "character". The LCL uses UTF-8.

So, I say, the property should be TUtf8Char.

As for any limitation on what can be set to the property:
- Some checks (such as the byte sequence being valid UTF-8) can be done in the property setter.
- Subset checks may depend... If some OS accepts a bigger set of chars.... Then that may be a runtime check, with some fallback, or some other solution.

Of course, the smallest common subset can be chosen, but what if the next WS to be added has a smaller subset?
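
The split between a setter-side validity check and a widgetset-side runtime check with fallback could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the function names, the `max_code_point` capability, and the choice of `*` as the fallback. The thread does not say which widgetsets accept which characters.

```python
FALLBACK = "*"  # assumed fallback character, not taken from the LCL


def validate_password_char(raw: bytes) -> str:
    """Setter-side check: raw must be valid UTF-8 holding one code point."""
    # Strict decoding raises UnicodeDecodeError (a ValueError subclass)
    # when the byte sequence is not valid UTF-8.
    text = raw.decode("utf-8")
    if len(text) != 1:
        raise ValueError("expected exactly one code point")
    return text


def adapt_for_backend(ch: str, max_code_point: int) -> str:
    """Backend-side runtime check: substitute FALLBACK for unsupported chars.

    max_code_point is a hypothetical capability reported by a widgetset.
    """
    return ch if ord(ch) <= max_code_point else FALLBACK


lock = validate_password_char("\U0001F512".encode("utf-8"))
print(adapt_for_backend(lock, 0xFFFF))    # hypothetical BMP-only backend
print(adapt_for_backend(lock, 0x10FFFF))  # hypothetical full-range backend
```

Keeping the subset check out of the setter means a widgetset added later with a smaller supported set only needs its own capability value, not a change to the property.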


Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Zoran on January 29, 2022, 11:12:23 pm
The API (i.e. Windows) does not matter... It's for the LCL/WS to adapt the value so it can go to the API. Besides, the API is also Qt, GTK, Cocoa ....

The property is expected to hold a single "character". The LCL uses UTF-8.

So, I say, the property should be TUtf8Char.

As for any limitation on what can be set to the property:
- Some checks (such as the byte sequence being valid UTF-8) can be done in the property setter.
- Subset checks may depend... If some OS accepts a bigger set of chars.... Then that may be a runtime check, with some fallback, or some other solution.

Of course, the smallest common subset can be chosen, but what if the next WS to be added has a smaller subset?

Yes. And all this is actually what I said in the first place, isn't it?

Yes!! You are right. In this case the correct condition is:
UTF16Length is 1 to fit in WPARAM.

No, the correct condition is UTF8Length = 1.

Because UTF8Length = 1 does not guarantee that UTF16Length is not 2. Put another way, it is limited to UCS-2.

You are wrong. UTF-8 covers the whole of Unicode, all code points. It is not limited to the UCS-2 subset.

The LCL works with UTF-8 and, when needed, translates to what the underlying widget API expects (in the win widgetset it is UTF-16, in Qt and GTK it is UTF-8; in any case we are by no means limited to UCS-2).
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: engkin on January 30, 2022, 12:06:40 am
UTF-8 covers the whole of Unicode, all code points. It is not limited to the UCS-2 subset.

Yes, I know, and here is why this can cause a problem on Windows. If you choose a code point outside the BMP you end up with UTF8Length = 1, but UTF16Length, in this case, is 2 and it will not fit in WPARAM.
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: Zoran on January 30, 2022, 01:22:55 am
UTF-8 covers the whole of Unicode, all code points. It is not limited to the UCS-2 subset.

Yes, I know, and here is why this can cause a problem on Windows. If you choose a code point outside the BMP you end up with UTF8Length = 1, but UTF16Length, in this case, is 2 and it will not fit in WPARAM.

Of course it will fit in WPARAM.
WPARAM is defined as
Code: C
typedef UINT_PTR WPARAM;
So it is an unsigned, pointer-sized integer, that is, at least a 32-bit unsigned int.
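
The size argument can be checked with a little arithmetic, sketched here in Python with made-up helper names. The largest Unicode code point, U+10FFFF, needs 21 bits, and a UTF-16 surrogate pair is two 16-bit units, so either representation fits in 32 bits. This only covers the integer width; whether an edit control then treats such a value as one password character is a separate question that the typedef does not answer.

```python
import struct

MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode
WPARAM_MIN_BITS = 32       # per the post above: pointer-sized, at least 32 bits


def fits_in_wparam(ch: str) -> bool:
    """True when the code point value of ch fits in WPARAM_MIN_BITS bits."""
    return ord(ch).bit_length() <= WPARAM_MIN_BITS


def utf16_units(ch: str) -> tuple:
    """Return the UTF-16 code units of ch as a tuple of 16-bit integers."""
    data = ch.encode("utf-16-le")  # little-endian, no byte order mark
    return struct.unpack(f"<{len(data) // 2}H", data)


print(MAX_CODE_POINT.bit_length())
print([hex(u) for u in utf16_units("\U0001F512")])
```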
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: engkin on January 30, 2022, 01:49:53 am
 :o I could not believe my eyes.
Thank you for correcting me.
Title: Re: TEdit.PasswordChar should be of type WideChar
Post by: PascalDragon on January 30, 2022, 11:38:24 am
One of the things I like best about Freepascal is that when I declare a variable to be of type "char" it's actually of type "char" ...Wow!!  Amazing, they still make compilers that parse by the rules.

The type of Char depends on the mode. In mode DelphiUnicode, or if the modeswitch UnicodeStrings is active, the type Char is an alias for WideChar; otherwise it's an alias for AnsiChar (just as String is an alias for UnicodeString in the former case and AnsiString in the latter case, assuming that $H+ is set in both cases). This is for compatibility with Delphi 2009 and newer.