I'm looking for some help in the area of Unicode and a TMemo. I just upgraded to Lazarus 1.6 to see if it would fix my problem and I still see it.
For all of my code that processes the TMemo's text, I am currently using the UTF8 functions (UTF8Copy, UTF8Pos, UTF8Length and UTF8RightStr).
Everything in my code seems to be working OK, except where the selected text in the TMemo is involved. After walking through the Lazarus code, I noticed that getting text from the TMemo is handled by converting UTF16 to UTF8. All of my variables are declared as String and I use the aforementioned UTF8 functions on them.
But I noticed that the selected text and the selection start values from the TMemo are not correct if certain Unicode characters are used (I am testing with a thumbs-up character I took from another forum message; it looks like this: 👍). When a character like that appears before the selection, the selection values no longer line up with the text.
In tracing through the code, it looks like selection start just gets the value from the underlying Windows memo control. Since that control apparently uses UTF16 (otherwise why would the Lazarus code convert from UTF16 to UTF8), the start position would be based on UTF16 not UTF8. The selected text on the other hand, uses that same selection start, but works on UTF8 text. This appears to be why the value is incorrect.
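To make the suspected mismatch concrete, here is a small console sketch (plain FPC, no LCL) that counts the same text three ways. The variable names are made up for illustration, and the thumbs-up is built from its two UTF-16 surrogate code units so the result does not depend on the source file encoding.

```pascal
program CodeUnitCounts;
{$mode objfpc}{$H+}

var
  Text16: UnicodeString;  // UTF-16, what the Windows control appears to use
  Text8: UTF8String;      // UTF-8, what my own code works with
  I, CodePoints: Integer;
begin
  // "ab", then U+1F44D (thumbs-up) as a surrogate pair, then "cd"
  Text16 := 'ab';
  Text16 := Text16 + WideChar($D83D) + WideChar($DC4D) + 'cd';
  Text8 := UTF8Encode(Text16);

  // Count code points: every code unit except the trailing
  // (low) half of a surrogate pair starts a new code point.
  CodePoints := 0;
  for I := 1 to Length(Text16) do
    if (Ord(Text16[I]) < $DC00) or (Ord(Text16[I]) > $DFFF) then
      Inc(CodePoints);

  WriteLn('UTF-16 code units: ', Length(Text16));
  WriteLn('Code points:       ', CodePoints);
  WriteLn('UTF-8 bytes:       ', Length(Text8));
end.
```

If my reading is right, the "c" sits at 0-based offset 4 when counted in UTF-16 code units but at offset 3 when counted in code points. That one-position difference would match the character I lose at the beginning of the selection.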
I do not have to use selected text, I could use selection start and selection length and copy the text, but I would have the same issue as the underlying LCL code.
One thought I had was to stop using the UTF8 functions, define my Strings as UTF16 instead (would that be Utf16String, WideString or UnicodeString?), and change my calls to the UTF8 functions back to the normal Pos, Length, Copy and RightStr (this assumes they work correctly on the specified type).
Would that be the way to go?
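For concreteness, this is roughly what I have in mind, as a console sketch rather than tested LCL code. CopySelection is a hypothetical helper of my own, and it assumes SelStart and SelLength are 0-based counts of UTF-16 code units, which is only my guess from tracing the code. I use UTF8String here instead of String so the console program does not depend on the system code page.

```pascal
program SelCopySketch;
{$mode objfpc}{$H+}

// Hypothetical helper: SelStart/SelLength are ASSUMED to be 0-based
// UTF-16 code unit counts, as the Windows control seems to report them.
function CopySelection(const MemoTextUtf8: UTF8String;
  SelStart, SelLength: Integer): UTF8String;
var
  Wide: UnicodeString;
begin
  // Work in UTF-16 so the offsets mean the same thing as in the control
  Wide := UTF8Decode(MemoTextUtf8);
  // Copy is 1-based, hence SelStart + 1
  Result := UTF8Encode(Copy(Wide, SelStart + 1, SelLength));
end;

var
  Text16: UnicodeString;
  Text8, Sel: UTF8String;
begin
  // "ab", thumbs-up (U+1F44D as a surrogate pair), "cd"
  Text16 := 'ab';
  Text16 := Text16 + WideChar($D83D) + WideChar($DC4D) + 'cd';
  Text8 := UTF8Encode(Text16);

  // Selecting "cd": the thumbs-up takes two UTF-16 code units,
  // so the selection would start at offset 4 with length 2.
  Sel := CopySelection(Text8, 4, 2);
  if Sel = 'cd' then
    WriteLn('selection recovered')
  else
    WriteLn('selection mismatch');
end.
```

In the real application the call would presumably be something like CopySelection(Memo1.Text, Memo1.SelStart, Memo1.SelLength), but I have not verified that this behaves the same on every widgetset.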
I'm also concerned about other OSes. If Linux is UTF8 (I thought I read it was), then this would break on Linux. So should I be using compiler directives and define my Strings (just those dealing with the TMemo) as UTF16 for Windows and UTF8 for non-Windows?
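The compiler-directive idea would look something like the sketch below. TMemoString is a name I made up, and whether non-Windows widgetsets really report selection offsets in a way that suits a UTF8 string is exactly the part I am unsure about.

```pascal
program MemoStringAlias;
{$mode objfpc}{$H+}

type
{$IFDEF WINDOWS}
  // Hypothetical alias: UTF-16 on Windows to match the native control
  TMemoString = UnicodeString;
{$ELSE}
  // Assumption: UTF-8 held in an ordinary AnsiString elsewhere
  TMemoString = AnsiString;
{$ENDIF}

var
  S: TMemoString;
begin
  S := 'plain ASCII text';
  WriteLn('Length in code units: ', Length(S));
  // Size of one code unit: 2 for UnicodeString, 1 for AnsiString
  WriteLn('Bytes per code unit:  ', SizeOf(S[1]));
end.
```

The drawback I can see is that every Pos, Copy and Length call on a TMemoString would then count different units depending on the platform, so any offsets I store would not be portable between builds.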
Am I way off-base here?
BTW, to test, just create a form, put a TMemo on it and a TButton. For the button click use the following code:
ShowMessage(Memo1.SelText);
Enter some text and select something.
With regular English text, the dialog displays the selected text. If you then use the thumbs-up character in the middle and select something at the end (or anywhere after it) and click the button, you will most likely get text short a character at the beginning.
Thanks