Glad to be of help.
As I showed, using the correct UTF-16 string and char types (UnicodeString/UnicodeChar) makes everything much simpler.
It is just typecasts.
The UTF-16 --> UTF-8 conversion (for the Lazarus controls) is handled transparently in that case. This is likely to improve even more in the future.
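To make the "just typecasts" point concrete, here is a minimal sketch, not the code from earlier in this thread. It assumes FPC 3.x; the program and variable names are made up for illustration. It builds a UnicodeString from the $0643 code point with a typecast and converts it explicitly with UTF8Encode/UTF8Decode from the System unit. In recent FPC versions a plain assignment to a UTF8String should perform the same conversion, but the explicit call makes the step visible.

```pascal
program Utf16ToUtf8Sketch;  // hypothetical name, illustration only
{$mode objfpc}{$H+}
var
  U, Back: UnicodeString;
  S: UTF8String;
begin
  // Typecast a code point to a UTF-16 char; $0643 is ARABIC LETTER KAF.
  U := UnicodeChar($0643);
  // Explicit UTF-16 -> UTF-8 conversion.
  S := UTF8Encode(U);
  // And back again: the round trip should give the original string.
  Back := UTF8Decode(S);
  WriteLn(Back = U);
end.
```

In a Lazarus form you would normally not need the explicit call at all; this only shows what happens under the hood.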
And UTF-8 on its own is simply a dog to handle, as the eloquent - and working - code from Handoko demonstrates.
UTF8 is like shooting yourself in the foot on purpose.
I hope you now understand what my first question to you meant. There is a huge difference between the Unicode string types, and that is often very confusing.
Rule of thumb: in case of doubt, start with UTF-16 (UnicodeString), even in Lazarus (where the default string type is a UTF-8/Ansi hybrid), because converting from UnicodeString to UTF-8 is much simpler than calling all kinds of utility functions and mappings. There are rare cases where those are still necessary, though.
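A small sketch of why the rule of thumb helps, again assuming FPC 3.x and with made-up names: indexing a UnicodeString gives you UTF-16 code units, which for characters in the Basic Multilingual Plane (such as $0643) are whole characters, while indexing the UTF-8 version of the same text gives you single bytes. Characters outside the BMP still need two code units (a surrogate pair) in UTF-16, so this is a simplification, not a guarantee.

```pascal
program IndexSketch;  // hypothetical name, illustration only
{$mode objfpc}{$H+}
var
  U: UnicodeString;
  S: UTF8String;
begin
  U := 'k';
  U := U + UnicodeChar($0643);   // 'k' followed by ARABIC LETTER KAF
  S := UTF8Encode(U);

  // Two UTF-16 code units versus three UTF-8 bytes for the same text.
  WriteLn(Length(U), ' ', Length(S));

  // UTF-16: the second element is the whole character.
  WriteLn(Ord(U[2]) = $0643);

  // UTF-8: the same character is spread over two bytes, $D9 and $83.
  WriteLn(Ord(S[2]) = $D9);
  WriteLn(Ord(S[3]) = $83);
end.
```

With the UTF-8 string you need helper routines to find where each character starts; with the UnicodeString, plain indexing is enough for the common case.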
Note for Lazarus developers: I was really impressed by how well right-to-left languages (as per my $0643) are handled! Compliments!
Note for FPC developers: thanks for such a great typecasting system!