@lainz:
Your original version of function UTF8UpperFirst() is overly complex and slow. The whole string is converted twice between encodings for no reason!
wp's version works but it has a slow and useless call to UTF8Length(Value) which can be replaced with Length(Value) or just MaxInt.
[Edit:] Simple UpperCase can be used instead of UTF8UpperString.
Then it becomes:
Result := UpperCase(UTF8Copy(Value, 1, 1)) + UTF8Copy(Value, 2, Length(Value));
// or
Result := UpperCase(UTF8Copy(Value, 1, 1)) + UTF8Copy(Value, 2, MaxInt);
It can be further optimised by taking UTF8CharacterLength(Value) once and then using simple Copy() twice. Super-fast.
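A minimal sketch of that last idea, assuming the LazUTF8 unit from the LazUtils package is available. Recent Lazarus versions name the function UTF8CodepointSize, older ones call it UTF8CharacterLength. It takes a PChar, hence the cast. The first character is uppercased with UTF8UpperCase here; swap in UpperCase as in the one-liners above if that is sufficient for your data.

```pascal
program UpperFirstDemo;
{$mode objfpc}{$H+}

uses
  LazUTF8; // assumption: LazUtils is on the unit search path

function UTF8UpperFirst(const Value: string): string;
var
  FirstLen: Integer;
begin
  if Value = '' then
    Exit('');
  // Size in bytes (code units) of the first codepoint, computed once.
  FirstLen := UTF8CodepointSize(PChar(Value));
  // Plain Copy() works on bytes, so no scan through the whole string is needed.
  Result := UTF8UpperCase(Copy(Value, 1, FirstLen))
          + Copy(Value, FirstLen + 1, MaxInt);
end;

begin
  WriteLn(UTF8UpperFirst('hello world'));
end.
```

Only the first codepoint is inspected, so the cost does not grow with the number of codepoints the way UTF8Length() or UTF8Copy() would.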
It is amazing how often you can use CodeUnit resolution with a variable-length encoding. I remember I got a wow-effect when I realized it. See examples:
http://wiki.freepascal.org/UTF8_strings_and_characters
Please remember also my encoding-agnostic functions if the code must be maintained between Delphi <-> Lazarus.
How can I convert (if needed) each case to the newest Lazarus without using a codepage? Thanks.
I think you are confusing things now. This thread is about using {$codepage UTF8} but it makes absolutely no difference for your code because it has no constants. There are 2 separate things:
1. Changing the default encoding of AnsiString (and String) variable type to UTF-8. This is now the recommended way and happens automatically for LCL applications. It can be disabled by -dDisableUTF8RTL if needed. This is a rather big change but mostly for the good.
2. {$codepage UTF8} only tells the compiler to treat string literals as UTF-8. It is a rather small issue because constants are less common than variables in normal code. The associated problems have easy workarounds, thus I think the problems have been greatly exaggerated.