Yes. But that is all temporary till classes is unicodestring. And we won't adapt FPC code to that scheme.
Ok, this plan may be the real reason UTF-8 is opposed so much. I did not realize there is such a strong confrontation between the "camps", especially as FPC provided the functions that help with the default encoding.
I have a personal interest: software that will need UTF-8 everywhere, for various reasons. I have no interest in opposing other solutions. At first I thought UTF-8 must be done the old way, with AnsiString + UTF...() functions. Then I learned the RTL could be mapped to UTF-8, and it worked better than I had hoped. As I wrote here earlier, "It is almost too good to be true!" and yes, it was too good to be true...
I promise to work towards the UTF-16 solution later, but first I need UTF-8. In the worst case that means the old AnsiString + UTF...() functions, but then I will be a little disappointed. We were so close to getting this working:
http://wiki.freepascal.org/Better_LCL_Unicode_Support
I also try to keep this as pragmatic as possible; there has been enough "camp" fighting during the past 5 years.
So, I am here to find a working UTF-8 solution, not to fight against other solutions.
Still, the functions for changing encoding should be removed if their usage is forbidden.
Marco, you know Unicode better than I do, but I have also learned something. These things are valid in my use case; I don't know if they are valid for others. So please don't get angry.
1. The solution we made is amazingly Delphi compatible. The string Ansi...() functions work, and so do the ASCII functions. Even Pos() and Copy() are compatible in most cases. In Delphi they are used because people treat UTF-16 as a fixed-width encoding; with UTF-8 they work in most cases because of the special properties of the encoding: a search for an ASCII character or for a complete UTF-8 sequence can never match in the middle of another character.
When looking at some real Delphi code, there are very few things to change.
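To illustrate point 1, here is a minimal sketch, assuming the string simply holds UTF-8 bytes in a plain AnsiString. The program name and the sample text are made up for illustration, and the non-ASCII character is written with #$.. byte escapes so that no codepage conversion is involved. Pos() and Copy() count bytes here, yet the results stay valid:

```pascal
program Utf8PosCopyDemo;
{$mode objfpc}{$H+}

var
  S, Needle, Tail: string;  // AnsiString because of {$H+}
  P: SizeInt;
begin
  // "Koeln 2014" with a real o-umlaut, stored as UTF-8 bytes:
  // the o-umlaut (U+00F6) is the two-byte sequence C3 B6.
  S := 'K'#$C3#$B6'ln 2014';
  Needle := #$C3#$B6;

  // Length() counts bytes, not codepoints: 10 bytes for 9 codepoints.
  WriteLn('Byte length: ', Length(S));

  // Searching for a complete UTF-8 sequence gives its byte index (2).
  WriteLn('Pos of o-umlaut: ', Pos(Needle, S));

  // Splitting at an ASCII delimiter is safe: an ASCII byte never
  // appears inside a multi-byte UTF-8 sequence.
  P := Pos(' ', S);
  Tail := Copy(S, P + 1, Length(S) - P);
  WriteLn('Text after the space: ', Tail);  // 2014
end.
```

The indexes are byte offsets, so this works as long as they come from Pos() or a similar search and are not computed by counting "characters" by hand.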
2. 100% Delphi compatibility is not always a blessing; it can be a curse. Typical Delphi code still assumes a character has a fixed width of 16 bits. Tutorials and examples feed that same wrong idea. For example, an article by Nick Hodges:
http://edn.embarcadero.com/article/38693
says: "Copy will still work as before without change. So will Delete and all the SysUtils-based string manipulation routines."
I know codepoints that take 2 UnicodeChars (a surrogate pair) are rare in the West, but maybe the application will be marketed to China some day, and then the code breaks. Copy() can return half a codepoint.
With UTF-8, code that deals with individual codepoints must always be done right.
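A small sketch of the Copy() problem in point 2, assuming FPC's UnicodeString type. The program name and the test string are made up for illustration; U+20000 is a CJK ideograph outside the BMP, which UTF-16 encodes as the surrogate pair D840 DC00:

```pascal
program SurrogateSplitDemo;
{$mode objfpc}{$H+}

var
  S, Head: UnicodeString;
  LastUnit: Word;
begin
  // 'A', then U+20000 as a surrogate pair, then 'B'.
  S := 'A';
  S := S + WideChar($D840) + WideChar($DC00) + 'B';

  // Three codepoints, but four UTF-16 code units.
  WriteLn('UTF-16 code units: ', Length(S));

  // "The first two characters", written as if UTF-16 were fixed width.
  Head := Copy(S, 1, 2);
  LastUnit := Ord(Head[Length(Head)]);
  WriteLn('Last code unit of Copy(S, 1, 2): $', HexStr(LongInt(LastUnit), 4));

  // A high surrogate ($D800..$DBFF) without its low surrogate
  // is only half a codepoint.
  if (LastUnit >= $D800) and (LastUnit <= $DBFF) then
    WriteLn('Copy() ended on a high surrogate: half a codepoint');
end.
```

The same code with only BMP characters never shows the problem, which is why such bugs can stay hidden for a long time.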
3. I have written cross-platform code that reads an XML file, parses it and does something with the data. All of this uses the UTF-8 encoding of the LCL.
The file is already encoded as UTF-8, so not a single conversion is needed for it. I think the file-open WinAPI call is the only place where the filename encoding must be converted. The actual file-read block operation does not care about encodings (I think).
This code is not specific to Unix or any other operating system. So, I honestly don't understand your sentence:
"no utf8 usage in code on Windows except for ported Unix software".
Anyway, I will take what the FPC team gives. I understand there are camps inside the team, which complicates the issue.