[solved] Case with unicode switch and -FcUTF8
Ocye:
--- Quote from: JuhaManninen on October 14, 2015, 02:28:55 pm ---It apparently causes a swamp of nasty issues...
--- End quote ---
And I'm completely confused now ;-)
Is there a 'small' compiler switch like {$UTF8+/-}?
--- Quote from: GetMem on October 14, 2015, 02:43:02 pm ---That case hurts my eyes...
--- End quote ---
True, but readability is better. And at some point I have to convert the drop down selection (which is dynamically filled).
Anyway, the question is rather about the restrictions of UTF8 in RTL.
--- Quote from: Roland Chastain on October 14, 2015, 02:46:30 pm ---
--- Code: Pascal ---
'Fran'#$C3#$A7'ais': writeln('Salute Monde');
--- End quote ---
And now for Greek (Ελληνικά) ;D
--- Quote from: Bart on October 14, 2015, 03:11:27 pm ---What is the encoding of the source-file in question?
--- End quote ---
How do I check that? Notepad++ tells me the file is UTF8 without BOM (saving explicitly as UTF8 makes no difference). I usually copy files from Linux to Windows.
JuhaManninen:
--- Quote from: Ocye on October 14, 2015, 03:17:00 pm ---
--- Quote from: JuhaManninen on October 14, 2015, 02:28:55 pm ---It apparently causes a swamp of nasty issues...
--- End quote ---
And I'm completely confused now ;-)
Is there a 'small' compiler switch like {$UTF8+/-}?
--- End quote ---
There is a small define:
-dEnableUTF8RTL
My plan is to replace it with -dDisableUTF8RTL for people who want to use the FPC system codepage string as default.
Without any defines a Lazarus project compiled with FPC 3.x will then use the new UTF-8 system.
See issue:
http://bugs.freepascal.org/view.php?id=26453
and its related issues. This is really nasty; it may be a compiler bug, as Michl figured out.
Our new UTF-8 system solves those problems. It is still a hack but less of a hack than the currently used UTF-8 hack is.
Later when FPC, RTL and other libs are ready, we will implement a Delphi compatible UTF-16 support, too.
Before that, let's try to make things work without UTF-16.
Roland57:
--- Quote from: Ocye on October 14, 2015, 03:17:00 pm ---And now for Greek (Ελληνικά) ;D
--- End quote ---
Here you are.
--- Code: Pascal ---
{$codepage utf8}

procedure Greetings(aLanguage: string);
begin
  case aLanguage of
    'English': WriteLn('Hello World');
    'Deutsch': WriteLn('Hallo Welt');
    {'Français'}'Fran'#$C3#$A7'ais': WriteLn('Salute Monde');
    {'Русский'}#$D0#$A0#$D1#$83#$D1#$81#$D1#$81#$D0#$BA#$D0#$B8#$D0#$B9: WriteLn('приве́т мир');
    {'Ελληνικά'}#$CE#$95#$CE#$BB#$CE#$BB#$CE#$B7#$CE#$BD#$CE#$B9#$CE#$BA#$CE#$AC: WriteLn('Ελληνικά');
  end;
end;

begin
  Greetings('English');
  Greetings('Français');
  Greetings('Русский');
  Greetings('Ελληνικά');
  ReadLn;
end.
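For anyone who wants to produce such byte sequences for other languages, here is a minimal sketch. It assumes the source file is saved as UTF-8 and that the compiler converts the literal to UTF8String as expected. ToCharConstants is a hypothetical helper name, not something from the RTL or from this thread.

```pascal
program Utf8Escapes;
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: this file is saved as UTF-8

// Hypothetical helper (not part of the RTL): renders every byte of a
// UTF-8 encoded string as a #$XX character constant.
function ToCharConstants(const S: UTF8String): string;
var
  I: Integer;
begin
  Result := '';
  // Length() of a UTF8String counts bytes, not characters.
  for I := 1 to Length(S) do
    Result := Result + '#$' + HexStr(Ord(S[I]), 2);
end;

begin
  WriteLn(ToCharConstants('Ελληνικά'));
  WriteLn(ToCharConstants('Русский'));
end.
```

Unlike the hand-written 'Fran'#$C3#$A7'ais' label, this sketch escapes the ASCII bytes as well. That is more verbose, but it keeps the helper trivial.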
Ocye:
--- Quote from: JuhaManninen on October 14, 2015, 04:44:13 pm ---See issue :
http://bugs.freepascal.org/view.php?id=26453...
--- End quote ---
I always struggle with those issues. As a non-professional who never really understood the codepage stuff and who usually codes on Linux, I find it one of the major obstacles to getting code working cross-platform for multiple languages.
--- Quote from: Roland Chastain on October 14, 2015, 08:55:21 pm ---
--- Code: Pascal ---
{$codepage utf8}
    {'Русский'}#$D0#$A0#$D1#$83#$D1#$81#$D1#$81#$D0#$BA#$D0#$B8#$D0#$B9: WriteLn('приве́т мир');
    {'Ελληνικά'}#$CE#$95#$CE#$BB#$CE#$BB#$CE#$B7#$CE#$BD#$CE#$B9#$CE#$BA#$CE#$AC: WriteLn('Ελληνικά');
--- End quote ---
Hm... could indeed be a solution, at least temporarily. Thanks for the suggestion.
JuhaManninen:
--- Quote from: Ocye on October 16, 2015, 10:57:13 am ---
--- Quote from: JuhaManninen on October 14, 2015, 04:44:13 pm ---See issue :
http://bugs.freepascal.org/view.php?id=26453...
--- End quote ---
I always struggle with those issues. As a non-professional who never really understood the codepage stuff and who usually codes on Linux, I find it one of the major obstacles to getting code working cross-platform for multiple languages.
--- End quote ---
Same here. That's why we have created the new "UTF-8 hack":
http://wiki.freepascal.org/Better_LCL_Unicode_Support
It works amazingly well. Yes it has issues, too, but they are predictable and understandable.
The bug report I mentioned happens only when the new UTF-8 system is NOT used.
--- Quote from: Ocye on October 16, 2015, 10:57:13 am ---Hm... could indeed be a solution, at least temporarily. Thanks for the suggestion.
--- End quote ---
No, a proper solution is the new UTF-8 system. Why don't you want to use it?
From your first post:
--- Quote ---Having the option -FcUTF8 set (RTL with UTF8 support), the compiler complains ...
--- End quote ---
You have misunderstood the meaning of -FcUTF8. It does not make the RTL support UTF-8. It only makes the compiler assume that source files are UTF-8 encoded.
-dEnableUTF8RTL changes the default encoding of String type.
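To make the difference between the two switches visible, a minimal console sketch could look like the following. It assumes FPC 3.x, where DefaultSystemCodePage and StringCodePage are available in the System unit. The program name is made up for illustration.

```pascal
program CodePageCheck;
{$mode objfpc}{$H+}
{$codepage utf8}  // per-file counterpart of -FcUTF8: only describes the source encoding

var
  S: string;
begin
  // -dEnableUTF8RTL merely defines a symbol; code has to react to it.
  {$IFDEF EnableUTF8RTL}
  WriteLn('EnableUTF8RTL is defined');
  {$ELSE}
  WriteLn('EnableUTF8RTL is not defined');
  {$ENDIF}

  S := 'Français';
  // Code page the RTL uses by default for String at run time.
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  // Code page actually carried by this particular string value.
  WriteLn('StringCodePage(S):     ', StringCodePage(S));
end.
```

In a plain console program like this one, the define by itself should not change the reported code pages. As far as I understand, it is the Lazarus units that check the symbol and switch the default code page to UTF-8, so the numbers only differ inside a Lazarus project that uses them.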
My plan is to remove -dEnableUTF8RTL and make the new UTF-8 system the default behavior for all Lazarus projects. As you have seen, String with system codepage + FPC 3.x + LCL with UTF-8 is a SWAMP.
Maybe I should do this change ASAP to avoid more confusion. There will be -dDisableUTF8RTL for people who must use system code page strings.
And, before anybody asks:
Delphi compatible UTF-16 support will be made later when FPC and its libs are ready.
This new UTF-8 system is much better than the old UTF-8 hack with all those UTF8... functions. It is less of a hack.
In fact it is almost Delphi compatible at source level for lots of code. Reading/writing non-UTF-8 streams, files or DBs needs changes, which must be documented somehow.