Questions of new Strings in FPC 3.0

Michl:
Hi,

I have read a lot about the new strings and understand most of it and the ideas behind it. I've built a lot of test cases for myself and tested a lot of string conversions, but I still have some questions.

ShortString: The code page of a ShortString is implicitly CP_ACP and hence will always be equal to the current value of DefaultSystemCodePage. So I made a test:
--- Code: Pascal ---
program project1;

//{$codepage cp1252}
//{$mode ObjFPC}{$H+}
//{$modeswitch systemcodepage}

const
  StrCP1252 = #$80#$C4#$D6#$8C#$A5;
// CP1252     €   Ä   Ö   Œ   ¥
// CP437      Ç   ─   ╓   î   Ñ
// 1. output  ?   Ä   Ö   O   ¥
// 2. output  Ç   ─   Í   î   Ñ

var
  s: String;

begin
  writeln(DefaultSystemCodePage);
  s := StrCP1252;
  writeln(s);               // expected: €ÄÖŒ¥   get: ?ÄÖO¥
  writeln(StrCP1252);       // expected: €ÄÖŒ¥   get: Ç─ÍîÑ
  writeln(Char(#$80), Char(#$C4), Char(#$D6), Char(#$8C), Char(#$A5));
end.
--- End code ---

The output is:
--- Quote ---
1252
?ÄÖO¥
Ç─ÍîÑ
Ç─ÍîÑ
--- End quote ---

My code page is 1252. The ShortString output looks more like code page 437. After assigning the ShortString to a String, I get something close to a CP1252 string.
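
A small diagnostic may help narrow this down. The sketch below prints the code pages the RTL and Windows report, so they can be compared with the 1252 shown above. It assumes a Windows target and the Windows unit shipped with FPC; the program name is only illustrative. One detail worth checking against its output: byte $D6 displays as Í in CP850 but as ╓ in CP437, so the second output line looks closer to CP850 than to CP437.

```pascal
program cpinfo;

{$mode ObjFPC}{$H+}

uses
  Windows; // Windows-only: GetACP, GetOEMCP, GetConsoleOutputCP

const
  StrCP1252 = #$80#$C4#$D6#$8C#$A5;

var
  s: String;

begin
  s := StrCP1252;
  // Code pages as seen by the RTL
  writeln('DefaultSystemCodePage: ', DefaultSystemCodePage);
  writeln('StringCodePage(s):     ', StringCodePage(s));
  // Code pages as reported by Windows
  writeln('GetACP:                ', GetACP);
  writeln('GetOEMCP:              ', GetOEMCP);
  writeln('GetConsoleOutputCP:    ', GetConsoleOutputCP);
end.
```

If GetConsoleOutputCP differs from GetACP, the console interprets the written bytes with a different code page than the one the strings are stored in, which could explain the mismatch.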

It doesn't matter whether or not I define:
--- Code: Pascal ---
//{$codepage cp1252}
//{$mode ObjFPC}{$H+}
//{$modeswitch systemcodepage}
--- End code ---

I expected that if I define code points (bytes > 127) of a CP1252 string, I would get the characters shown here: https://en.wikipedia.org/wiki/Windows-1252

Why does the output contain characters other than those listed in the wiki (I get ? instead of € and O instead of Œ)?
Why is WriteLn(SomeShortString) not the same as WriteLn(SomeString)?

My system: Windows 7, 64bit, FPC 3.1.1 32bit r32092
Compileroptions: -MObjFPC -Scghi -O1 -g -gl -l -vewnhibq -Filib\i386-win32 -Fu. -FUlib\i386-win32

Cyrax:

--- Quote from: Michl on October 29, 2015, 10:21:52 am ---...
Compileroptions: -MObjFPC -Scghi -O1 -g -gl -l -vewnhibq -Filib\i386-win32 -Fu. -FUlib\i386-win32

--- End quote ---

You are compiling your test project via Lazarus. That is why there is no difference if you undefine those compiler conditional settings. You should try compiling your test program via command line.

Michl:
You are right, I can't deactivate these FPC settings that way. If I compile from the command line, the result differs depending on whether {$mode ObjFPC}{$H+} is enabled, because in one case "s" is a ShortString and in the other "s" is a String. Thank you for the hint.

But my questions remain. Can anyone give me a hint:
Why does the output contain characters other than those listed in the wiki (I get ? instead of € and O instead of Œ)?
Why is WriteLn(SomeShortString) not the same as WriteLn(SomeString) with {$mode ObjFPC}{$H+}?

Cyrax:
WriteLn is a compiler intrinsic, which is replaced at compile time by calls to the real subroutines. In the case of WriteLn(SomeShortString) there will be a type-specific call, followed by a call to a subroutine that prints the EOL to standard output. You can see this yourself by examining the disassembler output via View->Debug Windows->Assembler in Lazarus.

Michl:
OK, I understand.

I made a new test:
--- Code: Pascal ---
program project1;

//{$codepage cp1252}
{$mode ObjFPC}{$H+}

var
  s: String;
  s2: ShortString;
  f: file of byte;
  ftext: TextFile;

begin
  AssignFile(f, 'test.txt');
  Rewrite(f);
  write(f, $80);
  write(f, $C4);
  write(f, $D6);
  write(f, $8C);
  write(f, $A5);
  CloseFile(f);

  AssignFile(ftext, 'test.txt');
  Reset(ftext);
  Read(ftext, s);
  Reset(ftext);
  Read(ftext, s2);
  CloseFile(ftext);
  writeln(s);   // expected: €ÄÖŒ¥   Console Output: ?ÄÖO¥
  writeln(s2);  // expected: €ÄÖŒ¥   Console Output: Ç─ÍîÑ

  AssignFile(ftext, 'teststring.txt');
  Rewrite(ftext);
  write(ftext, s);
  CloseFile(ftext);

  AssignFile(ftext, 'testshortstring.txt');
  Rewrite(ftext);
  write(ftext, s2);
  CloseFile(ftext);
end.
--- End code ---

If I now inspect the three files with Windows Notepad, they all have the same, correct content (€ÄÖŒ¥). So it seems that there is no difference between Write(SomeShortString) and Write(SomeString), and likewise for WriteLn.
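
The same comparison can be made without relying on Notepad by dumping the raw bytes of both variables as hex. This is only a sketch: HexBytes is a helper made up for this example, and the program name is illustrative. Because the dump is plain ASCII, the console code page should not affect how it is displayed.

```pascal
program hexdump;

{$mode ObjFPC}{$H+}

// Hypothetical helper: returns the bytes of s as space-separated
// two-digit hex values. RawByteString accepts any code page as-is.
function HexBytes(const s: RawByteString): String;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(s) do
  begin
    if i > 1 then
      Result := Result + ' ';
    Result := Result + HexStr(Ord(s[i]), 2);
  end;
end;

const
  StrCP1252 = #$80#$C4#$D6#$8C#$A5;

var
  s: String;
  s2: ShortString;

begin
  s := StrCP1252;
  s2 := StrCP1252;
  // If both lines show the same bytes, the stored data is identical
  // and only the way it is written to the console differs.
  writeln('String:      ', HexBytes(s));
  writeln('ShortString: ', HexBytes(s2));
end.
```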

If I run the project as "project1.exe > testconsole.txt", that file contains the wrong characters.

So only the console output is wrong (for the ShortString and/or the String). Should I report this to the bug tracker, or is it a known problem?
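
One experiment that might separate the two effects is to make the console and the RTL agree on CP1252 before writing. The sketch below assumes a Windows target; SetConsoleOutputCP comes from the Windows unit and SetTextCodePage from the System unit of FPC 3.0. Whether this gives the expected €ÄÖŒ¥ is an assumption to be tested, not a confirmed fix, and the console may also need a TrueType font to show these characters.

```pascal
program consolecp;

{$mode ObjFPC}{$H+}

uses
  Windows; // Windows-only: SetConsoleOutputCP

const
  StrCP1252 = #$80#$C4#$D6#$8C#$A5;

var
  s: String;
  s2: ShortString;

begin
  // Ask the console to interpret output bytes as CP1252.
  SetConsoleOutputCP(1252);
  // Tell the RTL that Output expects CP1252, so strings are not
  // converted to the code page detected at program start.
  SetTextCodePage(Output, 1252);

  s := StrCP1252;
  s2 := StrCP1252;
  writeln(s);
  writeln(s2);
end.
```

If both lines then match, the difference seen earlier would come from the code page conversion applied on output rather than from the stored string data.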
