Hi,
I asked this on the mailing list but got no reply, so I'm trying here now.
Consider this code:
{$codepage utf8}
{$mode objfpc}
{$H+}
uses
  SysUtils, Classes;
var
  SL: TStringList;
  S: String;
begin
  writeln('DefaultSystemCodePage = ', DefaultSystemCodePage);
  SL := TStringList.Create;
  {$if fpc_fullversion > 30200}
  SL.WriteBom := False;
  {$endif}
  SL.SkipLastLineBreak := True;
  S := 'ä'; // S has CodePage CP_UTF8
  SL.Add(S);
  SL.SaveToFile('slU.txt'{$if fpc_fullversion > 30200}, TEncoding.UTF8{$endif});
  SL.SaveToFile('slA.txt'{$if fpc_fullversion > 30200}, TEncoding.ANSI{$endif});
  SL.Free;
end.
Tested with fpc trunk (from a few days ago).
It outputs:
DefaultSystemCodePage = 1252
(I'm on Windows as you might have guessed)
The file slA.txt contains the bytes C3 A4 (which is ä in UTF8 encoding)
The file slU.txt contains the bytes C3 83 C2 A4
I struggle to understand why.
What is the codepage of the stringlist's internal list of strings (the array of TStringItem)?
It seems that the stringlist always considers its internal TStringItem.FString, which holds #$C3#$A4, to have a codepage of CP_ACP?
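To check this directly, a minimal diagnostic sketch could print the dynamic code page and the raw bytes of both S and the string returned by SL[0]. The DumpString helper is hypothetical (it is not part of the program above), and I make no claim here about what it prints on a given compiler version:

```pascal
{$codepage utf8}
{$mode objfpc}
{$H+}
uses
  SysUtils, Classes;

// Hypothetical helper, for illustration only: prints the dynamic code page
// of a string and its raw bytes in hex. The RawByteString parameter is used
// so that passing the string in does not trigger a code page conversion.
procedure DumpString(const AName: String; const AValue: RawByteString);
var
  i: Integer;
begin
  write(AName, ': CodePage = ', StringCodePage(AValue), ', bytes =');
  for i := 1 to Length(AValue) do
    write(' ', IntToHex(Ord(AValue[i]), 2));
  writeln;
end;

var
  SL: TStringList;
  S: String;
begin
  SL := TStringList.Create;
  try
    S := 'ä';
    SL.Add(S);
    DumpString('S', S);          // the string before it enters the list
    DumpString('SL[0]', SL[0]);  // the string as the list hands it back
  finally
    SL.Free;
  end;
end.
```

If the two lines report different code pages for the same bytes, that would support the idea that the list treats its stored strings as CP_ACP.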
I just tested this variant:
In a new unit which does have {$codepage utf8} declare a const like
In the main sourcefile remove the {$codepage utf8}.
In the uses clause of the main sourcefile add the newly created unit.
Replace the line
with
Now build and run.
The file slA.txt will contain the single byte E4 ('ä' in codepage 1252).
The file slU.txt will contain the bytes C3 A4 ('ä' in codepage utf8).
Is this a bug?
Bart