Hello.
I've been trying to solve some problems with wrongly encoded text displayed by my applications, and I've found some very confusing behavior. I think it might be a bug, but I don't know where the bug would be, if it is a bug at all.
I've attached an MRE of the problem, and also a .gif of the MRE running. The gif is from a Windows 11 system, but the same behavior also happens on a Debian 12 system.
I have a source file encoded as ISO 8859-1. This file has string literals with non-ASCII characters. If I assign these literals directly to e.g. a Label caption, they're not displayed correctly, even with a {$CODEPAGE 8859-1} directive in the source. Two things confuse me:
- If I pass that same literal through a simple Format('%s', [...]) call, then it is displayed correctly!
- If I ask the compiler if the two strings (before and after the Format) are equal, it says they are!
Expressing the above as code:
{$CODEPAGE 8859-1}
unit Unit1;
// ...
begin
  Label1.Caption := 'Acentuação'; // Displays as 'Acentua?' (incorrect)
  Label1.Caption := Format('%s', ['Acentuação']); // Displays as 'Acentuação' (correct)
  if 'Acentuação' = Format('%s', ['Acentuação']) then
    ShowMessage('But they are the same!'); // This statement executes!
end;
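Since the equality test says the two strings match while the display differs, one way to dig further is to compare their raw bytes and their dynamic code pages. The following is only a minimal sketch of that idea, not part of the attached MRE: HexBytes and the program name are hypothetical, it is a plain console program rather than an LCL application (so the default system code page may differ from what the GUI sees), and the file has to be saved as ISO 8859-1 for the literal to match my sources.

```pascal
program CompareBytes;
{$mode objfpc}{$H+}
{$CODEPAGE 8859-1}  // assumes this file is saved as ISO 8859-1

uses
  SysUtils;

// Hypothetical helper: returns the bytes of S as hex.
// The RawByteString parameter avoids a code page conversion on the call.
function HexBytes(const S: RawByteString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    Result := Result + IntToHex(Ord(S[I]), 2) + ' ';
end;

var
  Direct, Formatted: string;
begin
  Direct := 'Acentuação';
  Formatted := Format('%s', ['Acentuação']);

  // Print the dynamic code page and the byte sequence of each string.
  WriteLn('Direct:    cp=', StringCodePage(Direct), ' bytes=', HexBytes(Direct));
  WriteLn('Formatted: cp=', StringCodePage(Formatted), ' bytes=', HexBytes(Formatted));
end.
```

If the two lines differ in code page or bytes, that would at least suggest the comparison converts both sides to a common encoding before comparing, while the assignment to the caption does not.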
P.S. Some background on why I must use {$CODEPAGE 8859-1} and cannot just convert my sources to UTF-8: these same source files must also compile with Delphi 7 and Kylix 3. We're doing a gradual migration from those two to FPC, and since we cannot afford to do the migration all at once (hundreds of millions of LOC), we need the same sources to compile and run on all three systems, at least for a while. In the worst-case scenario I'm ready to wrap all calls to UI elements in a function that deals with the encoding differences between the three, but I want to avoid that if at all possible.