
Accented characters in concatenated strings


Roland57:
Hello!

I am trying to display accented characters in a Windows console program. When the string is concatenated with another one, the accented characters aren't displayed correctly. Why?

Here is my code. I compiled it with FPC 3.0.0.


--- Code: Pascal ---
begin
  WriteLn('r'#130'pertoire');
  WriteLn('r'#130'pertoir' + 'e');
end.

engkin:

--- Code: Pascal ---
{$codepage UTF8}

begin
  WriteLn('répertoire');
  WriteLn('répertoir' + 'e');
end.

marcov:
It prints the same thing twice here (FPC 3.0 on the Win32 console).

Possible reasons:
- settings relating to source code encoding
- some Lazarus feature that changes the default encoding.

engkin:
The first one is written using fpc_write_text_shortstr, so no codepage conversion is involved. You'll see the correct letter if your console output codepage has é at #130.

The second one is written using fpc_Write_Text_AnsiStr, a codepage conversion happens from DefaultSystemCodePage to TextRec(Output).CodePage:

--- Code: Pascal ---
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  WriteLn('TextRec(Output).CodePage: ', TextRec(Output).CodePage);

This conversion could corrupt the letter based on the two codepages.
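
To see which of the two paths applies on a given machine, a small diagnostic sketch can print the code pages involved next to the code page the string itself is tagged with. It assumes FPC 3.0 or later on Windows. `GetConsoleOutputCP` (from the `Windows` unit), the program name, and the variable `s` are additions for illustration and do not come from the posts above.

```pascal
program showcodepages;
{$mode objfpc}{$H+}

uses
  Windows; // assumption: Windows-only, needed for GetConsoleOutputCP

var
  s: AnsiString; // hypothetical variable, for illustration only

begin
  // Code page the RTL assumes for AnsiStrings that carry no explicit one
  WriteLn('DefaultSystemCodePage:    ', DefaultSystemCodePage);
  // Code page recorded for the Output text file
  WriteLn('TextRec(Output).CodePage: ', TextRec(Output).CodePage);
  // Code page the console itself uses to render the bytes it receives
  WriteLn('GetConsoleOutputCP:       ', GetConsoleOutputCP);

  // s is an AnsiString; StringCodePage reports the code page it is tagged with
  s := 'r'#130'pertoir';
  s := s + 'e';
  WriteLn('StringCodePage(s):        ', StringCodePage(s));
end.
```

If the first two values differ, the conversion described above takes place for AnsiStrings. Byte #130 is é in OEM code pages such as 437 and 850, but it is a low quotation mark in Windows-1252, so reading the same byte under a different code page changes the letter.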

Roland57:
@engkin, marcov

Thank you for your answers.

Indeed, with {$codepage UTF8} the result is correct. (The source code encoding is UTF-8 without BOM.)

But I get a warning: Implicit string type conversion from "AnsiString" to "UnicodeString". It isn't really a problem (since the result is correct), but I wonder what I should write to avoid the warning.

Here is my full code:


--- Code: Pascal ---
program touchdirectory;
{$codepage UTF8}

uses
  SysUtils, DateUtils, Process;

const
  TOUCH = 'C:\BCC101\bin\touch.exe'; // https://www.embarcadero.com/fr/free-tools/ccompiler

var
  year, month, day, hour, minute, second, millisecond: word;
  stamp, path: string;

begin
  if (ParamCount = 1) and DirectoryExists(ParamStr(1)) then
  begin
    path := ParamStr(1) + '\';
    WriteLn('Traitement du répertoire "' + path + '".');
  end else
  begin
    path := '';
    WriteLn('Traitement du répertoire courant.');
  end;

  DecodeDateTime(Now(), year, month, day, hour, minute, second, millisecond);
  stamp := Format('%0.2d%0.2d%0.2d%0.2d%0.2d', [month, day, hour, 0, year mod 100]);

  with TProcess.Create(nil) do
  begin
    Executable := TOUCH;
    Parameters.Add('-d' + stamp);
    Parameters.Add('-D');
    Parameters.Add('-s');
    Parameters.Add('-v');
    Parameters.Add(path + '*.*');
    Options := Options + [poWaitOnExit];
    Execute;
    Free;
  end;

  Write('Appuyez sur Entrée pour continuer... ');
  ReadLn;
end.
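
On the warning mentioned in this post: with `{$codepage UTF8}`, a literal containing non-ASCII characters is probably treated as a UnicodeString constant, so concatenating it with an AnsiString variable forces a conversion. The following is a minimal sketch, not a confirmed fix. It assumes that an explicit typecast is enough to silence the implicit-conversion warning in FPC 3.0, which the thread does not confirm. The program name and the sample path are hypothetical.

```pascal
program explicitcast;
{$codepage UTF8}

var
  path: AnsiString; // declared explicitly so the sketch does not depend on {$H+}

begin
  path := 'C:\temp\'; // hypothetical sample value
  // The explicit UnicodeString(...) cast states the conversion on purpose
  // instead of leaving it to the compiler as an implicit one.
  WriteLn('Traitement du répertoire "' + UnicodeString(path) + '".');
end.
```

The cast is not meant to change what is printed; it only makes visible in the source the conversion that the compiler was already performing.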
