
Author Topic: [HELP!!] - Pacal - Displaying Unicode [Showing UTF8 in console pascal]  (Read 24407 times)

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11455
  • FPC developer.
Re: [HELP!!] - Pacal - Displaying Unicode [Showing UTF8 in console pascal]
« Reply #30 on: November 25, 2014, 08:17:14 pm »
That works for me if I select Lucida Console and add "{$codepage utf-8}" to the source.

engkin

  • Hero Member
  • *****
  • Posts: 3112
« Reply #31 on: November 25, 2014, 08:43:50 pm »
Quote from: marcov on November 25, 2014, 08:17:14 pm
That works for me if I select Lucida Console and add "{$codepage utf-8}" to the source.

That's a good way to make sure the file is treated as UTF-8, thanks. I hope it works for the OP as well.

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
« Reply #32 on: November 25, 2014, 08:50:25 pm »
This works for me on Windows without any other commands:

Code: [Select]
{$mode objfpc}{$H+}
WriteLn(UTF8ToSys('α'));

Edit: the file encoding set in the editor (File Settings / Encoding) is UTF-8.
« Last Edit: November 25, 2014, 08:55:54 pm by Never »
Νέπε Λάζαρε λάγγεψων οξωκά ο φίλοσ'ς αραεύσε

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11455
  • FPC developer.
« Reply #33 on: November 25, 2014, 09:48:02 pm »
Never: That means the alpha has a representation in your current character set.  Other characters might not.
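
One rough way to check whether a character has a representation in a given code page is to convert it there and back and compare the result. This is only a minimal sketch, assuming FPC 2.7.1 or later with code-page-aware strings. The helper name SurvivesRoundTrip and the choice of code page 737 are illustrative and not from this thread. The outcome depends on the conversion tables of the platform, so no expected output is given.

```pascal
program RoundTripCheck;
{$mode objfpc}{$H+}
{$codepage utf8}  // the source file itself must be saved as UTF-8

{$ifdef unix}
uses
  cwstring;  // installs a widestring manager so conversions work on Unix
{$endif}

const
  TargetCP = 737;  // assumption: DOS Greek, used only as an example

// Hypothetical helper: True if U is unchanged after converting it to
// CodePage and back to UTF-8, i.e. every character survived the trip.
function SurvivesRoundTrip(const U: UTF8String; CodePage: TSystemCodePage): Boolean;
var
  R: RawByteString;
begin
  R := U;                          // same bytes, still tagged as UTF-8
  SetCodePage(R, CodePage, True);  // convert to the target code page
  SetCodePage(R, CP_UTF8, True);   // convert back to UTF-8
  Result := (R = U);
end;

var
  Alpha, Heart: UTF8String;
begin
  Alpha := 'α';
  Heart := '♥';
  // Names are printed in ASCII so the report is readable on any console.
  WriteLn('alpha survives CP ', TargetCP, ': ', SurvivesRoundTrip(Alpha, TargetCP));
  WriteLn('heart survives CP ', TargetCP, ': ', SurvivesRoundTrip(Heart, TargetCP));
end.
```

Replacing TargetCP with the code page your own console reports should show which characters UTF8ToSys-style conversions can be expected to keep.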

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
« Reply #34 on: November 25, 2014, 11:12:56 pm »
@marcov I agree with you.
All Greek letters print fine on my system, not only 'α'.

engkin

  • Hero Member
  • *****
  • Posts: 3112
« Reply #35 on: November 26, 2014, 03:32:46 am »
Quote from: Never on November 25, 2014, 08:50:25 pm
this is working without any other command for me on win

Code: [Select]
{$mode objfpc}{$H+}
WriteLn(UTF8ToSys('α'));

It does not work here for two reasons:
1. On my system UTF8ToSys('α') will not produce the correct value, because my system uses a different code page than yours.
2. I'm using FPC 2.7.1 with FPC_HAS_CPSTRING defined. This means that if the code page of a string differs from the code page of the Output record, a conversion happens.

The compiler translates WriteLn to a few calls to fpc_Write_Text_***.
Compare for instance fpc_Write_Text_AnsiStr on the trunk:
Code: [Select]
Procedure fpc_Write_Text_AnsiStr (...
...
  {$if defined(FPC_HAS_CPSTRING) and defined(FPC_HAS_FEATURE_ANSISTRINGS)}
  if slen > 0 then
    if TextRec(f).CodePage<>TranslatePlaceholderCP(StringCodePage(S)) then
    begin
      a:=fpc_AnsiStr_To_AnsiStr(S,TextRec(f).CodePage);  //<---------This conversion
      fpc_WriteBuffer(f,PAnsiChar(a)^,Length(a));

with fpc_Write_Text_AnsiStr in 2.6.4:
Code: [Select]
Procedure fpc_Write_Text_AnsiStr (...
...
  if slen > 0 then
    fpc_WriteBuffer(f,PChar(S)^,SLen);

Using DOS Greek code page (737), I can type that letter as a single character if I know its correct value in this code page (α is #$98):
Code: [Select]
const
  CP_DosGreek = 737;
...
  if SetConsoleOutputCP(CP_DosGreek) then
    WriteLn(#$98); // instead of UTF8ToSys('α')
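
To see why the raw byte matters, it can help to dump the bytes of the same letter in both encodings. The following is a sketch under the same assumptions as the rest of this post (a compiler with FPC_HAS_CPSTRING, source saved as UTF-8). DumpBytes is a hypothetical helper, not an RTL routine. α is U+03B1, which is the two bytes CE B1 in UTF-8, while in code page 737 it should be the single byte given above.

```pascal
program ShowBytes;
{$mode objfpc}{$H+}
{$codepage utf8}  // the source file itself must be saved as UTF-8

{$ifdef unix}
uses
  cwstring;  // widestring manager, needed for the conversion on Unix
{$endif}

// Hypothetical helper: print every byte of S as two hex digits.
procedure DumpBytes(const Caption: string; const S: RawByteString);
var
  I: Integer;
begin
  Write(Caption, ':');
  for I := 1 to Length(S) do
    Write(' ', HexStr(Ord(S[I]), 2));
  WriteLn;
end;

var
  U: UTF8String;
  R: RawByteString;
begin
  U := 'α';
  DumpBytes('UTF-8', U);      // the bytes as stored in the UTF8String

  R := U;                     // same bytes, still tagged as UTF-8
  SetCodePage(R, 737, True);  // convert the content to DOS Greek
  DumpBytes('CP 737', R);     // the bytes a CP 737 console expects
end.
```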

Notice that SetConsoleOutputCP will not change the code page in the Output record. Maybe setdefaultcodepage mentioned by Marcov does the trick, but I don't know where to find it.

I can also write that letter using CP_UTF8:
Code: [Select]
  PreviousValue := GetConsoleOutputCP;

  //Using UTF8
  WriteLn('Using CP_UTF8');
  if not SetConsoleOutputCP(CP_UTF8) then
  begin
    WriteLn('SetConsoleOutputCP(CP_UTF8) Failed!');
    exit;
  end;

  S := 'α';

  {$ifdef FPC_HAS_CPSTRING}  //<--- Based on your compiler
  WriteLn('Check codepages: ','GetTextCodePage(Output): ', GetTextCodePage(Output),', StringCodePage(s): ', StringCodePage(s));
  WriteLn();
  WriteLn('This may not work: ',s);

  if GetTextCodePage(Output)<>StringCodePage(s) then
  begin
    WriteLn();
    WriteLn('Changing Output codepage from ',GetTextCodePage(Output),' to ',StringCodePage(s));
    SetTextCodePage(Output, StringCodePage(s));
  end;
  {$endif FPC_HAS_CPSTRING}

  WriteLn('This works: ',s);//Now it works!

  SetConsoleOutputCP(PreviousValue);


marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11455
  • FPC developer.
« Reply #36 on: November 27, 2014, 09:36:18 am »
DefaultSystemCodePage is the code page that code page 0 translates to. So if you set it to CP_UTF8, write a UTF8String, and do the things described earlier in this thread to prepare stdout for UTF-8, this should work in 2.7.1.

In fact, if you followed Mattias's mails over the last few days, Lazarus is thinking of going in that direction for the coming FPC 2.8 iteration.

engkin

  • Hero Member
  • *****
  • Posts: 3112
« Reply #37 on: November 27, 2014, 04:04:33 pm »
Quote from: marcov on November 27, 2014, 09:36:18 am
DefaultSystemCodePage is the codepage that codepage 0 translates to. So if you set it to CP_UTF8, write an utf8string, and do the things (earlier in this thread) to prepare stdout for utf8, this would work in 2.7.1.

Like this?
Code: [Select]
var
  PreviousValue: UINT;
  S: UTF8String;

begin
  PreviousValue := GetConsoleOutputCP;

  //Using UTF8
  DefaultSystemCodePage := CP_UTF8;

  WriteLn('Using CP_UTF8');
  if not SetConsoleOutputCP(CP_UTF8) then
  begin
    WriteLn('SetConsoleOutputCP(CP_UTF8) Failed!');
    exit;
  end;

  S := '║ A - α - ♥ ║';

  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  WriteLn('StringCodePage(S): ', StringCodePage(S));

  WriteLn('WriteLn(s): ', S);

  SetConsoleOutputCP(PreviousValue);
end.

This does *not* work on 2.7.1/Win32 because there is a hidden conversion to TextRec(Output).CodePage.

When a console app is started five text records are initialized in a call to SysInitStdIO:
Code: [Select]
procedure SysInitStdIO;
..
     OpenStdIO(Input,fmInput,StdInputHandle);
     OpenStdIO(Output,fmOutput,StdOutputHandle);//<------
     OpenStdIO(ErrOutput,fmOutput,StdErrorHandle);
     OpenStdIO(StdOut,fmOutput,StdOutputHandle);
     OpenStdIO(StdErr,fmOutput,StdErrorHandle);
...

All of the fmOutput records get the same code page, which comes from the registry:
Code: [Select]
procedure OpenStdIO(var f:text;mode:longint;hdl:thandle);
...
        TextRec(f).CodePage:=WideStringManager.GetStandardCodePageProc(scpConsoleOutput);

The initial code page value comes from:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage\OEMCP

WriteLn then converts the string to TextRec(Output).CodePage, which is *not* going to be CP_UTF8, and that conversion corrupts the data before it is written.

That's why I added:
Code: [Select]
  {$ifdef FPC_HAS_CPSTRING}
...
    SetTextCodePage(Output, StringCodePage(S));
...
  {$endif FPC_HAS_CPSTRING}

Sorry if I misunderstood what you said.

Quote from: marcov on November 27, 2014, 09:36:18 am
In fact, if you watched Mattias mails over the last few days, Lazarus is thinking going in that direction for the coming FPC 2.8 iteration.

I think you are referring to the threads "Codepage aware RTL" and "Trying to understand the wiki-Page 'FPC Unicode support'".

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11455
  • FPC developer.
« Reply #38 on: November 27, 2014, 05:08:47 pm »
So you would need to close and reopen stdin and stdout again? I don't know if the classic way to do that is still allowed :-)

Code: [Select]
close(input); assign(input,''); reset(input);
close(output); assign(output,''); rewrite(output); // output must be reopened for writing

engkin

  • Hero Member
  • *****
  • Posts: 3112
« Reply #39 on: November 27, 2014, 05:44:34 pm »
Quote from: marcov on November 27, 2014, 05:08:47 pm
So you would need to close and reopen stdin and stdout again? I don't know if the classic way to do that is still allowed :-)

Code: [Select]
close(input); assign(input,''); reset(input);
close(output); assign(output,''); rewrite(output);

Nope, it is enough to call SetTextCodePage(Output, CP_UTF8).
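
Putting the pieces of this thread together, a complete console program might look like the sketch below. It assumes the source file is saved as UTF-8, that the console uses a font with the required glyphs (for example Lucida Console), and that the compiler is FPC 2.7.1 or later when FPC_HAS_CPSTRING is defined. The program name and the variable PreviousCP are illustrative.

```pascal
program Utf8Console;
{$mode objfpc}{$H+}
{$codepage utf8}  // string literals in this file are UTF-8

{$ifdef Windows}
uses
  Windows;  // GetConsoleOutputCP, SetConsoleOutputCP
{$endif}

var
  S: UTF8String;
  {$ifdef Windows}
  PreviousCP: UINT;
  {$endif}
begin
  {$ifdef Windows}
  // Remember the console code page and switch the console to UTF-8.
  PreviousCP := GetConsoleOutputCP;
  if not SetConsoleOutputCP(CP_UTF8) then
  begin
    WriteLn('SetConsoleOutputCP(CP_UTF8) failed');
    Halt(1);
  end;
  {$endif}

  {$ifdef FPC_HAS_CPSTRING}
  // Tell the RTL that Output expects UTF-8, so that WriteLn does not
  // convert the string to the code page recorded at startup.
  SetTextCodePage(Output, CP_UTF8);
  {$endif}

  S := '║ A - α - ♥ ║';
  WriteLn(S);

  {$ifdef Windows}
  // Restore the console code page that was active before.
  SetConsoleOutputCP(PreviousCP);
  {$endif}
end.
```

The two steps are independent: SetConsoleOutputCP changes how the console interprets bytes, while SetTextCodePage changes what the RTL writes, so both are needed on a code-page-aware compiler.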

 
