Author Topic: UTF-8  (Read 12331 times)

baltas99

  • Newbie
  • Posts: 5
UTF-8
« on: November 30, 2014, 04:01:26 pm »
I have this statement in my code: Writeln(UTF8ToConsole('Το πραγραμμα εκτελεστηκε επιτυχως!Πατηστε οποιοδηποτε κουμπι για εξοδο')); The text is Greek. I have added FileUtils and the program runs with no errors, but in cmd I see ?? ?????????, that is, question marks instead of Greek letters. When I changed the statement to Writeln(UTF8ToConsole('Компилируйся сцуко')); (Russian), it again ran without errors and cmd showed the Russian text correctly. Why doesn't Greek work?
PS: I use Windows 8.1 Professional 64-bit with an AMD FX-6300 processor.

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: UTF-8
« Reply #1 on: November 30, 2014, 05:21:08 pm »
Try this:
Code: [Select]
program project5;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}
{$codepage utf-8}

uses
  Windows;

var
  PreviousValue: UINT;
  s: String;

begin
  PreviousValue := GetConsoleOutputCP;

  //Using UTF8
  if not SetConsoleOutputCP(CP_UTF8) then
  begin
    WriteLn('SetConsoleOutputCP(CP_UTF8) Failed!');
    exit;
  end;

  S := 'Το πραγραμμα εκτελεστηκε επιτυχως!Πατηστε οποιοδηποτε κουμπι για εξοδο';

  {$ifdef FPC_HAS_CPSTRING}  //<--- Based on your compiler
  if GetTextCodePage(Output)<>StringCodePage(s) then
    SetTextCodePage(Output, StringCodePage(s));
  {$endif FPC_HAS_CPSTRING}

  WriteLn(s);

  SetConsoleOutputCP(PreviousValue);
end.

Edit:
Sorry, forgot to answer your question:
Why greek doesnt work?
By default, your Windows console uses a code page that supports Russian but not Greek, so the Greek letters have no mapping and come out as question marks. The code above switches the console to UTF-8 instead. Alternatively, you can use a code page that supports Greek, but then you have to convert your strings from UTF-8 to CP_DOSGreek before passing them to the console.
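
A minimal sketch of that alternative, under some assumptions: it targets code page 737 (the Windows identifier for OEM Greek) in place of CP_DOSGreek, the helper name Utf8ToCodePage is hypothetical, and the Greek letter is written as explicit UTF-8 bytes so the result does not depend on how the source file is saved. The raw bytes go out through WriteFile so that no further string conversion happens on the way.

```pascal
program greek_oem_sketch;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  Windows;

const
  CP_GREEK_OEM = 737;  // assumption: OEM Greek; adjust if your system uses another one

// Hypothetical helper: re-encode UTF-8 bytes into TargetCP, going through UTF-16.
function Utf8ToCodePage(const Utf8: AnsiString; TargetCP: UINT): AnsiString;
var
  Wide: WideString;
  WideLen, OutLen: Integer;
begin
  Result := '';
  if Utf8 = '' then
    Exit;
  // First call asks for the required UTF-16 length, second call converts.
  WideLen := MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(Utf8), Length(Utf8), nil, 0);
  if WideLen <= 0 then
    Exit;
  SetLength(Wide, WideLen);
  MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(Utf8), Length(Utf8), PWideChar(Wide), WideLen);
  // Same two-step pattern for UTF-16 -> target code page.
  OutLen := WideCharToMultiByte(TargetCP, 0, PWideChar(Wide), WideLen, nil, 0, nil, nil);
  if OutLen <= 0 then
    Exit;
  SetLength(Result, OutLen);
  WideCharToMultiByte(TargetCP, 0, PWideChar(Wide), WideLen, PAnsiChar(Result), OutLen, nil, nil);
end;

var
  PreviousCP: UINT;
  Utf8Text, OemText: AnsiString;
  Written: DWORD;

begin
  // 'a - α' followed by CR/LF; #$CE#$B1 is the UTF-8 encoding of α.
  Utf8Text := 'a - '#$CE#$B1#13#10;

  PreviousCP := GetConsoleOutputCP;
  if not SetConsoleOutputCP(CP_GREEK_OEM) then
  begin
    WriteLn('SetConsoleOutputCP(737) failed; the code page may not be installed.');
    Exit;
  end;

  OemText := Utf8ToCodePage(Utf8Text, CP_GREEK_OEM);
  Written := 0;
  if OemText <> '' then
    WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), OemText[1], Length(OemText), Written, nil);

  // Restore whatever the console was using before.
  SetConsoleOutputCP(PreviousCP);
end.
```

The console font still has to contain Greek glyphs for the output to be readable.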
« Last Edit: November 30, 2014, 05:30:41 pm by engkin »

baltas99

  • Newbie
  • Posts: 5
Re: UTF-8
« Reply #2 on: November 30, 2014, 05:50:12 pm »
Thank you for your help. However, I still have the same problem: http://postimg.org/image/st3pztoet/
If I simply use writeln('GREEK TEXT') and run it on Linux, will it work?
« Last Edit: November 30, 2014, 05:53:50 pm by baltas99 »

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: UTF-8
« Reply #3 on: November 30, 2014, 06:02:24 pm »
Does this work for you:
Code: [Select]
program project6;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}
{$codepage utf-8}

uses
  Windows;

var
  PreviousValue: UINT;
  s: String;
  c: DWORD;

begin
  PreviousValue := GetConsoleOutputCP;

  //Using UTF8
  if not SetConsoleOutputCP(CP_UTF8) then
  begin
    WriteLn('SetConsoleOutputCP(CP_UTF8) Failed!');
    exit;
  end;

  S := 'Το πραγραμμα εκτελεστηκε επιτυχως!Πατηστε οποιοδηποτε κουμπι για εξοδο';

  if not WriteFile(StdOutputHandle,s[1],Length(s),c,nil) then
     WriteLn('WriteFile Failed! ');

  {$ifdef FPC_HAS_CPSTRING}  //<--- Based on your compiler
  if GetTextCodePage(Output)<>StringCodePage(s) then
    SetTextCodePage(Output, StringCodePage(s));
  {$endif FPC_HAS_CPSTRING}

  //WriteLn(s);

  SetConsoleOutputCP(PreviousValue);
end.

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
Re: UTF-8
« Reply #4 on: November 30, 2014, 06:02:38 pm »
This will work
Code: [Select]
program project_greek2;

{$mode objfpc}{$H+}


var
  s: WideString;
  s1: String;
const
  c = 'a - α';
  c1 = 'α';
begin
  s:='a  -  α';
  s1:='a  -  α';
  WriteLn('a - α');
  WriteLn(s);
  WriteLn(s1);
  WriteLn(c);
  WriteLn(c1);
  WriteLn( 'α' );
  WriteLn('==========================');

  WriteLn(Utf8ToAnsi('a - α'));
  WriteLn(Utf8ToAnsi(s));
  WriteLn(Utf8ToAnsi(s1));
  WriteLn(Utf8ToAnsi(c));
  WriteLn(Utf8ToAnsi(c1));
  WriteLn( Utf8ToAnsi('α') );
  ReadLn;
end.
Edit: this works if the file encoding is UTF-8.

@engkin: some kind of double or triple conversion seems to happen when the code page is set.
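
One way to see what actually ends up in a string is to dump its bytes. The sketch below assumes the source file is saved as UTF-8 without a BOM and contains no {$codepage} directive, so the compiler should store the literal byte for byte; with a directive or a BOM the result may differ, which is the kind of conversion suspected here. The bytes CE B1 would mean the string still holds UTF-8 for α, while a single 3F would mean it was already replaced by a question mark.

```pascal
program literal_bytes_sketch;

{$mode objfpc}{$H+}

var
  s: AnsiString;
  i: Integer;

begin
  s := 'α';  // stored as-is only under the assumptions stated above

  // Print the byte length and each byte as two hex digits.
  Write('Length = ', Length(s), ', bytes =');
  for i := 1 to Length(s) do
    Write(' ', HexStr(Ord(s[i]), 2));
  WriteLn;
end.
```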
« Last Edit: November 30, 2014, 06:05:28 pm by Never »
Νέπε Λάζαρε λάγγεψων οξωκά ο φίλοσ'ς αραεύσε

baltas99

  • Newbie
  • Posts: 5
Re: UTF-8
« Reply #5 on: November 30, 2014, 06:12:45 pm »
Thank you again, but it didn't work: http://postimg.org/image/gmc63xd4r/

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
Re: UTF-8
« Reply #6 on: November 30, 2014, 06:16:04 pm »
Do you have Greek installed as a second input language in your regional settings?
« Last Edit: November 30, 2014, 06:21:44 pm by Never »

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
Re: UTF-8
« Reply #7 on: November 30, 2014, 06:33:38 pm »

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: UTF-8
« Reply #8 on: November 30, 2014, 09:18:58 pm »
@Never, did you change the font?

First you have to choose a font like Lucida Console.

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
Re: UTF-8
« Reply #9 on: November 30, 2014, 09:30:13 pm »
@engkin
Yes, I changed the font as well, but the effect is the same.
Do you want me to upload screenshots for these also?

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: UTF-8
« Reply #10 on: November 30, 2014, 10:01:31 pm »
do you want me to upload screenshots for these also?
@Never, no. You already did.

Can you try this code?
Code: [Select]
program project5;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}
{$codepage utf-8}

uses
  Windows;

var
  PreviousValue: UINT;
  s: String;
  c: DWORD;  // receives the number of bytes written by WriteFile

  procedure Test;
  begin
    WriteLn('GetConsoleOutputCP: ', GetConsoleOutputCP);
    WriteLn('GetTextCodePage(Output): ', GetTextCodePage(Output));
    WriteLn('StringCodePage(s): ', StringCodePage(s));
  end;

begin
  s := 'a - α';
  Test;

  WriteLn('WriteLn: ',s);
  WriteLn('');

  PreviousValue := GetConsoleOutputCP;

  //Using UTF8
  if not SetConsoleOutputCP(CP_UTF8) then
  begin
    WriteLn('SetConsoleOutputCP(CP_UTF8) Failed!');
    exit;
  end;

  Test;
  WriteLn('WriteFile: ');
  if not WriteFile(StdOutputHandle,s[1],Length(s),c,nil) then
     WriteLn('WriteFile Failed! ');
  WriteLn('');
  WriteLn('');

  if GetTextCodePage(Output)<>StringCodePage(s) then
    SetTextCodePage(Output, StringCodePage(s));

  Test;
  WriteLn('WriteLn: ', s);

  SetConsoleOutputCP(PreviousValue);
end.

Try it on a new console.

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
Re: UTF-8
« Reply #11 on: November 30, 2014, 10:11:46 pm »
@engkin

I used another example of yours with

Code: [Select]

{$calling stdcall}
function GetConsoleOutputCP: UINT; external 'kernel32' name 'GetConsoleOutputCP';
function SetConsoleOutputCP(wCodePageID: UINT): WINBOOL; external 'kernel32' name 'SetConsoleOutputCP';
function WriteFile(hFile: THandle; const Buffer; nNumberOfBytesToWrite: DWORD; var lpNumberOfBytesWritten: DWORD; lpOverlapped: pointer): BOOL; external 'kernel32' name 'WriteFile';
and it worked!

Edit: and now it also works with Windows in the uses clause!


« Last Edit: November 30, 2014, 10:13:23 pm by Never »

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: UTF-8
« Reply #12 on: November 30, 2014, 10:25:25 pm »
Try it on a new console.

Edit:
The only explanation I have now is that:
1- SetConsoleOutputCP from unit Windows is not working. <-- does not make sense.

OR

2- CP_UTF8 holds a wrong value, not 65001. <-- does not make sense either.  :-\
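
Both guesses can be checked directly. This is only a diagnostic sketch that prints the value of CP_UTF8 and reports whether SetConsoleOutputCP succeeded; it uses plain ASCII output so the result is readable under any console code page.

```pascal
program check_cp_sketch;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  Windows;

var
  Prev: UINT;

begin
  // Guess 2: is the constant really 65001?
  WriteLn('CP_UTF8 = ', CP_UTF8);

  Prev := GetConsoleOutputCP;
  WriteLn('Console output CP before: ', Prev);

  // Guess 1: does the call succeed, and does the console report the new value?
  if SetConsoleOutputCP(CP_UTF8) then
    WriteLn('Console output CP after:  ', GetConsoleOutputCP)
  else
    WriteLn('SetConsoleOutputCP failed, GetLastError = ', GetLastError);

  // Put the original code page back.
  SetConsoleOutputCP(Prev);
end.
```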

« Last Edit: November 30, 2014, 10:30:41 pm by engkin »

Never

  • Sr. Member
  • ****
  • Posts: 409
  • OS:Win7 64bit / Lazarus 1.4
Re: UTF-8
« Reply #13 on: November 30, 2014, 10:36:12 pm »
I can't find GetTextCodePage.
I added lazutils and lazutf8, but it still can't be found.
So I added this:
Code: [Select]

type TSystemCodePage=word;
....
 function GetTextCodePage(var T: Text): TSystemCodePage;
  begin
  {$if defined(FPC_HAS_CPSTRING) and defined(FPC_HAS_FEATURE_ANSISTRINGS)}
  GetTextCodePage:=TextRec(T).CodePage;
  {$else}
  GetTextCodePage:=0;
  {$endif}
  end; 

and for StringCodePage I didn't find anything except

function StringCodePage(const S : RawByteString): TSystemCodePage; overload;

StringCodePage is used in lazutils/lazutf8.
Are files missing from my installation?
I searched the Lazarus installation directory with a find-in-files tool.

Bart

  • Hero Member
  • *****
  • Posts: 5275
    • Bart en Mariska's Webstek
Re: UTF-8
« Reply #14 on: November 30, 2014, 10:47:29 pm »
I guess it's in FPC trunk?

Bart

 
