Topic: UTF8 output messing up console

BigChimp

  • Hero Member
  • *****
  • Posts: 5740
  • Add to the wiki - it's free ;)
    • FPCUp, PaperTiger scanning and other open source projects
UTF8 output messing up console
« on: July 15, 2012, 02:44:27 pm »
Hi,

FPC 2.6.1 r21747

Earlier I had a problem getting Unicode/UTF8 output working on my Windows Vista console:
http://lazarus.freepascal.org/index.php/topic,13949

The following program works, but after it is finished, the console does not seem to recognize batch files or perhaps does not show any output. It does recognize commands such as dir and ipconfig.
Code: [Select]
program uniconsole;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}
    {$IFDEF UseCThreads}
    cthreads,
    {$ENDIF}
  {Widestring manager needed for widestring support}
  cwstring,
  {$ENDIF}
  {$IFDEF WINDOWS}
  Windows, {for setconsoleoutputcp}
  {$ENDIF}
  Classes
  ;

var
UTF8TestString: string;

begin
{$IFDEF WINDOWS}
SetConsoleOutputCP(CP_UTF8);
{$ENDIF}
// File encoded as UTF8 without BOM
// The next line should print rose, wodka (Cyrillic) and ouzo (Greek)
UTF8TestString:= 'rosé, водка and ούζο';
writeln ('UTF8 test: ' + UTF8TestString);
// once this program is finished, my console is messed up:
// it won't find executables/batch files if i try to run them
end.

I've also tried saving the current console codepage in the beginning and resetting the console to that codepage, but that didn't seem to make any difference. (Will have to look to see if I can find that code again)

What am I doing wrong?
Want quicker answers to your questions? Read http://wiki.lazarus.freepascal.org/Lazarus_Faq#What_is_the_correct_way_to_ask_questions_in_the_forum.3F

Open source including papertiger OCR/PDF scanning:
https://bitbucket.org/reiniero

Lazarus trunk+FPC trunk x86, Windows x64 unless otherwise specified

Leledumbo

  • Hero Member
  • *****
  • Posts: 8757
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: UTF8 output messing up console
« Reply #1 on: July 15, 2012, 02:47:12 pm »
Set the console output cp back to its previous value?

BigChimp

Re: UTF8 output messing up console
« Reply #2 on: July 15, 2012, 03:29:39 pm »
Mmmm. Am I going crazy? This seems to work with FPC 2.6.1:
At first this seemed to work. However, my console is messed up again...

There's still something wrong  >:(

Aargggh  >:D
Code: [Select]
program uniconsole;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}
    {$IFDEF UseCThreads}
    cthreads,
    {$ENDIF}
  {Widestring manager needed for widestring support}
  cwstring,
  {$ENDIF}
  {$IFDEF WINDOWS}
  Windows, {for setconsoleoutputcp}
  ctypes,
  {$ENDIF}
  Classes
  ;


var
{$IFDEF WINDOWS}
CurrentOutputCP: cuint;
{$ENDIF}
UTF8TestString: string;

begin
{$IFDEF WINDOWS}
// see
// http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036%28v=vs.85%29.aspx
CurrentOutputCP:=GetConsoleOutputCP;
SetConsoleOutputCP(CP_UTF8);
//Note: no use of input console: SetConsoleCP and GetConsoleCP?
{$ENDIF}
// File encoded as UTF8 without BOM
// The next line should print rose, wodka (Cyrillic) and ouzo (Greek)
UTF8TestString:= 'rosé, водка and ούζο';
writeln ('UTF8 test: ' + UTF8TestString);
{$IFDEF WINDOWS}
// reset console
SetConsoleOutputCP(CurrentOutputCP);
{$ENDIF}
end.
« Last Edit: July 15, 2012, 04:00:19 pm by BigChimp »
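A minimal sketch of the same save-and-restore idea, wrapped in try..finally so that the output codepage is put back even if an exception escapes the main block. The program and variable names are illustrative, and SysUtils is included only so that runtime errors are raised as exceptions. The source file is assumed to be saved as UTF8 without BOM, as in the post above; this only shows the restore, not a fix for the output itself.

```pascal
program restorecp;

{$mode objfpc}{$H+}

uses
  {$IFDEF WINDOWS}
  Windows, {for GetConsoleOutputCP/SetConsoleOutputCP}
  {$ENDIF}
  SysUtils;

{$IFDEF WINDOWS}
var
  SavedOutputCP: UINT; // illustrative name, not from the original post
{$ENDIF}

begin
  {$IFDEF WINDOWS}
  // remember the codepage the console had before we touched it
  SavedOutputCP := GetConsoleOutputCP;
  SetConsoleOutputCP(CP_UTF8);
  try
  {$ENDIF}
    // whether this prints correctly depends on the FPC version,
    // as discussed in this thread
    writeln('UTF8 test: rosé, водка and ούζο');
  {$IFDEF WINDOWS}
  finally
    // runs on normal completion and when an exception propagates
    SetConsoleOutputCP(SavedOutputCP);
  end;
  {$ENDIF}
end.
```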

BigChimp

Re: UTF8 output messing up console
« Reply #3 on: July 15, 2012, 04:01:50 pm »
Nope, still problems. I suppose I'll have to keep track of what I'm doing etc because at first it did work... then I changed a larger program (my twitter console demo) and ran it, now I'm seeing the same problem (batch files not being recognized).
Don't know what caused it exactly.

Silvio Clécio

  • Guest
Re: UTF8 output messing up console
« Reply #4 on: July 15, 2012, 07:53:58 pm »
"try {$codepage utf8} or add an utf8 BOM because your constant is encoded in utf8.

Best regards,
Paul Ishenin".

See thread in:

http://www.mail-archive.com/fpc-devel@lists.freepascal.org/msg26232.html

Good luck. ;)

BigChimp

Re: UTF8 output messing up console
« Reply #5 on: July 15, 2012, 08:18:49 pm »
@Paul, thanks I've read that before.
To clarify: I don't have any problems displaying the UTF8 characters in the program (as long as I made sure the source file was UTF8 encoded without BOM).

I also read this wiki section:
http://wiki.lazarus.freepascal.org/LCL_Unicode_Support#FPC_codepages
If I understand correctly, that advises not to use the codepage directive.

My problem is that after the program finishes, the console (and I think other open console windows) does not seem to find .cmd files anymore. Closing all the console windows and opening new ones seems to fix this behaviour.

Do you think e.g. adding a UTF8 BOM will help this behaviour?

Thanks
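One way to narrow this down, sketched here as a tiny helper program (the name showcp is made up): run it in the affected console right after the test program has finished. If either value is still 65001 (CP_UTF8), the codepage was not restored; if both are back to the original values, the cause is probably elsewhere.

```pascal
program showcp;

{$mode objfpc}{$H+}

{$IFDEF WINDOWS}
uses
  Windows; {for GetConsoleCP/GetConsoleOutputCP}
{$ENDIF}

begin
  {$IFDEF WINDOWS}
  // input and output codepages are separate settings of the console
  writeln('Console input codepage:  ', GetConsoleCP);
  writeln('Console output codepage: ', GetConsoleOutputCP);
  {$ELSE}
  writeln('Console codepages are a Windows concept; nothing to report here.');
  {$ENDIF}
end.
```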

Silvio Clécio

Re: UTF8 output messing up console
« Reply #6 on: July 15, 2012, 08:25:20 pm »
Try a test, I think it will work.

It is best to use only the codepage directive for UTF8 support instead of a lot of extra code.

KpjComp

  • Hero Member
  • *****
  • Posts: 680
Re: UTF8 output messing up console
« Reply #7 on: July 15, 2012, 08:43:45 pm »
Well, running batch files afterwards works fine for me, but the output is incorrect.
I'm on Windows 7 64-bit, FPC 2.7.1.

But using WriteConsole, instead of Writeln works fine.

Code: [Select]
program consoleTest;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}
    {$IFDEF UseCThreads}
    cthreads,
    {$ENDIF}
  {Widestring manager needed for widestring support}
  cwstring,
  {$ENDIF}
  {$IFDEF WINDOWS}
  Windows, {for setconsoleoutputcp}
  ctypes,
  {$ENDIF}
  Classes
  ;


var
{$IFDEF WINDOWS}
CurrentOutputCP: cuint;
{$ENDIF}
UTF8TestString: string;
bw:Dword;

begin
{$IFDEF WINDOWS}
// see
// http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036%28v=vs.85%29.aspx
CurrentOutputCP:=GetConsoleOutputCP;
SetConsoleOutputCP(CP_UTF8);
//Note: no use of input console: SetConsoleCP and GetConsoleCP?
{$ENDIF}
// File encoded as UTF8 without BOM
// The next line should print rose, wodka (Cyrillic) and ouzo (Greek)
//0xEF,0xBB,0xBF
UTF8TestString:= 'rosé, водка and ούζο';
WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE),@UTF8TestString[1],length(UTF8TestString),bw,nil);
{$IFDEF WINDOWS}
// reset console output codepage (SetConsoleCP would change the input codepage instead)
SetConsoleOutputCP(CurrentOutputCP);
{$ENDIF}
end.

PS: Make sure you set your console's font to Lucida Console.

BigChimp

Re: UTF8 output messing up console
« Reply #8 on: July 15, 2012, 08:52:30 pm »
It is set to Lucida Console ;)

My program output indeed fails when compiled with FPC 2.7.1, but works correctly with FPC 2.6.1.

Pfff.

BigChimp

Re: UTF8 output messing up console
« Reply #9 on: July 15, 2012, 09:05:30 pm »
Quote
Try a test, I think it will work.

It is best to use only the codepage directive for UTF8 support instead of a lot of extra code.

Sorry, Silvio, missed your reply.

Do you mean the {$codepage utf8} is a replacement for the SetConsoleOutputCP(CP_UTF8) calls?
Nope, doesn't work - both FPC 2.6.1 and 2.7.1 give garbled output - although different characters...:
Code: [Select]
{$codepage utf8}
program uniconsole2;

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}
    {$IFDEF UseCThreads}
    cthreads,
    {$ENDIF}
  {Widestring manager needed for widestring support}
  cwstring,
  {$ENDIF}
  {$IFDEF WINDOWS}
  Windows, {for setconsoleoutputcp}
  ctypes,
  {$ENDIF}
  Classes
  ;


var
UTF8TestString: string;

begin
// File encoded as UTF8 without BOM
// The next line should print rose, wodka (Cyrillic) and ouzo (Greek)
UTF8TestString:= 'rosé, водка and ούζο';
writeln ('UTF8 test: ' + UTF8TestString);
// once this program is finished, my console is messed up:
// it won't find executables/batch files if i try to run them
end.

KpjComp

Re: UTF8 output messing up console
« Reply #10 on: July 15, 2012, 09:08:05 pm »
Quote
Pfff.

I assume writeConsole will work for both.

BigChimp

Re: UTF8 output messing up console
« Reply #11 on: July 15, 2012, 09:17:58 pm »
Could be ;)
It's just that it's ridiculous to have to pull these tricks just to put letters on the screen. There must be an easier way...
Also, I have a sneaking suspicion WriteConsole may be windows-specific?!?!

There's probably been some improvement/change in FPC trunk that I missed... and you just might need to set another compiler switch or something...

Edit: of course, that still leaves me with my console problems after running e.g. my more complicated twitter test program... but perhaps best get the basics cleared up first.

KpjComp

Re: UTF8 output messing up console
« Reply #12 on: July 15, 2012, 09:29:29 pm »
Quote
I have a sneaking suspicion WriteConsole may be windows-specific?!?!

I'm pretty sure it is.. yes.
Like your example, use the good old $ifdef

Code: [Select]
{$ifdef WINDOWS}
procedure WriteLnUTF8(s:string);
var
  bw:dword;
begin
  s := s + #13#10;
  WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE),@s[1],length(s),bw,nil);
end;
{$else}
procedure WriteLnUTF8(s:string);
begin
  writeln(s);
end;
{$endif} 

Quote
Edit: of course, that still leaves me with my console problems after running

Maybe it's related!!!..  I'm not getting the problem so hard to tell.

Quote
It's just that it's ridiculous to have to pull these tricks

All the years I've used Delphi, that's often been the case.   %)
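A variant of the helper above, sketched under the assumption that s holds UTF8 bytes (source saved as UTF8 without BOM, no codepage directive). It converts to UTF-16 with UTF8Decode and calls WriteConsoleW, so it should not depend on the console output codepage. WriteConsoleW fails when output is redirected to a file or pipe, so the sketch falls back to writeln in that case. The helper name is reused from the post; the program name and local variables are illustrative.

```pascal
program writeutf8demo;

{$mode objfpc}{$H+}

{$IFDEF WINDOWS}
uses
  Windows; {for GetStdHandle/WriteConsoleW}
{$ENDIF}

// Assumption: s contains UTF8 encoded bytes.
procedure WriteLnUTF8(const s: string);
{$IFDEF WINDOWS}
var
  ws: WideString;
  written: DWORD;
{$ENDIF}
begin
  {$IFDEF WINDOWS}
  // convert UTF8 to UTF-16 and append a Windows line ending
  ws := UTF8Decode(s) + #13#10;
  // the length is given in UTF-16 code units, not bytes
  if WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), PWideChar(ws),
                   Length(ws), written, nil) then
    Exit;
  // WriteConsoleW failed, e.g. because output is redirected: fall through
  {$ENDIF}
  writeln(s);
end;

begin
  WriteLnUTF8('rosé, водка and ούζο');
end.
```

The fallback keeps redirection such as `writeutf8demo > out.txt` working, at the cost of writing the raw UTF8 bytes to the file.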

BigChimp

Re: UTF8 output messing up console
« Reply #13 on: July 16, 2012, 10:23:16 am »
Thanks KpjComp, something like that would indeed be a more palatable alternative.

Even after looking at
http://wiki.lazarus.freepascal.org/User_Changes_Trunk
I still don't get though why valid code in 2.6 won't run correctly on trunk, so I've asked the same question on the mailing list:
http://www.mail-archive.com/fpc-pascal@lists.freepascal.org/msg29074.html

KpjComp

Re: UTF8 output messing up console
« Reply #14 on: July 16, 2012, 11:29:01 am »
@Bigchimp, I'm not getting any issues with running Batch Files after running the console App.

But one idea,  I wonder if on your system the echo mode gets changed.

Could you try typing ->
Code: [Select]
echo on
After running your console App, and then run your batch files.

If the above does work, you could then use ->
var oldMode:dword;
->before
GetConsoleMode(GetStdHandle(STD_OUTPUT_HANDLE),oldmode); 
->after
SetConsoleMode(GetStdHandle(STD_OUTPUT_HANDLE),oldMode);

Or even more simple, could one of your batch files be altering the echo mode?
« Last Edit: July 16, 2012, 11:41:29 am by KpjComp »
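The before/after idea from this post, sketched as a complete program. It uses the same GetConsoleMode/SetConsoleMode calls as the snippet above; the program name and the extra variables are illustrative. Whether the console mode is really what changes on the affected system is untested here, so treat this as an experiment rather than a fix.

```pascal
program keepmode;

{$mode objfpc}{$H+}

{$IFDEF WINDOWS}
uses
  Windows; {for GetConsoleMode/SetConsoleMode}

var
  hOut: HANDLE;
  oldMode: DWORD;
  haveMode: Boolean;
{$ENDIF}

begin
  {$IFDEF WINDOWS}
  hOut := GetStdHandle(STD_OUTPUT_HANDLE);
  // GetConsoleMode returns false if stdout is not a console (e.g. redirected)
  haveMode := GetConsoleMode(hOut, oldMode);
  {$ENDIF}

  writeln('... the actual console work goes here ...');

  {$IFDEF WINDOWS}
  // only restore a mode that was successfully read
  if haveMode then
    SetConsoleMode(hOut, oldMode);
  {$ENDIF}
end.
```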
