Topic: WriteConsoleOutput ignores high order byte in UnicodeChar buffer?

drknopp

  • Newbie
  • Posts: 2
Hi,

I am trying to write Unicode characters to the console using WriteConsoleOutput from the Windows unit. My code looks something like this:

Code: Pascal
uses
  Windows;
var
  Buffer : array of TCharInfo;
  hConsole : THandle;
...
hConsole := GetStdHandle(STD_OUTPUT_HANDLE);
...
procedure Draw(x, y : Integer; c : UnicodeChar; col : Word);
begin
  // One cell per console position, stored row by row.
  Buffer[y*console.Width + x].UnicodeChar := c;
  Buffer[y*console.Width + x].Attributes := col;
end;

procedure Update();
const
  bufCoord : TCoord = (X:0; Y:0);
  bufSize : TCoord = (X:80; Y:30);  // TCoord has the fields X and Y
  rectWindow : SMALL_RECT = (Left : 0; Top: 0; Right: 79; Bottom: 29);
begin
  WriteConsoleOutput(hConsole, @Buffer[0], bufSize, bufCoord, rectWindow);
end;

The full code is of course somewhat longer but I am essentially following this C/C++ code to set up the console: https://github.com/OneLoneCoder/videos/blob/master/olcConsoleGameEngine.h

Now this works fine with ASCII characters; however, it appears that characters wrap around once you hit 256. That is, drawing character 65 ('A') gives the same result as drawing character 256+65, which suggests to me that the console is still drawing in ASCII mode, not in UnicodeChar mode. I am trying to draw the block characters from the above GitHub link (PIXEL_SOLID = 0x2588; PIXEL_THREEQUARTERS = 0x2593; ...) but so far with no success. To draw a solid block I am trying:
Code: Pascal
Draw(x, y, #$2588, $0c0a);

I do have
Code: Pascal
{$mode objfpc}{$H+}
{$codepage utf8}
at the top of my files, and I experimented with setting the code page with calls like
Code: Pascal
SetConsoleOutputCP(DefaultSystemCodePage);
SetTextCodePage(Output, DefaultSystemCodePage);
However, so far no success whatsoever.

Any ideas why my console only likes to show ASCII?

engkin

  • Hero Member
  • *****
  • Posts: 2513
Re: WriteConsoleOutput ignores high order byte in UnicodeChar buffer?
« Reply #1 on: February 19, 2019, 12:36:44 am »
What value do you have for DefaultSystemCodePage?

Remy Lebeau

  • Hero Member
  • *****
  • Posts: 680
    • Lebeau Software
« Reply #2 on: February 19, 2019, 03:20:56 am »
The full code is of course somewhat longer but I am essentially following this C/C++ code to set up the console: https://github.com/OneLoneCoder/videos/blob/master/olcConsoleGameEngine.h

Now this works fine with ASCII characters, however it appears that characters wrap around once you hit 256.

That would make perfect sense if WriteConsoleOutput() maps to WriteConsoleOutputA() instead of to WriteConsoleOutputW(). Which is likely the case given that FreePascal still uses Ansi strings and Ansi RTL/Win32 functions by default.
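
The wrap-around at 256 can be reproduced without any console at all. The following is only a sketch: TDemoChar is a hypothetical stand-in for the character part of CHAR_INFO, where the 16-bit UnicodeChar and the 8-bit AsciiChar share the same storage. It assumes a little-endian CPU such as x86/x64.

```pascal
program CharInfoLowByte;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in for the character part of CHAR_INFO:
  // both fields overlay the same memory.
  TDemoChar = packed record
    case Boolean of
      False: (UnicodeChar: WideChar);  // 16 bits, the view a W function uses
      True:  (AsciiChar: AnsiChar);    // 8 bits, the view an A function uses
  end;

var
  ci: TDemoChar;
begin
  ci.UnicodeChar := WideChar(256 + 65);
  // On a little-endian CPU the low byte is stored first, so the
  // 8-bit view sees 65 ('A') and the high byte is simply not read.
  WriteLn(Ord(ci.AsciiChar));
end.
```

An ANSI-flavoured call that reads only the 8-bit field would therefore treat 256+65 and 65 as the same character, which matches the behaviour described in the question.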

If you look at the C/C++ code you linked to, it has this compile-time check in it:

Quote
Code: C
#ifndef UNICODE
#error Please enable UNICODE for your compiler! VS: Project Properties -> General -> \
Character Set -> Use Unicode. Thanks! - Javidx9
#endif

That means the C/C++ code requires WriteConsoleOutput() to map to WriteConsoleOutputW(), not to WriteConsoleOutputA().  So, in your code, just call WriteConsoleOutputW() directly, then it should write your UnicodeChar values as expected.
Remy Lebeau
Lebeau Software - Owner, Developer
Internet Direct (Indy) - Admin, Developer (Support forum)
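
A minimal sketch of the suggestion above, calling WriteConsoleOutputW explicitly. The constants ConsoleWidth, ConsoleHeight and PIXEL_SOLID, the fixed 80x30 size, and the chosen colour are assumptions made for illustration and are not taken from the poster's full program. It is Windows-only, and the block character will only show if the console font contains it.

```pascal
program ConsoleBlocksW;
{$mode objfpc}{$H+}

uses
  Windows;

const
  // Assumed console size, matching the 80x30 buffer in the question.
  ConsoleWidth  = 80;
  ConsoleHeight = 30;
  PIXEL_SOLID   = WideChar($2588);  // full block, value from the linked header

var
  Buffer: array of TCharInfo;
  hConsole: THandle;

procedure Draw(x, y: Integer; c: WideChar; col: Word);
begin
  Buffer[y * ConsoleWidth + x].UnicodeChar := c;
  Buffer[y * ConsoleWidth + x].Attributes := col;
end;

procedure Update;
var
  bufSize, bufCoord: TCoord;
  rectWindow: SMALL_RECT;
begin
  bufSize.X := ConsoleWidth;
  bufSize.Y := ConsoleHeight;
  bufCoord.X := 0;
  bufCoord.Y := 0;
  rectWindow.Left := 0;
  rectWindow.Top := 0;
  rectWindow.Right := ConsoleWidth - 1;
  rectWindow.Bottom := ConsoleHeight - 1;
  // Explicit W suffix: the full 16-bit UnicodeChar field is used,
  // regardless of what the unsuffixed name maps to.
  WriteConsoleOutputW(hConsole, @Buffer[0], bufSize, bufCoord, rectWindow);
end;

var
  x, y: Integer;
begin
  hConsole := GetStdHandle(STD_OUTPUT_HANDLE);
  SetLength(Buffer, ConsoleWidth * ConsoleHeight);
  // Clear the buffer to spaces, then place one solid block.
  for y := 0 to ConsoleHeight - 1 do
    for x := 0 to ConsoleWidth - 1 do
      Draw(x, y, ' ', FOREGROUND_GREEN or FOREGROUND_INTENSITY);
  Draw(10, 5, PIXEL_SOLID, FOREGROUND_GREEN or FOREGROUND_INTENSITY);
  Update;
end.
```

Naming the W variant explicitly keeps the call independent of how the Windows unit was compiled, so the same source behaves the same way on any FPC installation.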

drknopp

  • Newbie
  • Posts: 2
« Reply #3 on: February 19, 2019, 07:56:26 am »
That would make perfect sense if WriteConsoleOutput() maps to WriteConsoleOutputA() instead of to WriteConsoleOutputW(). Which is likely the case given that FreePascal still uses Ansi strings and Ansi RTL/Win32 functions by default.

That means the C/C++ code requires WriteConsoleOutput() to map to WriteConsoleOutputW(), not to WriteConsoleOutputA().  So, in your code, just call WriteConsoleOutputW() directly, then it should write your UnicodeChar values as expected.

That was totally it. Thank you!

Thaddy

  • Hero Member
  • *****
  • Posts: 9278
« Reply #4 on: February 19, 2019, 08:06:49 am »
As a side note : {$mode delphiunicode} will switch to the Windows Unicode API (W) automatically, so WriteConsoleOutput will map to WriteConsoleOutputW in that mode.
also related to equus asinus.

PascalDragon

  • Hero Member
  • *****
  • Posts: 705
  • Compiler Developer
« Reply #5 on: February 19, 2019, 09:21:13 am »
No, it won't. Stop spreading such nonsense.  >:( It depends on whether the Windows unit was compiled with FPC_OS_UNICODE defined. Aside from that, changing the mode would not affect an already compiled unit, as modes and modeswitches are specific to the unit they are used in (and those passed on the command line apply only to the units being compiled).

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7590
« Reply #6 on: February 19, 2019, 09:44:39 am »

That would make perfect sense if WriteConsoleOutput() maps to WriteConsoleOutputA() instead of to WriteConsoleOutputW(). Which is likely the case given that FreePascal still uses Ansi strings and Ansi RTL/Win32 functions by default.

(The latter is not true: the core RTL has used the W functions since 3.0, and most functions are overloaded for ansistring/unicodestring, so things like \\?\ escapes also work.)

Thaddy

  • Hero Member
  • *****
  • Posts: 9278
« Reply #7 on: February 19, 2019, 11:47:36 am »
No, it won't. Stop spreading such nonsense.  >:( It depends on whether the Windows unit was compiled with FPC_OS_UNICODE defined. Aside from that, changing the mode would not affect an already compiled unit, as modes and modeswitches are specific to the unit they are used in (and those passed on the command line apply only to the units being compiled).
I tested it! It is not nonsense.  Although I might have used a full unicode version, that's true. I might also have the sourcecode in the path anyway.
« Last Edit: February 19, 2019, 11:57:12 am by Thaddy »

ASerge

  • Hero Member
  • *****
  • Posts: 1419
« Reply #8 on: February 19, 2019, 02:22:33 pm »
(The latter is not true: the core RTL has used the W functions since 3.0, and most functions are overloaded for ansistring/unicodestring, so things like \\?\ escapes also work.)
Things like \\?\ work well in both ANSI and Unicode. Internally, the "A" functions simply convert their strings to Unicode and then call the "W" functions.

Thaddy

  • Hero Member
  • *****
  • Posts: 9278
« Reply #9 on: February 19, 2019, 03:41:13 pm »
Yes. At the WinAPI level, the A functions have mostly been proxies to the W functions since at least Windows 2000. That doesn't mean they always behave the same.

Remy Lebeau

  • Hero Member
  • *****
  • Posts: 680
    • Lebeau Software
« Reply #10 on: February 19, 2019, 07:01:38 pm »
Things like \\?\ work well in both ANSI and Unicode.

Not always.  Some APIs support them only in the Unicode versions and not in the ANSI versions.

Internally, the "A" functions simply convert their strings to Unicode and then call the "W" functions.

Usually, yes.  But there are cases where they are not simple proxies like that.

ASerge

  • Hero Member
  • *****
  • Posts: 1419
« Reply #11 on: February 19, 2019, 07:09:34 pm »
Things like \\?\ work well in both ANSI and Unicode.
Not always.  Some APIs support them only in the Unicode versions and not in the ANSI versions.
I do not understand the logic. I could equally say that some APIs do not work on old Windows versions, and some work only on servers. I'm talking about \\?\ paths, which work in ANSI as well. Of course, if a function does not exist, there is nothing to call.

Remy Lebeau

  • Hero Member
  • *****
  • Posts: 680
    • Lebeau Software
« Reply #12 on: February 21, 2019, 10:47:58 pm »
I'm talking about \\?\ paths, which work in ANSI as well. Of course, if a function does not exist, there is nothing to call.

Prepending "\\?\" to a path only works in Unicode functions, and even then only in the functions that are documented to support it.  Whether or not the corresponding ANSI functions pass such a path string to their Unicode counterparts is handled on a per-function basis, and there are ANSI functions that do not do this.

ASerge

  • Hero Member
  • *****
  • Posts: 1419
« Reply #13 on: February 22, 2019, 09:23:30 pm »
Prepending "\\?\" to a path only works in Unicode functions, and even then only in the functions that are documented to support it.  Whether or not the corresponding ANSI functions pass such a path string to their Unicode counterparts is handled on a per-function basis, and there are ANSI functions that do not do this.
No! For example, https://docs.microsoft.com/en-us/windows/desktop/LearnWin32/working-with-strings says: "Internally, the ANSI version translates the string to Unicode".
So if a function has both ANSI and Unicode versions and the Unicode version supports the '\\?\' syntax, then obviously (apply logic, colleague) the ANSI version will also support it.
Test it! It has worked since Windows 2000.
Of course, where only an ANSI version or only a Unicode version exists, the behavior cannot be compared.
Give an example where the Unicode function works with '\\?\' but the ANSI one does not.