
Author Topic: Wrong utf8 chars in Windows 11  (Read 4636 times)

bigeno

  • Sr. Member
  • ****
  • Posts: 266
Wrong utf8 chars in Windows 11
« on: October 25, 2021, 01:37:05 pm »
I need help: on Windows XP, 7 and 10 everything is OK, but on Windows 11 I'm getting wrong UTF-8 characters in my app.
My strings are stored in a DLL and I load them with the LoadString function.
Code: Pascal
  buffer: array[0..512] of char;  // ANSI buffer, filled by LoadStringA
  res: string;

  if langDLL = dynlibs.NilHandle then exit;
  // nBufferMax is a character count; k receives the number of characters copied
  k := LoadString(langDLL, i, buffer, SizeOf(buffer));
  SetString(res, buffer, k);

Can you help with this problem?

AlexTP

  • Hero Member
  • *****
  • Posts: 2386
    • UVviewsoft
Re: Wrong utf8 chars in Windows 11
« Reply #1 on: October 25, 2021, 01:55:08 pm »
Can you provide a compilable project?

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Wrong utf8 chars in Windows 11
« Reply #2 on: October 25, 2021, 02:55:30 pm »
Maybe your resources are stored in Unicode?
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

bigeno

  • Sr. Member
  • ****
  • Posts: 266
Re: Wrong utf8 chars in Windows 11
« Reply #3 on: October 25, 2021, 03:14:15 pm »
>Maybe your resources are stored in Unicode?
They are saved in UTF-8, the same as the project. Strings from the project source are OK; only the ones from my DLL resource are wrong. On Windows versions before 11 everything is OK. So did something change in user32 for Windows 11?

function LoadString(hInstance:HINST; uID:UINT; lpBuffer:LPSTR; nBufferMax:longint):longint; external 'user32' name 'LoadStringA';

wp

  • Hero Member
  • *****
  • Posts: 11855
Re: Wrong utf8 chars in Windows 11
« Reply #4 on: October 25, 2021, 04:08:12 pm »
It is working for me when I create the res file by specifying the codepage of my system:
Code:
windres.exe -c 1252 -i .\texts.rc -o .\texts.res
I have never worked with these resource utilities before...

https://sourceware.org/binutils/docs/binutils/windres.html says:
Quote
-c val
--codepage val

    Specify the default codepage to use when reading an rc file.
Since the rc file is in UTF-8, I'd have expected that the codepage should be specified as 65001 (UTF-8). Strange...

Maybe it would be clearer if there were a "W" version of LoadString function.

bigeno

  • Sr. Member
  • ****
  • Posts: 266
Re: Wrong utf8 chars in Windows 11
« Reply #5 on: October 25, 2021, 04:12:31 pm »
Thanks, with 1252 it works on Windows 11 here. But the file is UTF-8...

bigeno

  • Sr. Member
  • ****
  • Posts: 266
Re: Wrong utf8 chars in Windows 11
« Reply #6 on: October 25, 2021, 04:14:07 pm »
But now there is a problem on Windows 10...

AlexTP

  • Hero Member
  • *****
  • Posts: 2386
    • UVviewsoft
Re: Wrong utf8 chars in Windows 11
« Reply #7 on: October 25, 2021, 04:15:25 pm »
>Maybe it would be clearer if there were a "W" version of LoadString function.
Can you use LoadStringW instead?

wp

  • Hero Member
  • *****
  • Posts: 11855
Re: Wrong utf8 chars in Windows 11
« Reply #8 on: October 25, 2021, 04:26:00 pm »
>But now there is a problem on Windows 10...
I'm afraid doing the Win 11 upgrade yesterday was more destructive than I thought...

bigeno

  • Sr. Member
  • ****
  • Posts: 266
Re: Wrong utf8 chars in Windows 11
« Reply #9 on: October 25, 2021, 06:40:38 pm »
I simply changed the call to LoadStringW and nothing changed.
It seems that the problem is with the resources (windres <-> Windows 11).
At least I can temporarily provide a Windows 11-specific patch for my application, built with -c 1252.
However, I really don't understand why 1252 works when my system uses 1250.
Magic...

wp

  • Hero Member
  • *****
  • Posts: 11855
Re: Wrong utf8 chars in Windows 11
« Reply #10 on: October 25, 2021, 06:50:36 pm »
Did you try creating the res file with lazres (in the tools folder of your Lazarus installation)? I use it a lot for palette images and other image resources, but have not yet used it for text resources.

There's also an fpcres in the fpc/bin folder.

But I'd first give lazres a try because it originates in the UTF8 world of Lazarus...

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Wrong utf8 chars in Windows 11
« Reply #11 on: October 26, 2021, 08:42:26 am »
The problem is most likely a triple codepage conversion. First you have your rc file, which can be in different codepages. I guess resources are stored in UTF-16, so a conversion is made there. Then you call LoadStringA, where they're converted back to ANSI. And then they're converted from ANSI to UTF-8. No wonder something can go wrong. Some "default" codepage settings are used at some point, and they seem to be different on Win10 and Win11.

Try using LoadStringW, but with an explicit UTF8Encode. PChars are tricky: automatic codepage conversion doesn't work for them.
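
The suggestion above (LoadStringW plus an explicit UTF8Encode) might look roughly like the following. This is only a minimal sketch, not a verified fix for the Windows 11 behaviour: the helper name LoadResStringUtf8, the file name lang.dll and the resource ID 1 are assumptions made for illustration, while LoadStringW, UTF8Encode and the dynlibs calls are the standard Free Pascal ones.

```pascal
program LoadStringWSketch;
{$mode objfpc}{$H+}

uses
  Windows, dynlibs;

// Hypothetical helper (not from the thread): reads string resource `id`
// from `langDLL` as UTF-16 and returns it encoded as UTF-8.
function LoadResStringUtf8(langDLL: TLibHandle; id: UINT): UTF8String;
var
  buffer: array[0..512] of WideChar;  // UTF-16 buffer instead of char
  len: longint;
  wide: UnicodeString;
begin
  Result := '';
  if langDLL = dynlibs.NilHandle then Exit;
  // nBufferMax is counted in WideChars, so pass Length, not SizeOf
  len := LoadStringW(HINST(langDLL), id, @buffer[0], Length(buffer));
  if len <= 0 then Exit;
  SetString(wide, PWideChar(@buffer[0]), len);
  // Explicit UTF-16 -> UTF-8 conversion, no ANSI codepage involved
  Result := UTF8Encode(wide);
end;

var
  lib: TLibHandle;
begin
  lib := dynlibs.LoadLibrary('lang.dll');  // assumed file name
  if lib = dynlibs.NilHandle then Halt(1);
  WriteLn(LoadResStringUtf8(lib, 1));      // 1 is an assumed resource ID
  dynlibs.UnloadLibrary(lib);
end.
```

Note that the buffer is declared as WideChar and its size is passed as a character count. Swapping LoadStringA for LoadStringW while keeping the original char buffer and SizeOf would probably not be enough on its own.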

bigeno

  • Sr. Member
  • ****
  • Posts: 266
Re: Wrong utf8 chars in Windows 11
« Reply #12 on: October 26, 2021, 02:13:06 pm »
It looks like you're right. I'll have to check it out, but I think I can go the "no-encoding way": keep the resources in plain ASCII and use escape tokens, for example "#e" means ę, etc. That would be the best option in this case.
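
One way the "no-encoding" idea could be sketched: the resource strings stay pure ASCII, and the tokens are expanded to UTF-8 after loading. Only the "#e" to ę mapping comes from this post; the "#a" entry, the array names and the DecodeTokens helper are hypothetical. The UTF-8 bytes are written as numeric escapes so the example does not depend on the encoding of the source file.

```pascal
program EscapeTokenSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // ASCII tokens as they would appear in the resource strings.
  // Only '#e' is from the thread; '#a' is an assumed extra entry.
  Tokens: array[0..1] of string = ('#e', '#a');
  // Replacement characters as raw UTF-8 bytes:
  // U+0119 (e with ogonek) = C4 99, U+0105 (a with ogonek) = C4 85
  Utf8Chars: array[0..1] of string = (#$C4#$99, #$C4#$85);

// Hypothetical helper: expands every known token into its UTF-8 character.
function DecodeTokens(const s: string): string;
var
  i: Integer;
begin
  Result := s;
  for i := Low(Tokens) to High(Tokens) do
    Result := StringReplace(Result, Tokens[i], Utf8Chars[i], [rfReplaceAll]);
end;

begin
  // 'Prosz#e' stands in for a string loaded from the DLL
  WriteLn(DecodeTokens('Prosz#e'));
end.
```

A scheme like this needs a way to write a literal "#" followed by a letter, for example by reserving "##" for it; that detail is left out of the sketch.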

 
