Author Topic: concatenate two ansistring in console application  (Read 2125 times)

kjteng

  • Sr. Member
  • ****
  • Posts: 259
concatenate two ansistring in console application
« on: March 03, 2024, 02:29:44 pm »
In the following program, I am able to display S1 and S2 correctly but fail to display S3 (which is S1+S2). However, the same code works in a normal Windows GUI application. Hope someone can help point out my mistake. (Note: the debugger watch window shows S3 = ????, so I think there is no problem with the writeln statement.)

Code: Pascal
program concat1;

uses windows;
{$mode objfpc}{$H+}

{$R *.res}

var u1, u2: unicodestring;
    s1, s2, s3: AnsiString;


begin
  SetConsoleOutputCP(CP_utf8);
  u1 := utf8Decode('春夏');
  u2 := utf8Decode('秋冬');
  s1 := utf8Encode(u1);
  s2 := utf8Encode(u2);
  writeln(pchar(s1), #13#10, pchar(s2));
  s3 := s1 + s2;  // expect '春夏秋冬' but got '????' in console apps
  writeln(pchar(s3));
  readln;
end.

Attached is the GUI version (which displays s3 correctly).

cdbc

  • Hero Member
  • *****
  • Posts: 1673
    • http://www.cdbc.dk
Re: concatenate two ansistring in console application
« Reply #1 on: March 03, 2024, 03:35:18 pm »
Hi
In your console app, you could try this:
Code: Pascal
program concat1;
{$mode objfpc}{$H+} { MODE COMES HERE }
{$Codepage UTF8}   { make the sourcecode utf8 }

uses windows;
{ NOT HERE }
{$R *.res}

var u1, u2: unicodestring;
    s1, s2, s3: AnsiString;
  ...

Mode changed place and the codepage modifier lets you use unicode literals in source...
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 2.2.6 up until Jan 2024 from then on it's: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 3.0

Thaddy

  • Hero Member
  • *****
  • Posts: 16199
  • Censorship about opinions does not belong here.
Re: concatenate two ansistring in console application
« Reply #2 on: March 03, 2024, 03:50:39 pm »
Benny, it only makes the string literals unicode. Not the sourcecode.
If I smell bad code it usually is bad code and that includes my own code.

kjteng

  • Sr. Member
  • ****
  • Posts: 259
Re: concatenate two ansistring in console application
« Reply #3 on: March 03, 2024, 03:58:49 pm »
Thanks for the reply, but that does not solve the problem. If I am not wrong, FPC source and string literals are UTF-8 encoded by default.
P.S. I thought string literals are UTF-8 encoded by default in this mode?
« Last Edit: March 03, 2024, 04:03:18 pm by kjteng »

wp

  • Hero Member
  • *****
  • Posts: 12474
Re: concatenate two ansistring in console application
« Reply #4 on: March 03, 2024, 04:20:45 pm »
UTF8 in Windows console applications is still one of the greatest mysteries to me...

This is working (Win 11, Laz/main+FPC3.2.2): Add LazUtils to the project requirements and add LazUTF8 to the uses clause in order to get the UTF8 widestring manager. Then compile this code:
Code: Pascal
program concat1;
{$mode objfpc}{$H+}
uses
  LazUTF8;
var
  s1, s2, s3: AnsiString;
begin
  s1 := '春夏';
  s2 := '秋冬';
  writeln(s1, #13#10, s2);
  s3 := s1 + s2;
  writeln(s3);
  readln;
end.
Depending on the codepage selected, the utf8 strings will not be displayed correctly in the IDE, but when you open a separate console window, type "chcp 65001" to change the codepage to utf8 and then run the application, it will be ok. SetConsoleOutputCP() has no effect on my system...
« Last Edit: March 03, 2024, 04:25:51 pm by wp »

jamie

  • Hero Member
  • *****
  • Posts: 6735
Re: concatenate two ansistring in console application
« Reply #5 on: March 03, 2024, 04:25:03 pm »
Change the "AnsiString" to "RawByteString" or "Utf8String".
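
A minimal sketch of why the declared string type can matter here. It is untested against the original poster's setup and assumes FPC 3.x, where UTF8Encode returns a RawByteString tagged as UTF-8 and a concatenation assigned to a plain AnsiString is converted to DefaultSystemCodePage. The program name and variable names are made up for illustration. It prints only code page numbers, so the console font and code page do not affect the output:

```pascal
program codepage_check;
{$mode objfpc}{$H+}
{$codepage UTF8}   // assumption: this file is saved as UTF-8

var
  w1, w2: UnicodeString;
  a1, a2, a3: AnsiString;   // declared code page: CP_ACP (system default)
  u1, u2, u3: UTF8String;   // declared code page: CP_UTF8
begin
  w1 := '春夏';
  w2 := '秋冬';

  // Same pattern as the first post: UTF-8 bytes stored in AnsiString variables.
  a1 := UTF8Encode(w1);
  a2 := UTF8Encode(w2);
  a3 := a1 + a2;            // result follows the declared code page of a3

  // Same data, but the variables are declared as UTF8String.
  u1 := UTF8Encode(w1);
  u2 := UTF8Encode(w2);
  u3 := u1 + u2;            // result follows the declared code page of u3

  WriteLn('DefaultSystemCodePage = ', DefaultSystemCodePage);
  WriteLn('a1: ', StringCodePage(a1), '  a3: ', StringCodePage(a3),
          '  Length(a3) = ', Length(a3));
  WriteLn('u1: ', StringCodePage(u1), '  u3: ', StringCodePage(u3),
          '  Length(u3) = ', Length(u3));
end.
```

If a3 reports a different code page than a1 (for example the system ANSI code page instead of 65001), the concatenation converted the text, and characters that do not exist in that code page would be replaced by '?'. That would also be consistent with the GUI version working, since LCL applications normally switch DefaultSystemCodePage to UTF-8.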
The only true wisdom is knowing you know nothing

cdbc

  • Hero Member
  • *****
  • Posts: 1673
    • http://www.cdbc.dk
Re: concatenate two ansistring in console application
« Reply #6 on: March 03, 2024, 04:28:18 pm »
Hi
Thaddy, that was sort of what I meant by:
Quote
the codepage modifier lets you use unicode literals in source...
Regards Benny

tetrastes

  • Hero Member
  • *****
  • Posts: 600
Re: concatenate two ansistring in console application
« Reply #7 on: March 03, 2024, 10:01:24 pm »
Quote from: wp
SetConsoleOutputCP() has no effect on my system...

Try this
Code: Pascal
program concat1;
{$mode objfpc}{$H+}
uses
  LazUTF8,
  windows;
var
  s1, s2, s3: AnsiString;
begin
  SetConsoleOutputCP(CP_UTF8);
  s1 := '春夏';
  s2 := '秋冬';
  writeln(s1, #13#10, s2);
  writeln(pchar(s1), #13#10, pchar(s2));
  s3 := s1 + s2;
  writeln(s3);
  writeln(pchar(s3));
  readln;
end.
with different CPs in SetConsoleOutputCP, and with and without LazUTF8.
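
When running experiments like this, it may help to print the code pages involved before and after changing them. The following is only a diagnostic sketch (the program and procedure names are hypothetical); it uses GetConsoleOutputCP from the Windows unit and DefaultSystemCodePage and GetTextCodePage from the FPC System unit, and makes no claim about what values a particular system will show:

```pascal
program show_codepages;
{$mode objfpc}{$H+}
uses
  Windows;

// Prints the three code pages that influence console output.
procedure Report(const Title: string);
begin
  WriteLn(Title);
  // Code page the Windows console uses to interpret the bytes it receives.
  WriteLn('  Console output CP     : ', GetConsoleOutputCP);
  // Code page the RTL assumes for strings declared as plain AnsiString.
  WriteLn('  DefaultSystemCodePage : ', DefaultSystemCodePage);
  // Code page associated with the Output text file.
  WriteLn('  Output text code page : ', GetTextCodePage(Output));
end;

begin
  Report('Before:');
  SetConsoleOutputCP(CP_UTF8);        // tell the console to expect UTF-8
  SetTextCodePage(Output, CP_UTF8);   // tell the RTL to write UTF-8 to Output
  Report('After:');
  ReadLn;
end.
```

Comparing the "Before" and "After" values with and without LazUTF8 in the uses clause, or with and without the UTF-8 manifest, should show which setting actually changes on a given machine.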

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: concatenate two ansistring in console application
« Reply #8 on: March 03, 2024, 10:08:08 pm »
In Lazarus, go to settings -> application, then enable the manifest and the UTF-8 codepage.

wp

  • Hero Member
  • *****
  • Posts: 12474
Re: concatenate two ansistring in console application
« Reply #9 on: March 03, 2024, 10:24:39 pm »
Quote from: tetrastes
Try this [...] with different CPs in SetConsoleOutputCP, and with and without LazUTF8.
A very non-Pascal-ish way to use WriteLn...

Quote from: marcov
In Lazarus go to settings -> application, enable manifest and utf-8 codepage.
Ah, the code in the first post does show the utf8 characters when the app runs in the IDE if
- that utf8-manifest is active
- SetConsoleOutputCP has been set to UTF8.
But still no UTF8 when "ordinary" Pascal strings are in the WriteLn. For this, I have to run the exe separately in a console window for which I had changed codepage manually to 65001.
« Last Edit: March 03, 2024, 10:38:40 pm by wp »

tetrastes

  • Hero Member
  • *****
  • Posts: 600
Re: concatenate two ansistring in console application
« Reply #10 on: March 03, 2024, 10:46:03 pm »
Quote from: tetrastes
Try this [...] with different CPs in SetConsoleOutputCP, and with and without LazUTF8.
Quote from: wp
A very non-Pascal-ish way to use WriteLn...

You mean using pchar(s)? But thus we can output utf8 without forcing users to chcp...
EDIT: Though it may be better to use shortstrings...
« Last Edit: March 03, 2024, 10:54:54 pm by tetrastes »

Bart

  • Hero Member
  • *****
  • Posts: 5469
    • Bart en Mariska's Webstek
Re: concatenate two ansistring in console application
« Reply #11 on: March 03, 2024, 10:54:03 pm »
I found this, maybe it helps:
Code: Pascal
unit setdefaultcodepages;

interface

uses
  Windows;

implementation

Const
  LF_FACESIZE = 32;

Type
  CONSOLE_FONT_INFOEX = record
    cbSize      : ULONG;
    nFont       : DWORD;
    dwFontSizeX : SHORT;
    dwFontSizeY : SHORT;
    FontFamily  : UINT;
    FontWeight  : UINT;
    FaceName    : array [0..LF_FACESIZE-1] of WCHAR;
  end;

{ Only supported in Vista and onwards!}

function SetCurrentConsoleFontEx(hConsoleOutput: HANDLE; bMaximumWindow: BOOL; var CONSOLE_FONT_INFOEX): BOOL; stdcall; external 'kernel32.dll' name 'SetCurrentConsoleFontEx';
var
  New_CONSOLE_FONT_INFOEX : CONSOLE_FONT_INFOEX;

initialization
  writeln('SetDefaultCodepages unit initialization: DefaultSystemCodePage = ',DefaultSystemCodePage);
  {$ifdef DisableUTF8RTL}
  SetConsoleOutputCP(DefaultSystemCodePage);
  SetTextCodepage(Output, DefaultSystemCodePage);
  {$else}
  SetConsoleOutputCP(cp_utf8);
  SetTextCodepage(Output, cp_utf8);
  {$endif}

  FillChar(New_CONSOLE_FONT_INFOEX, SizeOf(CONSOLE_FONT_INFOEX), 0);
  New_CONSOLE_FONT_INFOEX.cbSize := SizeOf(CONSOLE_FONT_INFOEX);
//  New_CONSOLE_FONT_INFOEX.FaceName := 'Lucida Console';
  New_CONSOLE_FONT_INFOEX.FaceName := 'Consolas';
  New_CONSOLE_FONT_INFOEX.dwFontSizeX := 8;
  New_CONSOLE_FONT_INFOEX.dwFontSizeY := 16;

  SetCurrentConsoleFontEx(StdOutputHandle, False, New_CONSOLE_FONT_INFOEX);
end.

I remember being able to output all sorts of UTF-8 strings on the console using this.

Bart

wp

  • Hero Member
  • *****
  • Posts: 12474
Re: concatenate two ansistring in console application
« Reply #12 on: March 03, 2024, 10:56:27 pm »
Quote from: tetrastes
Try this [...] with different CPs in SetConsoleOutputCP, and with and without LazUTF8.
Quote from: wp
A very non-Pascal-ish way to use WriteLn...

Quote from: tetrastes
You mean using pchar(s)? But thus we can output utf8 without forcing users to chcp...
Yes, certainly. But why can't WriteLn do it by itself without forcing me to type-cast the string to PChar?

wp

  • Hero Member
  • *****
  • Posts: 12474
Re: concatenate two ansistring in console application
« Reply #13 on: March 03, 2024, 11:03:14 pm »
Ah, SetTextCodePage was missing - that's the solution. Thanks, Bart.
 
Code: Pascal
program utf8_console;
{$mode objfpc}{$H+}
{$R *.res}   // Do not remove. And check "Use manifest resource" and "ANSI codepage is UTF8" in project options
uses
  windows;
var
  s1, s2, s3: String;
begin
  SetConsoleOutputCP(CP_utf8);          // important
  SetTextCodepage(Output, cp_utf8);     // important

  s1 := '春夏';
  s2 :='秋冬';
  s3 := s1 + s2;
  WriteLn(s1, ' ', s2, ' ' , s3);

  Readln;
end.

Output (with code page of console window set to 1252):
春夏 秋冬 春夏秋冬
« Last Edit: March 03, 2024, 11:49:23 pm by wp »

Bart

  • Hero Member
  • *****
  • Posts: 5469
    • Bart en Mariska's Webstek
Re: concatenate two ansistring in console application
« Reply #14 on: March 03, 2024, 11:49:56 pm »
Quote from: wp
Ah, SetTextCodePage was missing - that's the solution. Thanks, Bart.

It's not my code.
It was posted here on the forum some years ago.

Bart

 
