
Author Topic: Extended ASCII use - 2  (Read 12286 times)

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Extended ASCII use - 2
« Reply #15 on: January 11, 2022, 09:36:38 pm »
Code: Pascal
  c:=char(i);   // ! This is invalid in utf8 and leads to runtime error 101

What would be the correct way considering:
Code: Pascal
  {$mode objfpc}{$H+}
  {$Codepage cp850}

  var
  ..
    i:byte;
    c:char;

tetrastes

  • Sr. Member
  • ****
  • Posts: 481
Re: Extended ASCII use - 2
« Reply #16 on: January 11, 2022, 09:52:31 pm »
In an ideal world it should return 4, imho.  >:(
By the way, on Linux
Code: C
  sizeof(wchar_t)
returns 4.  :-X

Thaddy

  • Hero Member
  • *****
  • Posts: 14373
  • Sensorship about opinions does not belong here.
Re: Extended ASCII use - 2
« Reply #17 on: January 11, 2022, 10:00:42 pm »
Yes.
Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

tetrastes

  • Sr. Member
  • ****
  • Posts: 481
Re: Extended ASCII use - 2
« Reply #18 on: January 11, 2022, 10:06:28 pm »
@engkin
To imitate CP850 on a UTF-8 console without errors:
Code: Pascal
  program project1;

  {$mode objfpc}{$H+}
  {$Codepage cp850}

  {$IFDEF UNIX}
  uses cwstring;    // We need properly working UnicodeManager, of course
  {$ENDIF}

  type
    CP850String = type AnsiString(850);

  var
    s:array[0..3] of String;
    ss:string;
    i:byte;
    c:char;
  begin
    WriteLn('DefaultSystemCodePage: ',DefaultSystemCodePage);
    WriteLn('TextRec(Output).CodePage: ',TextRec(Output).CodePage);
    ss:='ÄÇýÝ';
    WriteLn(ss);

    //CP850: #$80..#$FF
    s[0]:='ÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜø£Ø׃';
    s[1]:='áíóúñѪº¿®¬½¼¡«»░▒▓│┤ÁÂÀ©╣║╗╝¢¥┐';
    s[2]:='└┴┬├─┼ãÃ╚╔╩╦╠═╬¤ðÐÊËÈıÍÎÏ┘┌█▄¦Ì▀';
    s[3]:='ÓßÔÒõÕµþÞÚÛÙýݯ´­±‗¾¶§÷¸°¨·¹³²■';
    for i:=$80 to $FF do
    begin
      if (i mod 32)=0 then
      begin
        WriteLn();
        WriteLn();
        WriteLn(s[i div 32 - 4]);
      end;
      if i in [$07,$08,$09,$0A,$0D] then
        c:=' '
      else
        c:=char(i);
      Write(CP850String(c));    // SIC!
    end;
  end.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: Extended ASCII use - 2
« Reply #19 on: January 11, 2022, 10:38:03 pm »
Quote from: tetrastes on January 11, 2022, 09:52:31 pm
In an ideal world it should return 4, imho.  >:(

Totally agree. Go Linux!

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Extended ASCII use - 2
« Reply #20 on: January 11, 2022, 10:55:11 pm »
@tetrastes,

If chcp reports 850, what is the difference between:
Code: Pascal
  Write(c);

and

Code: Pascal
  Write(CP850String(c));

tetrastes

  • Sr. Member
  • ****
  • Posts: 481
Re: Extended ASCII use - 2
« Reply #21 on: January 11, 2022, 11:24:16 pm »
Casting char to string? The output is the same.
It's funny that on Windows there is a difference between
Code: Pascal
  write(c)
which outputs in the console code page, and
Code: Pascal
  write(string(c))
which outputs in the default system code page.
And you can change the two independently.  %)

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Extended ASCII use - 2
« Reply #22 on: January 11, 2022, 11:37:44 pm »
Quote from: tetrastes on January 11, 2022, 11:24:16 pm
Casting char to string? The output is the same.
It's funny that on Windows there is a difference between
Code: Pascal
  write(c)
which outputs in the console code page, and

Right, and that is what raymond wants.

tetrastes

  • Sr. Member
  • ****
  • Posts: 481
Re: Extended ASCII use - 2
« Reply #23 on: January 11, 2022, 11:48:12 pm »
But he wants it on Linux with UTF-8, where write(c) works only for c<128.
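
To see why a lone byte above 127 cannot work on a UTF-8 console, here is a minimal sketch that encodes a code point by hand. It is not from the thread, and the helper name Utf8Hex is made up for illustration. CP850 stores 'Ç' as the single byte $80, but its Unicode code point is U+00C7, which UTF-8 represents with two bytes; a bare $80 is only a continuation byte and is not valid on its own.

```pascal
program utf8demo;

{$mode objfpc}{$H+}

// Hypothetical helper: returns the UTF-8 bytes of a code point as hex text.
// Only code points below U+0800 (one or two bytes) are handled here.
function Utf8Hex(CodePoint: LongWord): String;
var
  b1, b2: Byte;
begin
  if CodePoint < $80 then
  begin
    b1 := CodePoint;                    // ASCII range: one byte, unchanged
    Result := HexStr(b1, 2);
  end
  else
  begin
    b1 := $C0 or (CodePoint shr 6);     // lead byte 110xxxxx
    b2 := $80 or (CodePoint and $3F);   // continuation byte 10xxxxxx
    Result := HexStr(b1, 2) + ' ' + HexStr(b2, 2);
  end;
end;

begin
  WriteLn(Utf8Hex($41));   // 'A': prints 41
  WriteLn(Utf8Hex($C7));   // 'Ç' (CP850 byte $80): prints C3 87
end.
```

So anything in the CP850 range $80..$FF has to be converted to its multi-byte UTF-8 form before it reaches the console, which is what the cast to a code-page-tagged string type is meant to trigger.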

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Extended ASCII use - 2
« Reply #24 on: January 12, 2022, 12:06:12 am »
Quote from: tetrastes on January 11, 2022, 11:48:12 pm
But he wants it on Linux with UTF-8, where write(c) works only for c<128.

In this case you are right, I missed that part. Thanks for correcting me.

Thaddy

  • Hero Member
  • *****
  • Posts: 14373
  • Sensorship about opinions does not belong here.
Re: Extended ASCII use - 2
« Reply #25 on: January 12, 2022, 07:24:57 am »
Does anybody know why UnicodeChar differs from wchar_t (or its equivalent in any other Unicode-supporting language)?
Currently it is just UCS-2 sized. It really should be 4 bytes.
OK, in Delphi it is also 2 bytes, I guess.
I suspect that a size of 4 would greatly simplify handling code points.
« Last Edit: January 12, 2022, 07:29:53 am by Thaddy »

tetrastes

  • Sr. Member
  • ****
  • Posts: 481
Re: Extended ASCII use - 2
« Reply #26 on: January 12, 2022, 08:54:11 am »
Not only in Delphi, but in Windows itself.
Code: C
  sizeof(wchar_t)
is 2 on Windows, 32- or 64-bit, gcc or cl.
So FPC loves Windows.  ::)

PascalDragon

  • Hero Member
  • *****
  • Posts: 5479
  • Compiler Developer
Re: Extended ASCII use - 2
« Reply #27 on: January 12, 2022, 09:03:54 am »
Quote from: Thaddy on January 12, 2022, 07:24:57 am
Does anybody know why UnicodeChar differs from wchar_t (or its equivalent in any other Unicode-supporting language)?
Currently it is just UCS-2 sized. It really should be 4 bytes.
OK, in Delphi it is also 2 bytes, I guess.
I suspect that a size of 4 would greatly simplify handling code points.

Because Delphi defines UnicodeChar for use with UTF-16, which has 2-byte code units, and Delphi in turn uses UTF-16 because Windows uses UTF-16. For UTF-32 there is UCS4Char (though this is not a true, built-in character type, but simply an alias for LongWord).

Quote from: tetrastes on January 12, 2022, 08:54:11 am
Not only in Delphi, but in Windows itself.
Code: C
  sizeof(wchar_t)
is 2 on Windows, 32- or 64-bit, gcc or cl.
So FPC loves Windows.  ::)

No, for FPC Delphi compatibility is the main driver, and thus UnicodeChar is declared the same as it is in Delphi, which (as written above) uses UTF-16 and thus 2-byte code units, because Windows uses UTF-16.
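
The sizes discussed here are easy to check on your own target. A minimal sketch, assuming FPC 3.x in objfpc mode without the unicodestrings modeswitch (so Char is the single-byte type); the program name is arbitrary:

```pascal
program charsizes;

{$mode objfpc}{$H+}

begin
  // Single-byte character type used by AnsiString
  WriteLn('SizeOf(Char)        = ', SizeOf(Char));
  // 2-byte types that hold one UTF-16 code unit
  WriteLn('SizeOf(WideChar)    = ', SizeOf(WideChar));
  WriteLn('SizeOf(UnicodeChar) = ', SizeOf(UnicodeChar));
  // 4-byte type that holds one UTF-32 code unit
  WriteLn('SizeOf(UCS4Char)    = ', SizeOf(UCS4Char));
end.
```

Going by the explanation above, WideChar and UnicodeChar should report 2 and UCS4Char should report 4, independent of what the platform's C wchar_t looks like.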

tetrastes

  • Sr. Member
  • ****
  • Posts: 481
Re: Extended ASCII use - 2
« Reply #28 on: January 12, 2022, 09:14:45 am »
FPC follows Delphi, Delphi follows Windows... The leading sheep is Windows, anyway.  :D
« Last Edit: January 12, 2022, 09:31:26 am by tetrastes »

PascalDragon

  • Hero Member
  • *****
  • Posts: 5479
  • Compiler Developer
Re: Extended ASCII use - 2
« Reply #29 on: January 12, 2022, 09:41:52 am »
Even if Delphi or FPC used a 4-byte UnicodeChar, there would still be a need for a 2-byte character type to handle UTF-16. That is, after all, one reason why C++11 introduced the distinct char16_t and char32_t types.
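
To illustrate why the 2-byte type stays necessary: UTF-16 represents every code point above U+FFFF as two 2-byte code units (a surrogate pair), so code that talks to UTF-16 APIs has to work in 2-byte units no matter how wide a "full" character type is. A minimal sketch of that split; ToSurrogates is a made-up name, not an RTL routine:

```pascal
program surrogatedemo;

{$mode objfpc}{$H+}

// Hypothetical helper: split a code point in the range U+10000..U+10FFFF
// into its UTF-16 high and low surrogate code units.
procedure ToSurrogates(CodePoint: LongWord; out HighUnit, LowUnit: Word);
var
  v: LongWord;
begin
  v := CodePoint - $10000;            // 20 significant bits remain
  HighUnit := $D800 + (v shr 10);     // top 10 bits -> high surrogate
  LowUnit  := $DC00 + (v and $3FF);   // low 10 bits -> low surrogate
end;

var
  h, l: Word;
begin
  ToSurrogates($1F600, h, l);         // U+1F600 lies outside the BMP
  WriteLn(HexStr(h, 4), ' ', HexStr(l, 4));   // prints D83D DE00
end.
```

A UCS4Char can hold $1F600 directly, but a UnicodeString needs both code units, each stored in its own UnicodeChar.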

 
