
Topic: The warning is confusing! Please help

EganSolo

« on: November 01, 2025, 08:59:44 pm »
I have to process text that contains non-ASCII characters. I understand that the String type in Free Pascal is keyed to UTF-8 by default. I don’t require Delphi compatibility, so mode Delphi isn’t essential for me.

Consider this bit of code:

Code: Pascal
program UniCode;
{$H+}{$codepage utf8}
uses Classes, SysUtils, LazUTF8;

function IsNonAscii(const S : String; const aPos: integer): Boolean;
const NonAsciiChar = 'é';
//comment the line above and uncomment the line below to get rid of the warning:
//const NonAsciiChar : String = 'é';

var c : String;
begin
  c := UTF8Copy(s,apos,1);
  Result := c = NonAsciiChar;
end;

begin

end.

When I compile this code, the compiler issues the following warning: UniCode.lpr(12,13) Warning: Implicit string type conversion from "AnsiString" to "WideString". However, if I switch to the typed constant, the warning disappears.

I don’t get it: even though I’m specifying $codepage utf8, why is the compiler defaulting string constants to WideString instead of String? Is there a way to switch that?



Martin_fr

Re: The warning is confusing! Please help
« Reply #1 on: November 01, 2025, 09:17:43 pm »
As far as I know, the codepage directive does not control what type your constant gets.

It controls whether the bytes in your source file are converted, and from which encoding.

See the example.

Code: Pascal
program P1;
{$Codepage cp1250}
{ $codepage utf8}
type U8 = type AnsiString(CP_UTF8);
var s: U8;
begin
  s := #$B1#$B1;
  writeln(s);
end.

The string itself is always UTF-8.

But with that directive your source is not, so the $B1 (and the same goes for one or two actual characters with that ordinal value) is converted from codepage CP1250, and the program prints
Code: Text
±±

If you instead declared that the source was already UTF-8, then the $B1 would be invalid UTF-8, and the program would print ??
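
To check the conversion without relying on how the terminal renders the output, one option is to dump the bytes actually stored in the string. This is only a sketch: the program name ShowBytes and the loop are illustrative additions to the example above, and IntToHex comes from SysUtils. UTF-8 encodes U+00B1 (±) as the two bytes C2 B1, so if each $B1 was converted from CP1250 you would expect to see that pair twice.

```pascal
program ShowBytes;  // hypothetical name, not from the thread
{$Codepage cp1250}
uses SysUtils;      // for IntToHex

type U8 = type AnsiString(CP_UTF8);

var
  s: U8;
  i: Integer;
begin
  // Same assignment as in the example above.
  s := #$B1#$B1;

  // Print every byte of the string as two hex digits.
  for i := 1 to Length(s) do
    Write(IntToHex(Ord(s[i]), 2), ' ');
  WriteLn;
end.
```

Swapping the directive for {$codepage utf8} and recompiling lets you compare what ends up in the string in the other case.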




Btw, you could also try a typecast on the constant:
Code: Pascal
const NonAsciiChar = string('é');

 
