Topic: Unexpected PCHAR behavior (bug ?)

440bx

Unexpected PCHAR behavior (bug ?)
« on: January 15, 2020, 04:17:52 am »
Hello,

It seems that FPC does automatic dereferencing even when it has been explicitly told not to.  Consider the following code:
Code: Pascal
  1. {$MODESWITCH     ANSISTRINGS-      }
  2.  
  3.  
  4.  
  5. { note the use of NO AUTOMATIC DEREFERENCING }
  6.  
  7. {$MODESWITCH     AUTODEREF-        }
  8.  
  9.  
  10.  
  11. {$MODESWITCH     UNICODESTRINGS-   }
  12. {$MODESWITCH     POINTERTOPROCVAR- }
  13.  
  14. {$LONGSTRINGS    OFF               }
  15. {$WRITEABLECONST OFF               }
  16. {$TYPEDADDRESS   ON                }
  17.  
  18. program _AtSignInCase;
  19.  
  20. var
  21.   a : packed array[0..15] of char = '1234567890/abcd';
  22.  
  23.   p : pchar;
  24.  
  25.   i : integer;
  26.  
  27. begin
  28.   // this works as it should
  29.  
  30.   p := @a[low(a)];
  31.   while true do
  32.   begin
  33.  
  34.     if p^ = '/' then break; // dereference p and compare to char, all good.
  35.     inc(p);                 // look at next character
  36.   end;
  37.  
  38.  
  39.   i := low(a);
  40.   p := @a[low(a)];
  41.   while true do
  42.   begin
  43.  
  44.     // this doesn't work and no error is emitted
  45.     // for the statement:
  46.     //
  47.     // if p = '/' then break; <-- no type error !?
  48.     //
  49.     // and no break on the '/' either!
  50.     //
  51.  
  52.     if p = '/' then break;  // no type error !?
  53.     inc(p);
  54.  
  55.     // the following is just to avoid
  56.     // an infinite loop
  57.  
  58.     inc(i);
  59.     if i > high(a) then
  60.     begin
  61.       writeln('broke loop on value of i instead of p');
  62.       writeln('value of (p - 2)^ is : ', (p - 2)^);
  63.       break;
  64.     end;
  65.   end;
  66.  
  67.   writeln;
  68.   writeln('press enter/return to end this program');
  69.   readln;
  70. end.
On line 52, the pointer to char "p" is compared against a character.  It seems that this should cause the compiler to report that a pointer cannot be compared to a character, but it accepts the comparison even though the AUTODEREF modeswitch is off.

The result is that, were it not for the additional variable i (added specifically for the sake of this example), the while loop would be an infinite loop.

Shouldn't the compiler have emitted a type error message for line 52 ?

Thanks.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

ASerge

Re: Unexpected PCHAR behavior (bug ?)
« Reply #1 on: January 15, 2020, 07:38:03 pm »
Quote from: 440bx
    Shouldn't the compiler have emitted a type error message for line 52 ?

Most likely, there are hardly any options that would make the compiler emit an error message in this case. As far as I know, string constants are compatible with any string type. Based on the directives specified in the example, the compiler converts the character pointer to a short string and then compares it to a constant.

440bx

Re: Unexpected PCHAR behavior (bug ?)
« Reply #2 on: January 15, 2020, 09:42:58 pm »
Quote from: ASerge
    <snip> the compiler converts the character pointer to a short string and then compares it to a constant.

You're right, that's what it does, but it doesn't seem right for the compiler to be doing that, particularly when it has been told not to do automatic dereferencing.

Remy Lebeau

Re: Unexpected PCHAR behavior (bug ?)
« Reply #3 on: January 16, 2020, 03:20:50 am »
Quote from: 440bx
    You're right, that's what it does, but it doesn't seem right for the compiler to be doing that, particularly when it has been told not to do automatic dereferencing.

There is no dereferencing happening in the second case.
Remy Lebeau
Lebeau Software - Owner, Developer
Internet Direct (Indy) - Admin, Developer (Support forum)

440bx

Re: Unexpected PCHAR behavior (bug ?)
« Reply #4 on: January 16, 2020, 05:08:25 am »
Quote from: Remy Lebeau
    There is no dereferencing happening in the second case.

Not in code, but logically it is dereferencing; otherwise there would be a type error, i.e., a pointer cannot be compared to a character (type mismatch), but a dereferenced pointer to char can be compared to a char.

Martin_fr

Re: Unexpected PCHAR behavior (bug ?)
« Reply #5 on: January 16, 2020, 12:25:44 pm »
My understanding:
It is not dereferencing, but converting.

Code: Pascal
  procedure Foo(s: String);
  ...
  var p: PChar;
  Foo(p);

"p" will not be dereferenced. If it were, it would pass a single char to Foo. But it passes an entire (null terminated) string.

Whenever a pchar is used, where a string would be required, it is converted.
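
The difference between converting and dereferencing can be seen in a small sketch. The program name, the buffer `buf`, and its contents are assumptions made up for illustration; only `Foo` and `p` come from the snippet above.

```pascal
{$LONGSTRINGS OFF}  // "string" means shortstring here, as in the original post

program PCharParamSketch;

procedure Foo(s: string);
begin
  writeln('"', s, '" has length ', length(s));
end;

var
  buf : array[0..5] of char = 'hello';  // hypothetical data: 5 chars, 1 spare slot
  p   : pchar;

begin
  buf[5] := #0;   // make the null terminator explicit
  p := @buf[0];

  Foo(p);    // the pchar is converted: Foo receives the whole string "hello"
  Foo(p^);   // explicit dereference: Foo receives the single char "h"
end.
```

The first call reports a length of 5 and the second a length of 1, which is the distinction being made here: passing `p` hands over everything up to the null byte, while passing `p^` hands over one character.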

So in your comparison, it is converted too. Probably lucky that it does not cause a crash, as your array of char has no null termination.

This also explains why it is never equal to '/'.
Because it is a string starting with '/abcd', continuing to a null byte.
« Last Edit: January 16, 2020, 12:58:40 pm by Martin_fr »

440bx

Re: Unexpected PCHAR behavior (bug ?)
« Reply #6 on: January 16, 2020, 01:56:06 pm »
Quote from: Martin_fr
    My understanding:
    It is not dereferencing, but converting.

Yes, that's correct.  The code is converting the "pointer to array of characters" into a shortstring with the same characters.

Quote from: Martin_fr
    "p" will not be dereferenced. If it was, it would pass a single char to Foo. But it passes an entire (null terminated) string.

Yes, that's correct too.

Quote from: Martin_fr
    Whenever a pchar is used, where a string would be required, it is converted.

That is what leads to "strange things".  In a strongly typed language, a "pointer to char" cannot be compared to a "char"; the types don't match (C++ won't accept it, and quite likely C won't either).  The fact that such a comparison compiles gives the impression that one of three things is happening: the "pointer to char" is automatically dereferenced (which would make the types match); or, as C/C++ do, the value of the pointer is compared to the address of the character constant (saved in a read-only section by the compiler); or, lastly, what FPC actually does, the "pointer to char" is converted into a shortstring and the shortstrings are compared.  That last option is a strange thing to do, since the compiler can't always know what the data p points to looks like (I guess it will just accumulate characters into a shortstring until it finds a null).


Quote from: Martin_fr
    Probably lucky that it does not cause a crash, as your array of char has no null termination.

In the example, I chose to add a counter to prevent going out of bounds, but your point is valid. I definitely wouldn't have the array or the code that way in a real program.
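
As a rough illustration of a more defensive version, the sketch below makes the terminator explicit and stops on either the terminator or the array bound. It is only one possible shape, not the code from the original post; the program name and the choice to overwrite the last slot with #0 are assumptions.

```pascal
program BoundedScanSketch;

var
  // 15 characters; the 16th slot is reserved for the terminator
  a : packed array[0..15] of char = '1234567890/abcd';
  p : pchar;
  i : integer;

begin
  a[high(a)] := #0;   // make the null terminator explicit
  p := @a[low(a)];
  i := low(a);

  // stop at the array bound, at the terminator, or at the first '/'
  while (i <= high(a)) and (p^ <> #0) and (p^ <> '/') do
  begin
    inc(p);
    inc(i);
  end;

  if (i <= high(a)) and (p^ = '/') then
    writeln('found "/" at index ', i)
  else
    writeln('no "/" found');
end.
```

With the data shown, the loop stops at index 10. Note that every comparison uses `p^`, so each one is a char-to-char comparison and no string conversion is involved.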

Quote from: Martin_fr
    This also explains why it is never equal to '/'.
    Because it is a string starting with '/abcd', continuing to a null byte.

Actually, it's comparing '/' to '1234567890/abcd', as you stated, that will definitely never be equal.


I guess that's just the way FPC does it.  It opens the door to unexpected problems when the programmer misses typing the "^", which is how I "bumped" into this situation.  That reminds me of C: one typo and good luck... you'll need it.


Remy Lebeau

Re: Unexpected PCHAR behavior (bug ?)
« Reply #7 on: January 17, 2020, 07:11:28 pm »
Quote from: 440bx
    a pointer cannot be compared to a character (type mismatch)

True, but it can be compared to another pointer.  I was actually expecting the compiler to treat the `p = '/'` statement as if it were `p = PChar('/')` since it should be less work for the compiler to treat the string literal as a PChar constant at compile-time than to convert `p` to a string at runtime.

440bx

Re: Unexpected PCHAR behavior (bug ?)
« Reply #8 on: January 18, 2020, 01:59:55 am »
Quote from: Remy Lebeau
    I was actually expecting the compiler to treat the `p = '/'` statement as if it were `p = PChar('/')` since it should be less work for the compiler to treat the string literal as a PChar constant at compile-time than to convert `p` to a string at runtime.

That would be reasonable and is "type legal".  In FPC, that behavior can be obtained if the constant is explicitly typecast to pchar, just as you suggested.

Out of curiosity, I tried it in MSVC, and that is what it does: it compares p to the address of the constant "/", which it saved in a read-only section of the PE file.

Interestingly, because it performs this conversion, FPC will compile something like this:
Code: Pascal
  var
    c : char;
    p : pchar;
  begin
    if p = c then c := 'a';
  end;
while C/C++ won't accept such a comparison.


In the meantime, omitting the "^" due to a typo yields results that are not particularly desirable. <chuckle>

PascalDragon

  • Compiler Developer
Re: Unexpected PCHAR behavior (bug ?)
« Reply #9 on: January 18, 2020, 11:57:14 am »
The PChar type (or, more generally, ^Char) is an exception to the rule due to its close relationship with strings. There are some internal conversions between strings, PChars, and arrays of Char that aren't done for other types (mainly because Delphi does it this way as well).

If you look at the assembly code of the following example:

Code: Pascal
  program tpchar;

  var
    p: PChar;
    c: Char;
  begin
    if p = 'a' then ;
    if p = c then ;
  end.

Code: ASM
  # [7] if p = 'a' then ;
          movq    U_$P$TPCHAR_$$_P(%rip),%rax
          leaq    -256(%rbp),%rcx
          movq    $255,%rdx
          movq    %rax,%r8
          call    fpc_pchar_to_shortstr
          leaq    -256(%rbp),%rcx
          leaq    _$TPCHAR$_Ld1(%rip),%rdx
          call    fpc_shortstr_compare_equal
          testl   %eax,%eax
          je      .Lj3
          jmp     .Lj4
  .Lj4:
  .Lj3:
  # [8] if p = c then ;
          movq    U_$P$TPCHAR_$$_P(%rip),%r8
          leaq    -256(%rbp),%rcx
          movq    $255,%rdx
          call    fpc_pchar_to_shortstr
          leaq    -256(%rbp),%rcx
          movzbl  U_$P$TPCHAR_$$_C(%rip),%eax
          shll    $8,%eax
          orl     $1,%eax
          movw    %ax,-512(%rbp)
          leaq    -512(%rbp),%rdx
          call    fpc_shortstr_compare_equal
          testl   %eax,%eax
          je      .Lj15
          jmp     .Lj16
  .Lj16:
  .Lj15:

You'll notice that FPC inserts explicit conversions of the PChar to a (in this case) ShortString. This is simply how things are and won't be changed.
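
In source terms, the generated code behaves roughly like the following sketch, in which the temporary string is spelled out by hand. The names `buf` and `tmp` and the buffer contents are illustrative assumptions, and the assignment `tmp := p` is assumed to go through the same PChar-to-ShortString conversion that appears in the listing.

```pascal
{$LONGSTRINGS OFF}  // shortstring semantics, as in the original post

program PCharCompareSketch;

var
  buf : array[0..5] of char = '/abcd';  // hypothetical data: 5 chars, 1 spare slot
  p   : pchar;
  tmp : shortstring;

begin
  buf[5] := #0;   // make the null terminator explicit
  p := @buf[0];

  // Hand-written equivalent of the inserted conversion:
  // copy characters from p up to the null byte into a shortstring.
  tmp := p;
  writeln('tmp        : ', tmp);

  // String comparison: '/abcd' against '/', so this is FALSE.
  writeln('tmp = ''/''  : ', tmp = '/');

  // Per the discussion above, the compiler rewrites this into the same
  // string comparison, so it is expected to match the line before.
  writeln('p = ''/''    : ', p = '/');

  // Explicit dereference: a char-to-char comparison, which is TRUE here.
  writeln('p^ = ''/''   : ', p^ = '/');
end.
```

Only the last comparison looks at a single character, which is why a missing "^" changes the result without producing a type error.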

440bx

Re: Unexpected PCHAR behavior (bug ?)
« Reply #10 on: January 18, 2020, 01:48:34 pm »
Quote from: PascalDragon
    (mainly because Delphi does it this way as well).

I figured that was the culprit.

Quote from: PascalDragon
    You'll notice that FPC inserts explicit conversions of the PChar to a (in this case) ShortString. This is simply how things are and won't be changed.

I understand.  The problem has its origins in making pchars and arrays of chars, as well as other string types, "compatible".  At this point, it's quite likely that the only consistent way of "compatibilizing" pchars and the various string forms is to convert them as necessary to "force" compatibility (even when it isn't always desirable).

Thank you for the additional clarification.

 