Author Topic: [SOLVED]Difference between S[1] and PChar(S)^  (Read 2973 times)

tomitomy

  • Sr. Member
  • ****
  • Posts: 251
[SOLVED]Difference between S[1] and PChar(S)^
« on: April 09, 2021, 02:44:12 am »
I found a bug while debugging some code. The following program reproduces it.

Code: Pascal
  program Project1;

  // Arch Linux,  Lazarus 2.0.12,  FPC 3.2.0

  {$mode objfpc}{$H+}

  procedure WriteFile_1(var F: File; S: String);
  begin
    BlockWrite(F, PChar(S)[0], Length(S));
  end;

  procedure WriteFile_2(var F: File; S: String);
  begin
    BlockWrite(F, S[1], Length(S));
  end;

  var
    F: File;
    S: String;

  begin
    Assign(F, 'a.txt');
    Rewrite(F, 1);  // record size 1, so the BlockWrite count is in bytes (the default is 128)

    S := '';

    WriteFile_1(F, S);  // OK
    WriteFile_2(F, S);  // RunError(201) in Debug mode, but OK in Default or Release mode

    Close(F);
  end.
« Last Edit: April 12, 2021, 03:58:02 am by tomitomy »

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9791
  • Debugger - SynEdit - and more
    • wiki
Re: A bug in Lazarus Debug mode
« Reply #1 on: April 09, 2021, 02:50:23 am »
RunError 201 is a range check error.

Code: Pascal
  s := '';
  ... s[1] // yes, that is a range check error

tomitomy

Re: A bug in Lazarus Debug mode
« Reply #2 on: April 09, 2021, 03:41:22 am »
Quote from: Martin_fr
201: range check

Code: Pascal
  s := '';
  ... s[1] // Yes that is a range check

An AnsiString always ends with the character #0, so I think S[1] is not outside the string's range. Otherwise, Debug mode and Default mode would follow different rules.

Martin_fr

Re: A bug in Lazarus Debug mode
« Reply #3 on: April 09, 2021, 04:10:23 am »
The #0 is not within range.

Also, the empty string is technically a nil pointer, so there is no #0.


The fact that it works without range checking is pure luck.
Change the settings (like optimization) or use a different FPC version, and maybe it works, maybe it crashes, or maybe it only crashes in FPC 9.1 (i.e. the far future).

BlockWrite takes an untyped parameter for "Buf":
Code: Pascal
  procedure BlockWrite(var f: File; const Buf; Count: Longint);
That means BlockWrite takes the address of whatever your code puts there.

If fully evaluated, this would mean:
"s[1]": first byte of the string => dereference the internal pointer => crash on the nil dereference
"untyped param": take the address of the above
In other words:
Code: Pascal
  @( ptr^ )

Apparently FPC just shortcuts that (lucky): "@( ptr^ )" is "ptr".
Therefore there is no crash on the nil dereference.

BlockWrite apparently does not dereference the buffer if the length is zero. But that, again, is just luck.
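
One way to avoid depending on that luck is to skip the S[1] access when the string is empty. The following is only a minimal sketch, assuming range checking is on ({$R+}, which Lazarus' Debug build mode also enables). WriteStr and BytesWritten are hypothetical helper names, not something posted in this thread.

```pascal
program GuardedWrite;

{$mode objfpc}{$H+}
{$R+} // range checking on, as in the Debug build mode

// Hypothetical helper: only touches S[1] when the string has a first character.
procedure WriteStr(var F: File; const S: String);
begin
  if Length(S) > 0 then
    BlockWrite(F, S[1], Length(S));
end;

// Hypothetical helper: writes all parts to FileName and returns the file size in bytes.
function BytesWritten(const FileName: String; const Parts: array of String): Int64;
var
  F: File;
  I: Integer;
begin
  Assign(F, FileName);
  Rewrite(F, 1); // record size 1, so BlockWrite counts bytes
  for I := 0 to High(Parts) do
    WriteStr(F, Parts[I]);
  Result := FileSize(F); // in records of size 1, i.e. bytes
  Close(F);
end;

begin
  // The empty strings are skipped, so no RunError(201) is raised.
  WriteLn(BytesWritten('a.txt', ['', 'abc', '']));
end.
```

With the guard in place, the empty-string case never evaluates S[1], so the result no longer depends on the range checking setting.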


tomitomy

Re: A bug in Lazarus Debug mode
« Reply #4 on: April 09, 2021, 07:05:11 am »
Quote from: Martin_fr
The #0 is not within range.

Also the empty string, technically is a nil pointer. So there is no #0.
...

Thank you, Martin_fr. I understand now.

I used to think 'S[1]' was equivalent to 'PChar(S)^', but now I understand that 'PChar(S)^' is safer than 'S[1]' because 'PChar(S)' always points to data that ends with '#0', even when S is empty.

Code: Pascal
  {$mode objfpc}{$H+}  // String must be a long string (AnsiString) for these casts
  var
    S: String = '';

  begin
    WriteLn( HexStr(Pointer(S)) );  // 0000000000000000
    WriteLn( HexStr(PChar(S)) );    // 0000000000436380 (the address will vary)
    WriteLn( Ord(PChar(S)^) );      // 0
  end.
« Last Edit: April 09, 2021, 04:28:46 pm by tomitomy »

Martin_fr

Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #5 on: April 09, 2021, 02:30:29 pm »
Code: Pascal
  s := '';
  p := PChar(s);

P will be nil too. There is no memory for s. There is nowhere to point to.

If you want a PChar for #0, then you need to explicitly create the string (or memory) that contains it.
Code: Pascal
  var s: string;
  s := #0;        // Length(s) = 1
  p := PChar(s);  // C-string length = 0
Since a PChar has no built-in length, it ends at the #0.

(You can also use getmem).



EDIT:
OK, I need to correct myself.

According to your code, FPC does the extra work. So yes, PChar is safe in that case.
« Last Edit: April 09, 2021, 03:08:53 pm by Martin_fr »

tomitomy

Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #6 on: April 12, 2021, 03:54:49 am »
Quote from: Martin_fr
OK, I need to correct myself.

According to your code, FPC does the extra work. So yes, PChar is safe in that case.

Thank you all the same!

tomitomy

Re: A bug in Lazarus Debug mode
« Reply #7 on: April 12, 2021, 03:55:44 am »
BlockWrite(F, Pchar(S)^, length(S));

Thank you!

ASerge

  • Hero Member
  • *****
  • Posts: 2222
Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #8 on: April 12, 2021, 08:38:21 pm »
Quote from: Martin_fr
Code: Pascal
  s := '';
  p := Pchar(s)
P will be nil too. There is no memory for s. There is nowhere to point to.

For long strings, this is not the case. PChar(s) in this case is a function that checks for nil, and if so, changes the result to a pointer to a memory area containing #0.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #9 on: April 13, 2021, 09:13:34 am »
Quote from: ASerge
  Quote from: Martin_fr
  Code: Pascal
    s := '';
    p := Pchar(s)
  P will be nil too. There is no memory for s. There is nowhere to point to.
For long strings, this is not the case. PChar(s) in this case is a function that checks for nil, and if so, changes result to a pointer to a memory area with #0.

It's not a function, neither in the language sense nor in the sense of the compiler. The compiler simply does a more complex type conversion.

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
Re: [SOLVED]Difference between S[1] and PChar(S)^
« Reply #10 on: April 13, 2021, 11:07:37 am »
Be aware that the empty example may get you into trouble if you do not understand the difference between PChar and a Pascal string:
Code: Pascal
  {$H+}
  var
    s: string = 'testme'#0'And test me again';
    p: PChar;
  begin
    writeln(s);
    p := PChar(s); // here it gets funny for Pascal programmers and puts C programmers where they belong...
    writeln(p);
    readln;
  end.
Now look at the output... :o
(That's why Pascal long strings are so much more powerful than C strings.)

« Last Edit: April 13, 2021, 11:19:11 am by Thaddy »
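
The difference in the example above can also be shown as numbers rather than printed text. This is a small sketch, assuming the same string constant: Length counts every byte of the long string, while StrLen (from SysUtils) stops at the first #0, just as any C-style consumer of the PChar would.

```pascal
program EmbeddedNul;

{$mode objfpc}{$H+}

uses
  SysUtils; // for StrLen

var
  S: String = 'testme'#0'And test me again';

begin
  WriteLn(Length(S));         // 24: every byte, including the embedded #0
  WriteLn(StrLen(PChar(S)));  // 6: only 'testme', up to the first #0
end.
```

So a PChar cast is only a faithful view of the string when the string contains no embedded #0.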

ASerge

Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #11 on: April 13, 2021, 06:26:46 pm »
Quote from: PascalDragon
It's not a function, neither in the language sense nor in the sense of the compiler. The compiler simply does a more complex type conversion.

It doesn't matter what you call it. In Delphi, these are intrinsic routines like Length or Abs. In FPC, it's a complex type conversion.
It is important to understand that, unlike other pointer type conversions, the pointer value can change here, not just the type.
Example:
Code: Pascal
  {$MODE OBJFPC}
  {$LONGSTRINGS ON}

  uses Windows;

  procedure RunCmd;
  var
    SA: TStartupInfoA;
    PI: TProcessInformation;
    Environment: string = ''; // Use parent
  begin
    ZeroMemory(@SA, SizeOf(SA));
    SA.cb := SizeOf(SA);
    CreateProcessA(nil, 'cmd.exe /k', nil, nil, False, 0, PChar(Environment), nil, @SA, @PI);
  end;

  begin
    RunCmd;
  end.
When cmd is started by this program, its environment contains only auto-generated variables, with no %temp% and others. That happens only because an unwanted conversion was performed here: PChar(Environment) yields a pointer to a #0 instead of nil, so CreateProcessA receives an empty environment block rather than NULL.

The fix is simple: replace PChar(Environment) with PChar(Pointer(Environment)) or simply Pointer(Environment).

Many WinAPI functions distinguish NULL from a zero-length string (from the point of view of the C language).
And it is important for developers to know and understand this.

It is a pity that, unlike in Delphi, this is not documented in FPC.

By the way, a better answer to the original question is: BlockWrite(F, Pointer(S)^, Length(S));
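
The same effect can be seen without the Windows unit. The sketch below is only an illustration: Describe is a hypothetical stand-in for an API that treats NULL and a zero-length C string differently, and it shows which of the two casts hands over which pointer.

```pascal
program NilVersusEmpty;

{$mode objfpc}{$H+}

// Hypothetical stand-in for an API that distinguishes NULL from "".
function Describe(P: PChar): String;
begin
  if P = nil then
    Result := 'nil'
  else if P^ = #0 then
    Result := 'empty'
  else
    Result := 'text';
end;

var
  S: String = '';

begin
  WriteLn(Describe(PChar(S)));          // empty: the cast substitutes a pointer to #0
  WriteLn(Describe(PChar(Pointer(S)))); // nil: the raw string pointer is passed through
  S := 'abc';
  WriteLn(Describe(PChar(S)));          // text
  WriteLn(Describe(PChar(Pointer(S)))); // text: both casts agree for non-empty strings
end.
```

The two casts only differ for the empty string, which is exactly the case where an API such as CreateProcessA cares about the difference.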

PascalDragon

Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #12 on: April 14, 2021, 08:57:32 am »
Quote from: ASerge
It is a pity that unlike Delphi, this is not documented in FPC.

How about you file a bug against the documentation then? In addition to that it could be mentioned how constant strings are handled in regards to overloading if both PAnsiChar and PWideChar are involved.

ASerge

Re: [SOLVED, not a bug]A bug in Lazarus Debug mode
« Reply #13 on: April 17, 2021, 09:52:45 am »
Quote from: PascalDragon
How about you file a bug against the documentation then?

Apologies, this is already documented. From "Single-byte String types":
Quote
The internal representation as a pointer, and the automatic null-termination make it possible to typecast an ansistring to a pchar. If the string is empty (so the pointer is Nil) then the compiler makes sure that the typecast pchar will point to a null byte.
and from "Multi-byte String types":
Quote
Unicodestrings (used to represent unicode character strings) are implemented in much the same way as ansistrings: reference counted, null-terminated arrays, only they are implemented as arrays of WideChars instead of regular Chars.
...
The Widestring type (used to represent unicode character strings in COM applications) is implemented in much the same way as Unicodestring on Windows, and on other platforms, they are simply the same type.

 
