
Topic: Unneeded copy of string

ASerge

Re: Unneeded copy of string
« Reply #15 on: April 17, 2025, 03:55:32 am »
Quote
This means all projects should be built with the {$T+} option. This also means we can't use overloaded functions in units that rely on strings.
I think the compiler's behavior around constref should be changed. As I understand it, the idea of constref is that we reference something but do not change it. So the compiler should not copy a string on constref.

You've mixed two different things.
1. The issue from your example, when the compiler does not warn that there is a conflict when choosing from two overloaded functions, but chooses the first one that comes to hand. According to the documentation, the @ operator for a variable has the Pointer type in the {$T-} state, which means both functions are suitable.
2. Using the UniqueString procedure before calling a function that uses part of a string. This is not an issue. If you want to avoid unnecessary calls, cast it to the PChar type:
Code: Pascal
  P1(PChar(Pointer(S))[0]);
Code: ASM
  # [31] P1(PChar(Pointer(S))[0]);
          movq    TC_$P$PROGRAM_$$_S(%rip),%rcx
          call    P$PROGRAM_$$_P1$CHAR
In both cases, constref has nothing to do with it.
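
As a rough sketch of point 1, the hypothetical program below (the procedure name Show and both overloads are invented for illustration, they are not from the thread) switches on {$T+} so that @ yields a typed pointer, which should let the compiler tell the two overloads apart. Note that @S[1] on a managed string may still trigger the implicit UniqueString call discussed in point 2.

```pascal
program TypedAddrDemo;
{$mode objfpc}{$H+}
{$T+} // typed @ operator: @S[1] is ^AnsiChar, @W[1] is ^WideChar

// Hypothetical overloads, one per character width.
procedure Show(P: PAnsiChar); overload;
begin
  Writeln('PAnsiChar overload, first char: ', P^);
end;

procedure Show(P: PWideChar); overload;
begin
  Writeln('PWideChar overload, first char: ', P^);
end;

var
  S: AnsiString;
  W: UnicodeString;
begin
  S := 'abc';
  W := 'abc';
  Show(@S[1]); // expected to select the PAnsiChar overload
  Show(@W[1]); // expected to select the PWideChar overload
end.
```

With {$T-} (the default) both calls pass an untyped Pointer, so either overload is a candidate, which is the conflict described above.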

LemonParty

Re: Unneeded copy of string
« Reply #16 on: April 17, 2025, 08:13:11 am »
Code: Pascal
  P1(PChar(Pointer(S))[0]);
This code is ugly.
Can anyone show an example where copying a string on constref is necessary?
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

Khrys

Re: Unneeded copy of string
« Reply #17 on: April 17, 2025, 09:17:50 am »
Putting aside the discussion of whether constref should copy the underlying string, what exactly are you trying to do?

Are you writing functions that use character pointers as C strings? What are these functions supposed to do?

From what I've gathered, you want overloads for both ansi and wide strings, but without either {$T+} or constref the compiler can't choose the correct overload, so you insisted on abusing constref just to help the compiler discern types, while the actual point of constref is to guarantee immutable pass-by-reference.

Coming back to the constref discussion, I think it's reasonable for the compiler to ensure that references are valid. For this purpose FPC has var, out and constref, while C++ has its own kind of non-nullable references (e.g. const char&). If you don't need such guarantees, just use a plain pointer instead.

Martin_fr

Re: Unneeded copy of string
« Reply #18 on: April 17, 2025, 09:58:57 am »
Quote
Can anyone show an example where copying a string on constref is necessary?

It all depends on the level of optimization and other settings. Maybe even the compiler version.
To demonstrate the effect of the missing UniqueString call, I used the PAnsiChar version.

The code below passes a pointer to 'a' but prints '5' (something random that ended up in place of the released string).

If you comment out the first line of the procedure body and uncomment the second, it prints 's', the first char of the second string.

Code: Pascal
  Program foo; {$mode objfpc}{$H+}
  uses SysUtils;

  var
    S, S2: AnsiString;

  //procedure P1(constref Ch: AnsiChar);
  procedure P1( Ch: PAnsiChar);
  begin
    s := '';  s2 := 'something else' + IntToStr(random(9)); // avoid compiletime const eval
    //s := 'something else' + IntToStr(random(9)); // avoid compiletime const eval
    Writeln(Ch^);
  end;

  begin
    s := 'abcdef' + inttostr(random(9)); // avoid compiletime const eval
    P1(@S[1]);

    readln;
  end.



In real life, I have seen problems like this when using "const s: ansistring"
- because that is a much more common construct
- because that does not call UniqueString (in fact it is used because it does not)

And the last time I saw it, the string was a field in an object. It wasn't directly modified by the called procedure, but that procedure had callbacks (events that it called), and about 10 calls deep into the stack on the callback was some code that modified the field.



@PascalDragon: I only partly agree with your statement.

1) UniqueString is overkill. (for constref)

It is not only a reference. It is also const.

It only needs to increment the refcount. If anyone else writes to it and the refcount is greater than 1, then the other will make a copy of their own.
Sure, you can bypass that with pchar magic. But then you intentionally break the protection.

The general idea is that any code that holds a ref to a string can trust it. If the caller in this case holds that ref, then the callee is fine by those means.

( That differs for a "var c: ansichar" param (or "out"), because then "c" itself can be changed, and at that point it must act on a uniquely referenced copy of the string)
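
The refcount and copy-on-write behaviour described in this point can be observed with a small sketch. The program name and variables are made up for illustration, and it assumes StringRefCount from the System unit is available in your FPC version:

```pascal
program CowDemo;
{$mode objfpc}{$H+}
uses SysUtils;

var
  A, B: AnsiString;
begin
  // Build the string at run time so it is a refcounted heap string,
  // not a compile-time constant.
  A := 'abcdef' + IntToStr(Random(9));
  B := A; // no copy yet: B shares A's buffer and the refcount goes up

  Writeln('refcount after B := A: ', StringRefCount(A));
  Writeln('same buffer before write: ', Pointer(A) = Pointer(B));

  // Writing through B makes the compiler ensure B is unique first,
  // so B gets its own copy and A is left untouched.
  B[1] := 'X';

  Writeln('same buffer after write: ', Pointer(A) = Pointer(B));
  Writeln('A[1] = ', A[1], ', B[1] = ', B[1]);
end.
```

Holding a second reference is therefore enough to keep the original data alive; a full copy only becomes necessary once someone writes.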

2) It is inconsistent....
While it is nice that there is safety first...

constref is a form of const.

const does not add that form of protection.

In fact it has been stated countless times, that const is a contract where the user (programmer) tells the compiler, that the variable will not change.
And "will not change" includes, that it will not be changed by any code while the called code is running (has not returned to its caller).

With const too, as I stated other code may change (breaking the contract) a string that was passed without protection. And that is by design.

Why is that design not applied for constref?
I may have missed something, but I thought constref is the same optimization as const? Except for adding a ref?



EDIT /APPEND

And then, if the string is protected because "constref" takes a pointer (reference) to a char in the string, then why is it not protected if
Code: Pascal
  @s[1]
takes the same pointer? (only without read-only protection...)
« Last Edit: April 17, 2025, 10:05:09 am by Martin_fr »

LemonParty

Re: Unneeded copy of string
« Reply #19 on: April 17, 2025, 10:24:05 am »
Quote
Putting aside the discussion whether constref should copy the underlying string, what exactly are you trying to do?

I am writing a library, and I want overloads of a function that can handle both AnsiChar and WideChar buffers. They look like this:
Code: Pascal
  function Pos(constref Buf: AnsiChar; C: AnsiChar; Range: SizeUInt): SizeUInt;
  function Pos(constref Buf: WideChar; C: WideChar; Range: SizeUInt): SizeUInt;
Buf in this case is not a single character, but a pointer to the buffer. This works fine, until we get to strings.
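
One possible alternative, following the "just use a plain pointer" suggestion earlier in the thread, is to overload on the pointer types themselves. This is only a sketch: the name FindChar and its "return Range when not found" convention are assumptions for illustration, not the library's actual API.

```pascal
program PtrOverloadDemo;
{$mode objfpc}{$H+}

// Returns the 0-based index of C in the first Range characters of Buf,
// or Range if C does not occur (hypothetical convention).
function FindChar(Buf: PAnsiChar; C: AnsiChar; Range: SizeUInt): SizeUInt; overload;
var
  I: SizeUInt;
begin
  I := 0;
  while (I < Range) and (Buf[I] <> C) do
    Inc(I);
  Result := I;
end;

function FindChar(Buf: PWideChar; C: WideChar; Range: SizeUInt): SizeUInt; overload;
var
  I: SizeUInt;
begin
  I := 0;
  while (I < Range) and (Buf[I] <> C) do
    Inc(I);
  Result := I;
end;

var
  S: AnsiString;
  W: UnicodeString;
begin
  S := 'abcdef';
  W := 'abcdef';
  // The explicit pointer casts select the overload, so neither {$T+}
  // nor constref is needed to tell the two versions apart.
  Writeln(FindChar(PAnsiChar(S), AnsiChar('c'), Length(S))); // index 2
  Writeln(FindChar(PWideChar(W), WideChar('z'), Length(W))); // 6, not found
end.
```

The cast PAnsiChar(S) only reads the string's data pointer, so no unique copy should be needed for a read-only scan.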

LemonParty

Re: Unneeded copy of string
« Reply #20 on: April 17, 2025, 10:35:53 am »
Quote
And then, if the string is protected because "constref" takes a pointer (reference) to a char in the string, then why is it not protected if
This expresses my point very well. It is illogical that constref protects the string when it takes a pointer, but taking a pointer directly does not.

Thaddy

Re: Unneeded copy of string
« Reply #21 on: April 17, 2025, 12:12:08 pm »
Quote
It all depends on the level of optimization, and other settings. Maybe even compiler version.

Optimization should never affect the result.

Martin_fr

Re: Unneeded copy of string
« Reply #22 on: April 17, 2025, 12:45:48 pm »
Quote
It all depends on the level of optimization, and other settings. Maybe even compiler version.
Optimization should never affect the result.

Yes, and in the example that is actually obeyed. Optimization does not affect the result. The result of the example I gave is undefined, according to the documentation.

Optimization does not affect the undefined-ness. It only affects how it can be observed / how it manifests.

But that is OK. Undefined is undefined. In fact, it would not be undefined if it always manifested in one and the same way.

LemonParty

Re: Unneeded copy of string
« Reply #23 on: April 17, 2025, 06:11:21 pm »
Does trunk have this problem?
Should I report this to the bug tracker?

Martin_fr

Re: Unneeded copy of string
« Reply #24 on: April 17, 2025, 06:50:40 pm »
Quote
Does trunk have this problem?
Should I report this to the bug tracker?

For your original snippet of code, today's FPC 3.3.1 at eba0624535cc504fcaf367055cd3adeab56097a4 generates the UniqueString call.

As for "problem", it may or may not be. PascalDragon is part of the FPC team. They have to decide what the desired behaviour is.  (I am part of the Lazarus team, I just communicated my personal thoughts on the topic).

LemonParty

Re: Unneeded copy of string
« Reply #25 on: April 17, 2025, 09:23:09 pm »
I found a temporary workaround. If the procedure is called this way
Code: Pascal
  P(PChar( @S[1] )^)
the compiler does not insert the copy.

Martin_fr

Re: Unneeded copy of string
« Reply #26 on: April 17, 2025, 09:28:06 pm »
You can have that a bit shorter
Code: Pascal
  P1(pchar(S)[0]);

Mind the index changes to 0-based.

And you can shorten it with "type p=pchar;" / though I wouldn't.
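
To make the index shift concrete, here is a minimal sketch (program name and variable are illustrative only) comparing 1-based string indexing with 0-based indexing through the pointer cast:

```pascal
program IndexBaseDemo;
{$mode objfpc}{$H+}

var
  S: AnsiString;
begin
  S := 'abc';
  // S[1] is the first character of the string; after the cast the
  // same character sits at offset 0 of the PAnsiChar.
  Writeln(S[1] = PAnsiChar(S)[0]);
  // Likewise the last character: S[3] corresponds to offset 2.
  Writeln(S[Length(S)] = PAnsiChar(S)[Length(S) - 1]);
end.
```

Both comparisons should hold, since the cast merely exposes the string's own character buffer.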

nanobit

Re: Unneeded copy of string
« Reply #27 on: April 17, 2025, 10:20:34 pm »
Quote
This means all projects should be built with the {$T+} option.

{$T+} is an old invention and has a caveat in modern Pascal (since pointermath):
Most programmers are accustomed to thinking that "@" means untyped (the {$T-} default).
{$T+} allows type-checking (matching) in pointer assignments,
but it also changes the behavior of "@" (to a typed pointer) in pointermath (generated code).
Therefore I use {$T-} only.

ASerge

Re: Unneeded copy of string
« Reply #28 on: April 17, 2025, 10:47:33 pm »
Code: Pascal
  P1(pchar(S)[0]);
The same as I indicated above, only my version is more optimal, but @LemonParty rejected it.

 
