
Author Topic: String related improvements for FPC  (Read 1776 times)

lagprogramming

  • Sr. Member
  • ****
  • Posts: 407
String related improvements for FPC
« on: July 20, 2023, 04:25:48 pm »
The FPC source code contains many lines of the form SetLength(AnyKindOfStringType, 0).
When SetLength(AnyKindOfStringType, 0) is replaced with AnyKindOfStringType := '', the compiler produces better code because it avoids the call to the SetLength procedure. I couldn't find a string type for which AnyKindOfStringType := '' produces worse code than SetLength(AnyKindOfStringType, 0).
Making this replacement across the code base raises a practical problem: a patch from an external contributor would be so large that I doubt any FPC developer would commit it to git.
So, is there an FPC developer willing to make this improvement? I don't think the compiler will optimize this string-related code automatically in the near future. It would also be a step toward a single Pascal idiom for these assignments: StringType := '', not SetLength(StringType, 0).
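
To make the comparison concrete, here is a minimal sketch (not part of the original post) that clears two AnsiString variables, one with each form, and prints what is left. The program and variable names are made up for illustration. It only shows that both forms leave an empty string behind; it says nothing about which one compiles to better code.

```pascal
program EmptyStringDemo;
{$mode objfpc}{$H+}

var
  a, b: AnsiString;
begin
  a := 'hello';
  b := 'hello';

  SetLength(a, 0);  // form used in many places in the FPC sources
  b := '';          // form proposed in this thread

  // Both strings now have length 0.
  WriteLn(Length(a), ' ', Length(b));

  // In FPC an empty AnsiString is represented by a nil pointer,
  // so both comparisons are expected to be true.
  WriteLn(Pointer(a) = nil, ' ', Pointer(b) = nil);
end.
```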

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1315
Re: String related improvements for FPC
« Reply #1 on: July 20, 2023, 05:05:46 pm »
Why not optimize the SetLength code? Much simpler.

Чебурашка

  • Hero Member
  • *****
  • Posts: 588
  • СЛАВА УКРАЇНІ! / Slava Ukraïni!
Re: String related improvements for FPC
« Reply #2 on: July 20, 2023, 05:31:01 pm »
The idea of optimizing is always good of course.

OTOH there is a sort of symmetry in
 
Code: Pascal
procedure DoIt();
var
  s: string;
begin
  SetLength(s, 5);
  try
    // operations
  finally
    SetLength(s, 0);
  end;
end;

Perhaps that symmetry would be lost if FPC forced us to write ":= ''" instead.

Personally I like this symmetry. But I also admit that I am not a big fan of having two ways of doing the same task, especially when one is less efficient than the other.
« Last Edit: July 20, 2023, 05:32:56 pm by Чебурашка »
FPC 3.2.0/Lazarus 2.0.10+dfsg-4+b2 on Debian 11.5
FPC 3.2.2/Lazarus 2.2.0 on Windows 10 Pro 21H2

BeniBela

  • Hero Member
  • *****
  • Posts: 928
    • homepage
Re: String related improvements for FPC
« Reply #3 on: July 20, 2023, 06:42:05 pm »
The compiler developers messed this symmetry up

Code: Pascal
var
  s: string;
begin
  SetLength(s, 5);

You can no longer write that without the compiler complaining.

It gives a diagnostic: "Hint: Local variable "s" does not seem to be initialized"
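
For reference, a minimal sketch of the usual way around that hint (an illustration, not taken from any post): assign an empty string before the first SetLength. The program name and the fill character are arbitrary, and whether the hint disappears may depend on the compiler version and on the -vh setting.

```pascal
program InitBeforeSetLength;
{$mode objfpc}{$H+}

procedure DoIt;
var
  s: string;
begin
  s := '';          // explicit initialization, intended to avoid the hint
  SetLength(s, 5);  // allocate room for 5 characters
  FillChar(s[1], Length(s), 'x');  // give the new characters a defined value
  WriteLn(s);
end;

begin
  DoIt;
end.
```

The price is one extra assignment per routine, which is exactly the kind of code this thread is asking the compiler to make cheap.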


Quote from: lagprogramming
When SetLength(AnyKindOfStringType, 0) is replaced with AnyKindOfStringType := '' the compiler produces better code because it avoids calling the SetLength procedure.

Is that actually true?

The assignment calls fpc_AnsiStr_Assign instead.

lagprogramming

  • Sr. Member
  • ****
  • Posts: 407
Re: String related improvements for FPC
« Reply #4 on: July 20, 2023, 08:55:18 pm »

Quote from: BeniBela
(quoting lagprogramming) When SetLength(AnyKindOfStringType, 0) is replaced with AnyKindOfStringType := '' the compiler produces better code because it avoids calling the SetLength procedure.

is that actually true?

then it calls fpc_AnsiStr_Assign

I don't know why, but I remember that when the first value assigned to a string variable in a routine was an empty string, the compiler avoided the call to fpc_stringtype_assign entirely. Instead of calling the assign routine, it emitted something like inlined code, as it does with shortstrings (change TestString to shortstring in the example below and notice that the assignment produces a single instruction, while SetLength goes into fpc_shortstr_setlength). With the latest FPC development version I see the call to the assign routine too.  :(

Code: Pascal
program Project1;

type
  TestString = ansistring;
               //shortstring;
               //unicodestring;

procedure foo(var param:TestString);
begin
  param:='';
end;

procedure bar(var param:TestString);
begin
  setlength(param, 0);
end;

var
  L: TestString;
begin
  L:='';
  setlength(L, 0);
  foo(L);
  bar(L);
  writeln(L);
end.
Anyway, even with this unpleasant surprise, fpc_ansistr_assign, unlike fpc_ansistr_setlength, doesn't make an additional call to fpc_ansistr_decr_ref.
This is the code executed:
Code: x86_64 assembly
fpc_ansistr_assign
000000000040ADB0 53                       push rbx
000000000040ADB1 4154                     push r12
000000000040ADB3 50                       push rax
000000000040ADB4 4889FB                   mov rbx,rdi
000000000040ADB7 4989F4                   mov r12,rsi
000000000040ADBA 483B37                   cmp rsi,[rdi]
000000000040ADBD 7422                     jz +$22    # $000000000040ADE1 fpc_ansistr_assign+49
...
000000000040ADE1 59                       pop rcx
000000000040ADE2 415C                     pop r12
000000000040ADE4 5B                       pop rbx
000000000040ADE5 C3                       ret
vs.
Code: x86_64 assembly
fpc_ansistr_setlength
000000000040BF10 53                       push rbx
000000000040BF11 4154                     push r12
000000000040BF13 4155                     push r13
000000000040BF15 488D6424F0               lea rsp,[rsp-$10]
000000000040BF1A 4889FB                   mov rbx,rdi
000000000040BF1D 4989F4                   mov r12,rsi
000000000040BF20 664189D5                 mov r13w,dx
000000000040BF24 4885F6                   test rsi,rsi
000000000040BF27 0F8E0B010000             jle +$0000010B    # $000000000040C038 fpc_ansistr_setlength+296
...
000000000040C038 4889DF                   mov rdi,rbx
000000000040C03B E810EDFFFF               call -$000012F0    # $000000000040AD50 fpc_ansistr_decr_ref
000000000040C040 488D642410               lea rsp,[rsp+$10]
000000000040C045 415D                     pop r13
000000000040C047 415C                     pop r12
000000000040C049 5B                       pop rbx
000000000040C04A C3                       ret

fpc_ansistr_decr_ref
000000000040AD50 53                       push rbx
000000000040AD51 48833F00                 cmp qword ptr [rdi],$00
000000000040AD55 7429                     jz +$29    # $000000000040AD80 fpc_ansistr_decr_ref+48
...
000000000040AD80 5B                       pop rbx
000000000040AD81 C3                       ret

Couldn't FPC replace the call to assign with just a few instructions when the first assignment to a variable is an empty string? Either I'm confusing things or there was a time when FPC did this. Overall, replacing the call to SetLength with those few instructions would be a welcome improvement, especially because many developers tend to assign empty strings to variables and function results at the beginning of routines. They use these assignments as default values, or as workarounds to avoid warnings when SetLength is used later in the code.
EDIT: Maybe the assign routine was inlined when I ran my tests in the past and is no longer inlined now.
« Last Edit: July 20, 2023, 09:04:18 pm by lagprogramming »

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1315
Re: String related improvements for FPC
« Reply #5 on: July 20, 2023, 10:56:40 pm »
So, are they doing the same thing?

lagprogramming

  • Sr. Member
  • ****
  • Posts: 407
Re: String related improvements for FPC
« Reply #6 on: July 21, 2023, 10:41:09 am »
Quote from: SymbolicFrank
So, are they doing the same thing?

When the string is a shortstring, the assignment involves no call, but SetLength does.
With reference-counted strings, assign, unlike SetLength, doesn't additionally call decrease ref.
In addition, the difference between the code produced for assign and for SetLength is expected to be larger on targets for which FPC doesn't optimize the code as well as it does for x86_64.
So, they are not doing the same thing. Is it an improvement? Yes, it is. Is the code improved significantly? No. Is it worth making the changes? The FPC developers should answer that question.
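
As a small illustration of the shortstring case (a sketch added for clarity, not taken from the posts): a ShortString keeps its length in the byte at index 0, so clearing it only requires zeroing that byte. That fits the single-instruction assignment reported earlier in the thread. The program name is arbitrary.

```pascal
program ShortStrDemo;
{$mode objfpc}

var
  s: ShortString;
begin
  s := 'abc';
  s := '';              // assignment form
  WriteLn(Ord(s[0]));   // length byte after the assignment: prints 0

  s := 'abc';
  SetLength(s, 0);      // SetLength form
  WriteLn(Ord(s[0]));   // length byte after SetLength: prints 0
end.
```

Both forms leave the same state behind; the difference discussed in this thread is only in how much code is executed to get there.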

 
