Recent

Author Topic: To optimize RTL, make function PCharToInt  (Read 37126 times)

PascalDragon

  • Hero Member
  • *****
  • Posts: 5444
  • Compiler Developer
Re: To optimize RTL, make function PCharToInt
« Reply #30 on: July 26, 2021, 01:45:20 pm »
Like it or not you seem to keep defending bugs and claimed features that just don't work.

 Also bad code generation with blanket statements in return.

Like it or not, but you either never provide evidence for bugs or you don't believe or ignore us when we say it's not a bug, but instead you doing something that can even fail on Delphi (with proof by us!).

rsz

  • New Member
  • *
  • Posts: 45
Re: To optimize RTL, make function PCharToInt
« Reply #31 on: July 27, 2021, 06:37:02 pm »
FPC already seems to have the concept of partial arrays and partial strings, in other languages they're called slices.

Unfortunately I can't seem to find much documentation about partial strings. It is only mentioned briefly in the Free Pascal Reference Guide but it seems like something that would be fitting to use here instead of Copy but unfortunately it seems to do a conversion (and allocation?):

In this code for example:
Code: Pascal  [Select][+][-]
  1. procedure SomeProc;
  2.   procedure Proc1(const XS: array of Integer); begin end;
  3.   procedure Proc2(const Str: String); begin end;
  4. var
  5.   A: array of Integer = (1, 2, 3, 4, 5, 6, 7, 8);
  6.   B: String = 'Hello world!';
  7. begin
  8.   Proc1(A[0..1]); // passes (1, 2) to Proc1
  9.   Proc2(B[1..5]); // passes 'Hello' to Proc2
  10. end;

partial strings seem to do a conversion (fpc_chararray_to_ansistr), unlike partial arrays:
Code: ASM  [Select][+][-]
  1. project1.lpr:52                           Proc1(A[0..1]); // passes (1, 2) to Proc1
  2. 0000000000401334 488b5df8                 mov    -0x8(%rbp),%rbx
  3. 0000000000401338 48c7c600000000           mov    $0x0,%rsi
  4. 000000000040133F 488b7df8                 mov    -0x8(%rbp),%rdi
  5. 0000000000401343 e878010100               callq  0x4114c0 <fpc_dynarray_rangecheck>
  6. 0000000000401348 4889de                   mov    %rbx,%rsi
  7. 000000000040134B 48c7c201000000           mov    $0x1,%rdx
  8. 0000000000401352 4889ef                   mov    %rbp,%rdi
  9. 0000000000401355 e8c6000000               callq  0x401420 <PROC1>
Code: ASM  [Select][+][-]
  1. project1.lpr:53                           Proc2(B[1..5]); // passes 'Hello' to Proc2
  2. 000000000040135A 488b5df0                 mov    -0x10(%rbp),%rbx
  3. 000000000040135E 48c7c601000000           mov    $0x1,%rsi
  4. 0000000000401365 488b7df0                 mov    -0x10(%rbp),%rdi
  5. 0000000000401369 e832b20000               callq  0x40c5a0 <fpc_ansistr_rangecheck>
  6. 000000000040136E 4889de                   mov    %rbx,%rsi
  7. 0000000000401371 41b001                   mov    $0x1,%r8b
  8. 0000000000401374 b900000000               mov    $0x0,%ecx
  9. 0000000000401379 48c7c204000000           mov    $0x4,%rdx
  10. 0000000000401380 488d7d88                 lea    -0x78(%rbp),%rdi
  11. 0000000000401384 e807ad0000               callq  0x40c090 <fpc_chararray_to_ansistr>
  12. 0000000000401389 488b7588                 mov    -0x78(%rbp),%rsi
  13. 000000000040138D 4889ef                   mov    %rbp,%rdi
  14. 0000000000401390 e84b000000               callq  0x4013e0 <PROC2>
  15. 0000000000401395 e8c6340100               callq  0x414860 <fpc_popaddrstack>

fpc_chararray_to_ansistr:
https://github.com/graemeg/freepascal/blob/ef144e079c01ce9c6b54af9d348b70bcd47a1c58/rtl/inc/astrings.inc#L578

I did some rudimentary and quick non-scientific benchmarking and it seems to be unfortunately as slow as a copy on my machine. I expected it (maybe wrongly) to be as fast as passing a pointer with a range.

Edit: They should be const.
« Last Edit: July 27, 2021, 08:10:30 pm by rsz »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11351
  • FPC developer.
Re: To optimize RTL, make function PCharToInt
« Reply #32 on: July 27, 2021, 06:55:48 pm »
1. A subrange of a shortstring is probably an array of char, so proc2 probably needs to be array of char not string
2. the arguments of proc1 (XS and StR) should probably be const.

Code: [Select]
.Ll7:
movl TC_$P$PROGRAM$_$SOMEPROC_$$_defaultB,%edx
leal -36(%ebp),%eax
call fpc_ansistr_assign
// this optimization is weird, since this snapshot has no avx2 ?
vmovdqu TC_$P$PROGRAM$_$SOMEPROC_$$_defaultA,%ymm0
vmovdqu %ymm0,-32(%ebp)
.Ll8:
leal -32(%ebp),%edx
movl $1,%ecx
call P$PROGRAM$_$SOMEPROC_$$_PROC1$array_of_LONGINT
.Ll9:
movl -36(%ebp),%edx
movl $4,%ecx
call P$PROGRAM$_$SOMEPROC_$$_PROC2$array_of_CHAR
.Ll10:

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: To optimize RTL, make function PCharToInt
« Reply #33 on: July 27, 2021, 07:27:30 pm »
Hi!

What is often forgetten is the ability of fpc to use only a slice of an array as parameter.

Example:

Code: Pascal  [Select][+][-]
  1. Type arrayOfTpoint = array of TPoint;
  2. var  ar :arrayOfTpoint;
  3. begin
  4.   setLength(ar,123);
  5.   //.....
  6.   Image1.Canvas.polyline(ar[23..42]);
  7.  
  8. end;              

In the example only the points 23..42 are used for drawing.
So you don't need to use copy or slice but you can do the same with the range of the parameter.

Winni

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: To optimize RTL, make function PCharToInt
« Reply #34 on: July 27, 2021, 07:40:12 pm »
But it looks like the called function makes a copy with getmem and move:

Code: Pascal  [Select][+][-]
  1.   procedure Proc1(XS: array of Integer);
  2.   begin
  3.     writeln(length(XS));
  4.   end;
  5.  
  6.  
Code: ASM  [Select][+][-]
  1. project1.lpr:3                            begin
  2. 0000000000401090 55                       push   %rbp
  3. 0000000000401091 4889e5                   mov    %rsp,%rbp
  4. 0000000000401094 488da42470ffffff         lea    -0x90(%rsp),%rsp
  5. 000000000040109C 48899d78ffffff           mov    %rbx,-0x88(%rbp)
  6. 00000000004010A3 4c896580                 mov    %r12,-0x80(%rbp)
  7. 00000000004010A7 4c896d88                 mov    %r13,-0x78(%rbp)
  8. 00000000004010AB 4c897590                 mov    %r14,-0x70(%rbp)
  9. 00000000004010AF 48897df8                 mov    %rdi,-0x8(%rbp)
  10. 00000000004010B3 4889f3                   mov    %rsi,%rbx
  11. 00000000004010B6 4c8b65f8                 mov    -0x8(%rbp),%r12
  12. 00000000004010BA 4c8d6b01                 lea    0x1(%rbx),%r13
  13. 00000000004010BE 49c1e502                 shl    $0x2,%r13
  14. 00000000004010C2 4c89ef                   mov    %r13,%rdi
  15. 00000000004010C5 e876760100               callq  0x418740 <fpc_getmem>
  16. 00000000004010CA 4989c6                   mov    %rax,%r14
  17. 00000000004010CD 4c89ea                   mov    %r13,%rdx
  18. 00000000004010D0 4c89f6                   mov    %r14,%rsi
  19. 00000000004010D3 4c89e7                   mov    %r12,%rdi
  20. 00000000004010D6 e835040000               callq  0x401510 <SYSTEM_$$_MOVE$formal$formal$INT64>
  21. 00000000004010DB 4c8975f8                 mov    %r14,-0x8(%rbp)
  22. 00000000004010DF 488d55e0                 lea    -0x20(%rbp),%rdx
  23. 00000000004010E3 488d75a0                 lea    -0x60(%rbp),%rsi
  24. 00000000004010E7 bf01000000               mov    $0x1,%edi
  25. 00000000004010EC e83f2f0100               callq  0x414030 <fpc_pushexceptaddr>
  26. 00000000004010F1 4889c7                   mov    %rax,%rdi
  27. 00000000004010F4 e8670d0000               callq  0x401e60 <fpc_setjmp>
  28. 00000000004010F9 4863d0                   movslq %eax,%rdx
  29. 00000000004010FC 48895598                 mov    %rdx,-0x68(%rbp)
  30. 0000000000401100 85c0                     test   %eax,%eax
  31. 0000000000401102 7528                     jne    0x40112c <PROC1+156>
  32. project1.lpr:4                            writeln(length(XS));
  33. 0000000000401104 e887bc0100               callq  0x41cd90 <fpc_get_output>
  34. 0000000000401109 4989c4                   mov    %rax,%r12
  35. 000000000040110C 488d5301                 lea    0x1(%rbx),%rdx
  36. 0000000000401110 4c89e6                   mov    %r12,%rsi
  37. 0000000000401113 31ff                     xor    %edi,%edi
  38. 0000000000401115 e8f6c80100               callq  0x41da10 <fpc_write_text_sint>
  39. 000000000040111A e871610100               callq  0x417290 <fpc_iocheck>
  40. 000000000040111F 4c89e7                   mov    %r12,%rdi
  41. 0000000000401122 e809bf0100               callq  0x41d030 <fpc_writeln_end>
  42. 0000000000401127 e864610100               callq  0x417290 <fpc_iocheck>
  43. 000000000040112C e82f320100               callq  0x414360 <fpc_popaddrstack>
  44. project1.lpr:5                            end;
  45. 0000000000401131 488b7df8                 mov    -0x8(%rbp),%rdi
  46. 0000000000401135 e826760100               callq  0x418760 <fpc_freemem>
  47. 000000000040113A 488b4598                 mov    -0x68(%rbp),%rax
  48. 000000000040113E 4885c0                   test   %rax,%rax
  49. 0000000000401141 7405                     je     0x401148 <PROC1+184>
  50. 0000000000401143 e8a8330100               callq  0x4144f0 <fpc_reraise>
  51. 0000000000401148 488b9d78ffffff           mov    -0x88(%rbp),%rbx
  52. 000000000040114F 4c8b6580                 mov    -0x80(%rbp),%r12
  53. 0000000000401153 4c8b6d88                 mov    -0x78(%rbp),%r13
  54. 0000000000401157 4c8b7590                 mov    -0x70(%rbp),%r14
  55. 000000000040115B 4889ec                   mov    %rbp,%rsp
  56. 000000000040115E 5d                       pop    %rbp
  57. 000000000040115F c3554889e5488da42470ffff retq  
  58.  
  59.  


winni

  • Hero Member
  • *****
  • Posts: 3197
Re: To optimize RTL, make function PCharToInt
« Reply #35 on: July 27, 2021, 07:55:40 pm »
But it looks like the called function makes a copy with getmem and move:

Code: Pascal  [Select][+][-]
  1.   procedure Proc1(XS: array of Integer);
  2.   begin
  3.     writeln(length(XS));
  4.   end;
  5.  
  6.  

That is one of the first lessons in Pascal:

The parameter of a procedure working with a local copy.
If you want to avoid that change it to

 
Code: Pascal  [Select][+][-]
  1.   procedure Proc1( var XS: array of Integer);
  2.   begin
  3.     writeln(length(XS));
  4.   end;
  5.  
  6.  

Winni

rsz

  • New Member
  • *
  • Posts: 45
Re: To optimize RTL, make function PCharToInt
« Reply #36 on: July 27, 2021, 07:56:19 pm »
1. A subrange of a shortstring is probably an array of char, so proc2 probably needs to be array of char not string
2. the arguments of proc1 (XS and StR) should probably be const.
1. It indeed seems to be and in the case of an overloaded function with the option of either string or array of char, the compiler chooses the one with array of char.
2. You're right they should be const and the compiler is able to optimize the string slice to const array of char to pass it as fast as passing a pointer with separate range as parameters. This is very good to know 8-)! Unfortunately the compiler does not seem to be able to optimize a string slice when passed to a const string parameter the same way. If it were able to do this then the copy function in the first post on the first page could be replaced with partial strings (subrange).

What is often forgetten is the ability of fpc to use only a slice of an array as parameter.
I didn't know about it up until yesterday :-[.
« Last Edit: July 27, 2021, 08:11:43 pm by rsz »

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: To optimize RTL, make function PCharToInt
« Reply #37 on: July 27, 2021, 08:23:15 pm »
@rsz

Hi!

As a string is nothing but an special array of char you can use the shown range for the parameter also with a string:

Code: Pascal  [Select][+][-]
  1. const hello = 'Hello fpc Users!';  
  2. ....
  3. showMessage (hello[7..9]);
  4.  
  5.  

This shows fpc

Winni

Kays

  • Hero Member
  • *****
  • Posts: 569
  • Whasup!?
    • KaiBurghardt.de
Re: To optimize RTL, make function PCharToInt
« Reply #38 on: July 27, 2021, 10:18:42 pm »
FPC already seems to have the concept of partial arrays and partial strings, in other languages they're called slices.

Unfortunately I can't seem to find much documentation about partial strings. […]
Those are substrings and they are standardized in ISO 10206 “Extended Pascal”. It’s an FPC extension that this syntax works on other arrays too.
Yours Sincerely
Kai Burghardt

rsz

  • New Member
  • *
  • Posts: 45
Re: To optimize RTL, make function PCharToInt
« Reply #39 on: July 28, 2021, 01:40:03 am »
Hi winni!

Thanks. I wasn't aware of it until yesterday but I understand how to use it. I'm trying to figure out how it works internally as it does not seem to be documented in the Free Pascal Reference Guide, see https://forum.lazarus.freepascal.org/index.php/topic,55472.msg413300.html#msg413300 . So far it seems that a slice of a string is handled like an array of char and when passed to a function/procedure which accepts a const string it has to do a conversion.

FPC already seems to have the concept of partial arrays and partial strings, in other languages they're called slices.

Unfortunately I can't seem to find much documentation about partial strings. […]
Those are substrings and they are standardized in ISO 10206 “Extended Pascal”. It’s an FPC extension that this syntax works on other arrays too.
I'll have a read, thank you.

In case anyone is interested, here are my benchmark results. Source code is attached.
The functions are declared as:
Code: Pascal  [Select][+][-]
  1. function FuncAcceptingString(const S: String): Integer;
  2. function FuncAcceptingArray(const S: array of Char): Integer;
  3. function FuncAcceptingPChar(const S: PChar; const FromIndex, ToIndex: Integer): Integer;
And the results are:
Quote
Running benchmarks 1000000000 times per function...

FuncAcceptingString(Copy(InputString, 2, 4))) finished in 18.967 seconds
FuncAcceptingString(InputString[2..5])) finished in 14.641 seconds
FuncAcceptingArray(InputArray[1..4])) finished in 1.776 seconds
FuncAcceptingArray(InputString[2..5])) finished in 1.751 seconds
FuncAcceptingPChar(PChar(InputString), 1, 4)) finished in 1.743 seconds

All accumulators are equal to: 212000000000
That passing a string with copy is the slowest is not really a surprise, but the overhead of a partial string passed to a function with const string compared to an array is rather unexpected when you've never used it before. I have expected it to perform more like passing a pointer.

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: To optimize RTL, make function PCharToInt
« Reply #40 on: July 28, 2021, 03:48:20 am »
[...] but the overhead of a partial string passed to a function with const string compared to an array is rather unexpected when you've never used it before. I have expected it to perform more like passing a pointer.

It's not so surprising; one just has to remember that despite the sintactic similarities the internal structure of strings is different from that of arrays: they always have that more or less complex record before the "char array" itself and the compiler has to build that (so in fact a whole new string) before using it as a parameter, it can't just pass an "array of char" where the function/procedure expects a full blown "String"
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5444
  • Compiler Developer
Re: To optimize RTL, make function PCharToInt
« Reply #41 on: July 28, 2021, 09:11:32 am »
FPC already seems to have the concept of partial arrays and partial strings, in other languages they're called slices.

Slices won't help here, because the relevant functions that Alextp mentioned (e.g. SameText or LowerCase) take a String and not an array of Char, thus there will be a copy of the string anyway.

[...] but the overhead of a partial string passed to a function with const string compared to an array is rather unexpected when you've never used it before. I have expected it to perform more like passing a pointer.

It's not so surprising; one just has to remember that despite the sintactic similarities the internal structure of strings is different from that of arrays: they always have that more or less complex record before the "char array" itself and the compiler has to build that (so in fact a whole new string) before using it as a parameter, it can't just pass an "array of char" where the function/procedure expects a full blown "String"

Just to clear up potential confusion:
  • dynamic arrays also have an internal data header that contains the length and the reference count, similar to strings
  • open arrays consist of the pointer to the start of the array and the highest index both passed as parameter to a function

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: To optimize RTL, make function PCharToInt
« Reply #42 on: July 29, 2021, 06:18:45 pm »

That is one of the first lessons in Pascal:

The parameter of a procedure working with a local copy.

Right. Unless it is a dynamic array parameter

If you want to avoid that change it to

 
Code: Pascal  [Select][+][-]
  1.   procedure Proc1( var XS: array of Integer);
  2.   begin
  3.     writeln(length(XS));
  4.   end;
  5.  
  6.  



That is much better


Now if local variables or result could be declared to be open arrays

And if there were typehelpers for (open) arrays





Slices won't help here, because the relevant functions that Alextp mentioned (e.g. SameText or LowerCase) take a String and not an array of Char, thus there will be a copy of the string anyway.

Then perhaps we need new overloaded functions

Especially an open array version of Val

PascalDragon

  • Hero Member
  • *****
  • Posts: 5444
  • Compiler Developer
Re: To optimize RTL, make function PCharToInt
« Reply #43 on: July 30, 2021, 09:13:34 am »
Now if local variables or result could be declared to be open arrays

No. Those would need to be allocated on the heap (especially the return value) and then you can just as well use a dynamic array.

Slices won't help here, because the relevant functions that Alextp mentioned (e.g. SameText or LowerCase) take a String and not an array of Char, thus there will be a copy of the string anyway.

Then perhaps we need new overloaded functions

No, we don't. This is also about maintainability. And adding functions for everything and the kitchen sink is not maintainable. Also the main types for strings in Object Pascal are the various String types and not arrays of Char.

Especially an open array version of Val

And what would such a Val do?

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: To optimize RTL, make function PCharToInt
« Reply #44 on: July 30, 2021, 10:40:42 am »

No, we don't. This is also about maintainability. And adding functions for everything and the kitchen sink is not maintainable. Also the main types for strings in Object Pascal are the various String types and not arrays of Char.

Thanx to PascalDragon!

And for those who are eager to optimize the whole day and night:

Pascal is not the right choice.
Use Forth.
There you can optimize everything.
On speed. On size. On everything.

Winni

 

TinyPortal © 2005-2018