Lazarus

Free Pascal => General => Topic started by: Alextp on September 29, 2020, 11:20:24 pm

Title: Small speedup for FExpand function
Post by: Alextp on September 29, 2020, 11:20:24 pm
fpcsrc/rtl/inc/fexpand.inc

Block
Code: Pascal  [Select][+][-]
  1. {$IFNDEF FPC_FEXPAND_DIRSEP_IS_CURDIR}
  2.  {$IFNDEF FPC_FEXPAND_DIRSEP_IS_UPDIR}
  3.     {Before anything else, remove doubled DirectorySeparator characters
  4.      - technically invalid or at least useless, but ignored by most operating
  5.      systems except for plain DOS.}
  6.     I := Pos (DirectorySeparator + DirectorySeparator, Dirs);
  7.     while I <> 0 do
  8.         begin
  9.             J := Succ (I);
  10.             while (Length (Dirs) > J) and (Dirs [Succ (J)] = DirectorySeparator) do
  11.                 Inc (J);
  12.             Delete (Dirs, Succ (I), J - I);
  13.             I := Pos (DirectorySeparator + DirectorySeparator, Dirs);
  14.         end;
  15.  {$ENDIF FPC_FEXPAND_DIRSEP_IS_UPDIR}
  16. {$ENDIF FPC_FEXPAND_DIRSEP_IS_CURDIR}
  17.  

Here we can use PosEx(.., I) in 'while' to find next DirectorySeparator pair from previous one.

And 4+ similar blocks below that one.
Title: Re: Small speedup for FExpand function
Post by: jamie on September 30, 2020, 03:01:03 am
if you are looking for a speed up in execution then I would use the source string and simply have two indexes, one the trailing index and the second the head index reading the remainder of the line..

 When complete, you reset the line length using the Trailing index..

 the idea is to find a separator and just keep incrementing both indexes until you find one, then if you find another index after that you only increment the HEAD index until you find no more..

 In the process of doing all of this you will be replacing the Trailing characters with the characters the HEAD index finds.

 I can provide an example if you wish.
Title: Re: Small speedup for FExpand function
Post by: PascalDragon on September 30, 2020, 09:56:40 am
Here we can use PosEx(.., I) in 'while' to find next DirectorySeparator pair from previous one.

No, we can't, because PosEx is part of the unit StrUtils while the code of fexpand.inc is used in both the Dos and SysUtils unit and StrUtils uses the later in its interface section.
Title: Re: Small speedup for FExpand function
Post by: marcov on September 30, 2020, 10:23:50 am
Here we can use PosEx(.., I) in 'while' to find next DirectorySeparator pair from previous one.

No, we can't, because PosEx is part of the unit StrUtils while the code of fexpand.inc is used in both the Dos and SysUtils unit and StrUtils uses the later in its interface section.

True, but pos with offset is part of system since 3.2.  As soon as 3.0.4 bootstrapping is no longer supported, this can be switched (or needs an ifdef).
Title: Re: Small speedup for FExpand function
Post by: PascalDragon on September 30, 2020, 10:36:49 am
True, but pos with offset is part of system since 3.2.  As soon as 3.0.4 bootstrapping is no longer supported, this can be switched (or needs an ifdef).

Better wait until we've dropped 3.0.4 support, less code clutter that way as fexpand.inc is already full of it. ;)
Title: Re: Small speedup for FExpand function
Post by: jamie on October 01, 2020, 01:41:27 am
here's a base Idea..

Code: Pascal  [Select][+][-]
  1.  
  2. Procedure RepeatCharToSingleChar(Var AStr:String; aRp:Char);
  3. Var
  4.  T,H,L:Integer;
  5. begin
  6.  T:=1;H:=1;
  7.  L:= Length(AStr);
  8.  While H<=L do
  9.   Begin
  10.     AStr[T]:=AStr[H];
  11.     Inc(T);
  12.     IF AStr[H]=aRp Then
  13.     Repeat
  14.      Inc(H);
  15.     until (H>L)or(AStr[H]<>aRp) else inc(H);
  16.   end;
  17. SetLength(AStr, T-1);
  18. end;
  19.  
  20. procedure TForm1.Button1Click(Sender: TObject);
  21. var
  22.   Dirs:String = '\\1\\2\\3\\4\5\6';
  23. begin
  24.  RepeatCharToSingleChar(Dirs,DirectorySeparator);
  25.  Caption := Dirs;
  26. end;
  27.                                              
  28.  
That will remove all repeating chars indicated in the call.
and its done very effectively without using inner functions, deletes etc..

of course this can be enhanced but I think the idea is conveyed
Title: Re: Small speedup for FExpand function
Post by: PascalDragon on October 01, 2020, 09:08:33 am
here's a base Idea..

Code: Pascal  [Select][+][-]
  1.  
  2. Procedure RepeatCharToSingleChar(Var AStr:String; aRp:Char);
  3. Var
  4.  T,H,L:Integer;
  5. begin
  6.  T:=1;H:=1;
  7.  L:= Length(AStr);
  8.  While H<=L do
  9.   Begin
  10.     AStr[T]:=AStr[H];
  11.     Inc(T);
  12.     IF AStr[H]=aRp Then
  13.     Repeat
  14.      Inc(H);
  15.     until (H>L)or(AStr[H]<>aRp) else inc(H);
  16.   end;
  17. SetLength(AStr, T-1);
  18. end;
  19.  
  20. procedure TForm1.Button1Click(Sender: TObject);
  21. var
  22.   Dirs:String = '\\1\\2\\3\\4\5\6';
  23. begin
  24.  RepeatCharToSingleChar(Dirs,DirectorySeparator);
  25.  Caption := Dirs;
  26. end;
  27.                                              
  28.  
That will remove all repeating chars indicated in the call.
and its done very effectively without using inner functions, deletes etc..

You should probably check whether T <> H before doing the assignment, as in most cases this will be False and thus the assignment might be more expensive than the check.
Title: Re: Small speedup for FExpand function
Post by: jamie on October 01, 2020, 11:25:58 pm
You are talking about resetting the length of the string ?

if T <> H then SetLength(…)
?
Title: Re: Small speedup for FExpand function
Post by: Martin_fr on October 01, 2020, 11:43:09 pm
You should probably check whether T <> H before doing the assignment, as in most cases this will be False and thus the assignment might be more expensive than the check.
Or start by finding the first occurrence "pos()" and then go from there.

Also assigning chars to a string, I assume that the code will check for copy-on-write (refcount>1) all the time, despite it is only needed once? (Or does the compiler optimize that?)

So then, if (and only if) you got the first occurrence of the duplicated char, make the string unique, then use pchar.
Title: Re: Small speedup for FExpand function
Post by: jamie on October 02, 2020, 02:09:14 am
It was a base idea which I use many times to delete specific items in a string without using Delete, POS which just piles up the code and CPU time.

 When performing such operations like this, there is no need to call functions or create another string if all you want to do is remove something in multiple places..

 Using a 2 PChars and incrementing the pointer would also be a speed increase for sure.

 The function isn't needed, I just did that to separate it from the code, it could also be marked "inline" to save on the stack operations.

 Since we are dealing with both Single and Wide chars, I guess an overload version would also need to be made.

Title: Re: Small speedup for FExpand function
Post by: jamie on October 02, 2020, 02:38:40 am
Code: Pascal  [Select][+][-]
  1. Procedure RepeatToSingleChar(Var Str:String; aRp:Char);  Inline;
  2. var
  3.   T,H,L:Pchar;
  4. Begin
  5.   T := Pchar(Str);
  6.   H := T;
  7.   IF H <> Nil Then
  8.   While (H^<>#0) do
  9.    Begin
  10.      T^ := H^;
  11.      Inc(T);
  12.      If H^=aRp THen
  13.      Repeat
  14.       Inc(H);
  15.      Until (H^=#0)or(H^<>aRp) else Inc(H);
  16.    end;
  17.  IF T<>H then
  18.   SetLength(Str,Length(Str)-(H-T));
  19. end;                                
  20.  
  21.  

This looks good with the 3.0.4 fpc, didn't try it with 3.2 and up but it actually "Inlines" the code so this can actually be used to remove the call to the procedure
Title: Re: Small speedup for FExpand function
Post by: Martin_fr on October 02, 2020, 04:11:06 am
When you find the first duplicate char, you need to call UniqueString.

Access via pchar, does not trigger copy-on-write.
So if
Code: Pascal  [Select][+][-]
  1. var a,b: String;
  2. begin
  3.   a := GetValue;
  4.   b := a; // reference to same memory
  5.   RepeatToSingleChar(a, '/');
  6.  
will modify a and b.
(Except the SetLength applies only to a.)
Title: Re: Small speedup for FExpand function
Post by: Martin_fr on October 02, 2020, 04:33:53 am
Something roughly like this
Code: Pascal  [Select][+][-]
  1.     Procedure RepeatToSingleChar(Var Str:String; aRp:Char);  Inline;
  2.     var
  3.       T,H,L:Pchar;
  4.       i: integer;
  5.     Begin
  6.       T := Pchar(Str);
  7.       if T = nil then
  8.         exit;
  9.  
  10.       while (T^ <> #0) do
  11.         if (T^ <> aRp) then begin
  12.           inc(T);
  13.         end else begin
  14.           inc(T);
  15.           if (T^ = aRp) then break
  16.         end;
  17.       if T^ = #0 then
  18.         exit;
  19.  
  20.       i := T - Pchar(Str);
  21.       UniqueString(Str);
  22.       T := i + Pchar(Str);
  23.       H := T;
  24.  
  25.       Repeat
  26.        Inc(H);
  27.       Until (H^=#0)or(H^<>aRp);
  28.  
  29.       While (H^<>#0) do
  30.        Begin
  31.          T^ := H^;
  32.          Inc(T);
  33.          If H^=aRp THen
  34.            Repeat
  35.             Inc(H);
  36.            Until (H^=#0)or(H^<>aRp)
  37.          else
  38.            Inc(H);
  39.        end;
  40.       SetLength(Str,Length(Str)-(H-T));
  41.     end;
  42.  
  43.  
Title: Re: Small speedup for FExpand function
Post by: jamie on October 02, 2020, 01:28:55 pm
It looks like u didn't even run my code?
U are making a mountain out of a mole hill.

If u pull a string from a constant then u simply use uniquestring once on the string before entering the code.

There is no issue changing contents of a string like I did.
I actually tested the code before posting it.
 8-)
Title: Re: Small speedup for FExpand function
Post by: Martin_fr on October 02, 2020, 02:11:25 pm
I actually tested the code before posting it.

So did I:

Code: Pascal  [Select][+][-]
  1. var
  2.       Dirs:String = 'b\a\\1\\2\\3\\4\5\6789';
  3.       Dirs2:String;
  4.  
  5. begin
  6.     dirs := copy(dirs,1,111);
  7.     Dirs2:=dirs;
  8.      RepeatToSingleChar(Dirs,DirectorySeparator);
  9.      writeln( Dirs);
  10.      writeln( Dirs2);
  11.      readln;
  12. end.
prints
Code: Pascal  [Select][+][-]
  1. b\a\1\2\3\4\5\6789
  2. b\a\1\2\3\4\5\67896789

But "b" should still have the "\\" in it.

(fpc 3.0.4 / but that should not matter)


Quote
If u pull a string from a constant then u simply use uniquestring once on the string before entering the code.
Well, that is what I wrote. => Only that needs to be in the subroutine.

Because then, this needs only happen, if there needs to be any modification at all.
So if the string has no dups, you save the time of copying it around in memory (in case it had more than one ref)


As for mountain vs "mole hill" => I did not classify it.
I simply stated what happens

I posted my own suggestion, independent on it (and yes, it is based on your code.)
Title: Re: Small speedup for FExpand function
Post by: Bart on October 02, 2020, 02:22:06 pm
How relevant is this discussion about speeding up FExpand?
Since this code is primarily used in preparation of some disk IO, which is orders of magnitude slower than string handling.
So it is very unlikely that FExpand will ever be the bottleneck in your program.

IMO it is better to have good readable code then pre-mature optimized code that is harder to follow (and potentially therefore more prone to new bugs).

Bart
Title: Re: Small speedup for FExpand function
Post by: Martin_fr on October 02, 2020, 02:31:57 pm
How relevant is this discussion about speeding up FExpand?
Since this code is primarily used in preparation of some disk IO, which is orders of magnitude slower than string handling.
So it is very unlikely that FExpand will ever be the bottleneck in your program.

IMO it is better to have good readable code then pre-mature optimized code that is harder to follow (and potentially therefore more prone to new bugs).

I don't really know. But I dont think code like my example is really needed. I only meant it as proof of concept.

I can see it getting used heavily, if file locations are compared (if you have a list of pathes).
But: pathes are usually short. I do not even know, how long a path would be acceptable by most OS.

On a string of 1000 chars the speed diff will probably be negligible.


Title: Re: Small speedup for FExpand function
Post by: PascalDragon on October 02, 2020, 02:35:24 pm
How relevant is this discussion about speeding up FExpand?
Since this code is primarily used in preparation of some disk IO, which is orders of magnitude slower than string handling.
So it is very unlikely that FExpand will ever be the bottleneck in your program.

IMO it is better to have good readable code then pre-mature optimized code that is harder to follow (and potentially therefore more prone to new bugs).

No one can say in what situation the code of fexpand.inc is used, because e.g. ExpandFileName does not do disk I/O by itself (aside from what the code in fexpand.inc itself might do). So while optimizing fexpand.inc might not help everyone, due to a potentially following disk I/O being more expensive, there are definitely cases where it does. If it should for example result in smaller/faster code for embedded platforms that would be a plus, and memory allocations (the Delete in the current code) is definitely more expensive than simply adjusting pointers.
Title: Re: Small speedup for FExpand function
Post by: ASerge on October 02, 2020, 09:28:36 pm
Sometimes there is no need to look for a solution like the C language, we can use regular Pascal, without any #0:
Code: Pascal  [Select][+][-]
  1. procedure CompressDuplicatesToOne(var InStr: string; Sample: Char);
  2. var
  3.   SLength: SizeInt;
  4.   WritePos: SizeInt;
  5.   ReadPos: SizeInt;
  6. begin
  7.   SLength := Length(InStr);
  8.   ReadPos := 1;
  9.   WritePos := 1;
  10.   while ReadPos <= SLength do
  11.   begin
  12.     InStr[WritePos] := InStr[ReadPos];
  13.     Inc(WritePos);
  14.     if InStr[ReadPos] <> Sample then
  15.       Inc(ReadPos)
  16.     else
  17.       repeat
  18.         Inc(ReadPos);
  19.       until (ReadPos > SLength) or (InStr[ReadPos] <> Sample);
  20.   end;
  21.   if WritePos < ReadPos then
  22.     SetLength(InStr, WritePos - 1);
  23. end;
In line 12, the compiler implicitly calls the UniqueString function.
We can optimize this by reducing the readability of the code. We will also replace the value of the WritePos variable, since we now need to use it for PChar:
Code: Pascal  [Select][+][-]
  1. var
  2.   SLength: SizeInt;
  3.   WritePosZeroBase: SizeInt; // Zero index
  4.   ReadPos: SizeInt;
  5. begin
  6.   UniqueString(InStr); // Call only once
  7.   SLength := Length(InStr);
  8.   ReadPos := 1;
  9.   WritePosZeroBase := 0;
  10.   while ReadPos <= SLength do
  11.   begin
  12.     // Cast to PChar to eliminate implicit UniqueString call for InStr[i]:=
  13.     (PChar(Pointer(InStr)) + WritePosZeroBase)^ := InStr[ReadPos];
  14.     Inc(WritePosZeroBase);
  15.     if InStr[ReadPos] <> Sample then
  16.       Inc(ReadPos)
  17.     else
  18.       repeat
  19.         Inc(ReadPos);
  20.       until (ReadPos > SLength) or (InStr[ReadPos] <> Sample);
  21.   end;
  22.   if (WritePosZeroBase + 1) < ReadPos then
  23.     SetLength(InStr, WritePosZeroBase);
  24. end;
Given the fact that modern processors work equally fast with or without index addressing, this code, I think, will be no worse in speed than the code with bringing everything to PChar. It also works for strings containing the #0 character.
At least the assembler text on x64 looks optimal:
Code: ASM  [Select][+][-]
  1. # [57] UniqueString(InStr);
  2.         movq    %rbx,%rcx
  3.         call    FPC_ANSISTR_UNIQUE
  4. .Ll19:
  5. # [58] SLength := Length(InStr);
  6.         movq    (%rbx),%rax
  7.         testq   %rax,%rax
  8.         je      .Lj22
  9.         movq    -8(%rax),%rax
  10. .Lj22:
  11. # Var SLength located in register rax
  12. # Var ReadPos located in register r9
  13. .Ll20:
  14. # [59] ReadPos := 1;
  15.         movl    $1,%r9d
  16. # Var WritePosZeroBase located in register rdx
  17. .Ll21:
  18. # [60] WritePosZeroBase := 0;
  19.         xorl    %edx,%edx
  20. .Ll22:
  21. # [61] while ReadPos <= SLength do
  22.         jmp     .Lj24
  23.         .balign 8,0x90
  24. .Lj23:
  25. .Ll23:
  26.         movq    (%rbx),%rcx
  27. .Ll24:
  28. # [64] (PChar(Pointer(InStr)) + WritePosZeroBase)^ := InStr[ReadPos];
  29.         leaq    (%rcx,%rdx),%r8
  30.         movb    -1(%rcx,%r9,1),%cl
  31.         movb    %cl,(%r8)
  32. .Ll25:
  33. # [65] Inc(WritePosZeroBase);
  34.         addq    $1,%rdx
  35. .Ll26:
  36. # [66] if InStr[ReadPos] <> Sample then
  37.         movq    (%rbx),%rcx
  38.         cmpb    -1(%rcx,%r9,1),%sil
  39.         je      .Lj27
  40. .Ll27:
  41. # [67] Inc(ReadPos)
  42.         addq    $1,%r9
  43.         jmp     .Lj28
  44. .Lj27:
  45.         .balign 8,0x90
  46. .Lj29:
  47. .Ll28:
  48. # [70] Inc(ReadPos);
  49.         addq    $1,%r9
  50. .Ll29:
  51. # [71] until (ReadPos > SLength) or (InStr[ReadPos] <> Sample);
  52.         cmpq    %r9,%rax
  53.         jl      .Lj31
  54.         movq    (%rbx),%rcx
  55.         cmpb    -1(%rcx,%r9,1),%sil
  56.         je      .Lj29
  57. .Lj31:
  58. .Lj28:
  59. .Lj24:
  60. .Ll30:
  61.         cmpq    %r9,%rax
  62.         jge     .Lj23
  63. .Ll31:
  64. # [73] if (WritePosZeroBase + 1) < ReadPos then
  65.         leaq    1(%rdx),%rax
  66.         cmpq    %r9,%rax
  67.         jnl     .Lj36
  68. .Ll32:
  69. # [74] SetLength(InStr, WritePosZeroBase);
  70.         movq    %rbx,%rcx
  71.         xorl    %r8d,%r8d
  72.         call    fpc_ansistr_setlength
  73.         .balign 4,0x90
  74. .Lj36:
  75. .Ll33:
  76. # [75] end;
Title: Re: Small speedup for FExpand function
Post by: marcov on October 02, 2020, 09:37:46 pm
Given the fact that modern processors work equally fast with or without index addressing, this code, I think, will be no worse in speed than the code with bringing everything to PChar. It also works for strings containing the #0

Recent processors re-valuate string functions. Modern processors can process up to 100 bytes per tick (!). While even with uopcache, 6 uops per tick is the maximum

Maybe we need an IndexNByte for this code:

Code: Pascal  [Select][+][-]
  1.  repeat
  2.         Inc(ReadPos);
  3.       until (ReadPos > SLength) or (InStr[ReadPos] <> Sample);

Title: Re: Small speedup for FExpand function
Post by: ASerge on October 03, 2020, 10:17:47 am
Maybe we need an IndexNByte for this code:

Code: Pascal  [Select][+][-]
  1.  repeat
  2.         Inc(ReadPos);
  3.       until (ReadPos > SLength) or (InStr[ReadPos] <> Sample);
Thinking is not necessary, in practice there are not many identical consecutive characters, but the IndexByte function is very long compared to the code:
Code: ASM  [Select][+][-]
  1. .Lj29:
  2. .Ll28:
  3. # [70] Inc(ReadPos);
  4.         addq    $1,%r9
  5. .Ll29:
  6. # [71] until (ReadPos > SLength) or (InStr[ReadPos] <> Sample);
  7.         cmpq    %r9,%rax
  8.         jl      .Lj31
  9.         movq    (%rbx),%rcx
  10.         cmpb    -1(%rcx,%r9,1),%sil
  11.         je      .Lj29
  12. .Lj31:
Title: Re: Small speedup for FExpand function
Post by: marcov on October 03, 2020, 03:32:47 pm
Yeah, and then you get to the topic what CPU such routines should optimize for. Which is not easy.
TinyPortal © 2005-2018