Recent

Author Topic: Small speedup for FExpand function  (Read 3550 times)

AlexTP

  • Hero Member
  • *****
  • Posts: 2384
    • UVviewsoft
Small speedup for FExpand function
« on: September 29, 2020, 11:20:24 pm »
fpcsrc/rtl/inc/fexpand.inc

Block
Code: Pascal  [Select][+][-]
  1. {$IFNDEF FPC_FEXPAND_DIRSEP_IS_CURDIR}
  2.  {$IFNDEF FPC_FEXPAND_DIRSEP_IS_UPDIR}
  3.     {Before anything else, remove doubled DirectorySeparator characters
  4.      - technically invalid or at least useless, but ignored by most operating
  5.      systems except for plain DOS.}
  6.     I := Pos (DirectorySeparator + DirectorySeparator, Dirs);
  7.     while I <> 0 do
  8.         begin
  9.             J := Succ (I);
  10.             while (Length (Dirs) > J) and (Dirs [Succ (J)] = DirectorySeparator) do
  11.                 Inc (J);
  12.             Delete (Dirs, Succ (I), J - I);
  13.             I := Pos (DirectorySeparator + DirectorySeparator, Dirs);
  14.         end;
  15.  {$ENDIF FPC_FEXPAND_DIRSEP_IS_UPDIR}
  16. {$ENDIF FPC_FEXPAND_DIRSEP_IS_CURDIR}
  17.  

Here we can use PosEx(.., I) in 'while' to find next DirectorySeparator pair from previous one.

And 4+ similar blocks below that one.
« Last Edit: September 29, 2020, 11:24:51 pm by Alextp »

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Small speedup for FExpand function
« Reply #1 on: September 30, 2020, 03:01:03 am »
if you are looking for a speed up in execution then I would use the source string and simply have two indexes, one the trailing index and the second the head index reading the remainder of the line..

 When complete, you reset the line length using the Trailing index..

 the idea is to find a separator and just keep incrementing both indexes until you find one, then if you find another index after that you only increment the HEAD index until you find no more..

 In the process of doing all of this you will be replacing the Trailing characters with the characters the HEAD index finds.

 I can provide an example if you wish.
The only true wisdom is knowing you know nothing

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Small speedup for FExpand function
« Reply #2 on: September 30, 2020, 09:56:40 am »
Here we can use PosEx(.., I) in 'while' to find next DirectorySeparator pair from previous one.

No, we can't, because PosEx is part of the unit StrUtils while the code of fexpand.inc is used in both the Dos and SysUtils unit and StrUtils uses the later in its interface section.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: Small speedup for FExpand function
« Reply #3 on: September 30, 2020, 10:23:50 am »
Here we can use PosEx(.., I) in 'while' to find next DirectorySeparator pair from previous one.

No, we can't, because PosEx is part of the unit StrUtils while the code of fexpand.inc is used in both the Dos and SysUtils unit and StrUtils uses the later in its interface section.

True, but pos with offset is part of system since 3.2.  As soon as 3.0.4 bootstrapping is no longer supported, this can be switched (or needs an ifdef).

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Small speedup for FExpand function
« Reply #4 on: September 30, 2020, 10:36:49 am »
True, but pos with offset is part of system since 3.2.  As soon as 3.0.4 bootstrapping is no longer supported, this can be switched (or needs an ifdef).

Better wait until we've dropped 3.0.4 support, less code clutter that way as fexpand.inc is already full of it. ;)

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Small speedup for FExpand function
« Reply #5 on: October 01, 2020, 01:41:27 am »
here's a base Idea..

Code: Pascal  [Select][+][-]
  1.  
  2. Procedure RepeatCharToSingleChar(Var AStr:String; aRp:Char);
  3. Var
  4.  T,H,L:Integer;
  5. begin
  6.  T:=1;H:=1;
  7.  L:= Length(AStr);
  8.  While H<=L do
  9.   Begin
  10.     AStr[T]:=AStr[H];
  11.     Inc(T);
  12.     IF AStr[H]=aRp Then
  13.     Repeat
  14.      Inc(H);
  15.     until (H>L)or(AStr[H]<>aRp) else inc(H);
  16.   end;
  17. SetLength(AStr, T-1);
  18. end;
  19.  
  20. procedure TForm1.Button1Click(Sender: TObject);
  21. var
  22.   Dirs:String = '\\1\\2\\3\\4\5\6';
  23. begin
  24.  RepeatCharToSingleChar(Dirs,DirectorySeparator);
  25.  Caption := Dirs;
  26. end;
  27.                                              
  28.  
That will remove all repeating chars indicated in the call.
and its done very effectively without using inner functions, deletes etc..

of course this can be enhanced but I think the idea is conveyed
The only true wisdom is knowing you know nothing

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Small speedup for FExpand function
« Reply #6 on: October 01, 2020, 09:08:33 am »
here's a base Idea..

Code: Pascal  [Select][+][-]
  1.  
  2. Procedure RepeatCharToSingleChar(Var AStr:String; aRp:Char);
  3. Var
  4.  T,H,L:Integer;
  5. begin
  6.  T:=1;H:=1;
  7.  L:= Length(AStr);
  8.  While H<=L do
  9.   Begin
  10.     AStr[T]:=AStr[H];
  11.     Inc(T);
  12.     IF AStr[H]=aRp Then
  13.     Repeat
  14.      Inc(H);
  15.     until (H>L)or(AStr[H]<>aRp) else inc(H);
  16.   end;
  17. SetLength(AStr, T-1);
  18. end;
  19.  
  20. procedure TForm1.Button1Click(Sender: TObject);
  21. var
  22.   Dirs:String = '\\1\\2\\3\\4\5\6';
  23. begin
  24.  RepeatCharToSingleChar(Dirs,DirectorySeparator);
  25.  Caption := Dirs;
  26. end;
  27.                                              
  28.  
That will remove all repeating chars indicated in the call.
and its done very effectively without using inner functions, deletes etc..

You should probably check whether T <> H before doing the assignment, as in most cases this will be False and thus the assignment might be more expensive than the check.

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Small speedup for FExpand function
« Reply #7 on: October 01, 2020, 11:25:58 pm »
You are talking about resetting the length of the string ?

if T <> H then SetLength(…)
?
The only true wisdom is knowing you know nothing

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9791
  • Debugger - SynEdit - and more
    • wiki
Re: Small speedup for FExpand function
« Reply #8 on: October 01, 2020, 11:43:09 pm »
You should probably check whether T <> H before doing the assignment, as in most cases this will be False and thus the assignment might be more expensive than the check.
Or start by finding the first occurrence "pos()" and then go from there.

Also assigning chars to a string, I assume that the code will check for copy-on-write (refcount>1) all the time, despite it is only needed once? (Or does the compiler optimize that?)

So then, if (and only if) you got the first occurrence of the duplicated char, make the string unique, then use pchar.

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Small speedup for FExpand function
« Reply #9 on: October 02, 2020, 02:09:14 am »
It was a base idea which I use many times to delete specific items in a string without using Delete, POS which just piles up the code and CPU time.

 When performing such operations like this, there is no need to call functions or create another string if all you want to do is remove something in multiple places..

 Using a 2 PChars and incrementing the pointer would also be a speed increase for sure.

 The function isn't needed, I just did that to separate it from the code, it could also be marked "inline" to save on the stack operations.

 Since we are dealing with both Single and Wide chars, I guess an overload version would also need to be made.

The only true wisdom is knowing you know nothing

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Small speedup for FExpand function
« Reply #10 on: October 02, 2020, 02:38:40 am »
Code: Pascal  [Select][+][-]
  1. Procedure RepeatToSingleChar(Var Str:String; aRp:Char);  Inline;
  2. var
  3.   T,H,L:Pchar;
  4. Begin
  5.   T := Pchar(Str);
  6.   H := T;
  7.   IF H <> Nil Then
  8.   While (H^<>#0) do
  9.    Begin
  10.      T^ := H^;
  11.      Inc(T);
  12.      If H^=aRp THen
  13.      Repeat
  14.       Inc(H);
  15.      Until (H^=#0)or(H^<>aRp) else Inc(H);
  16.    end;
  17.  IF T<>H then
  18.   SetLength(Str,Length(Str)-(H-T));
  19. end;                                
  20.  
  21.  

This looks good with the 3.0.4 fpc, didn't try it with 3.2 and up but it actually "Inlines" the code so this can actually be used to remove the call to the procedure
The only true wisdom is knowing you know nothing

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9791
  • Debugger - SynEdit - and more
    • wiki
Re: Small speedup for FExpand function
« Reply #11 on: October 02, 2020, 04:11:06 am »
When you find the first duplicate char, you need to call UniqueString.

Access via pchar, does not trigger copy-on-write.
So if
Code: Pascal  [Select][+][-]
  1. var a,b: String;
  2. begin
  3.   a := GetValue;
  4.   b := a; // reference to same memory
  5.   RepeatToSingleChar(a, '/');
  6.  
will modify a and b.
(Except the SetLength applies only to a.)

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9791
  • Debugger - SynEdit - and more
    • wiki
Re: Small speedup for FExpand function
« Reply #12 on: October 02, 2020, 04:33:53 am »
Something roughly like this
Code: Pascal  [Select][+][-]
  1.     Procedure RepeatToSingleChar(Var Str:String; aRp:Char);  Inline;
  2.     var
  3.       T,H,L:Pchar;
  4.       i: integer;
  5.     Begin
  6.       T := Pchar(Str);
  7.       if T = nil then
  8.         exit;
  9.  
  10.       while (T^ <> #0) do
  11.         if (T^ <> aRp) then begin
  12.           inc(T);
  13.         end else begin
  14.           inc(T);
  15.           if (T^ = aRp) then break
  16.         end;
  17.       if T^ = #0 then
  18.         exit;
  19.  
  20.       i := T - Pchar(Str);
  21.       UniqueString(Str);
  22.       T := i + Pchar(Str);
  23.       H := T;
  24.  
  25.       Repeat
  26.        Inc(H);
  27.       Until (H^=#0)or(H^<>aRp);
  28.  
  29.       While (H^<>#0) do
  30.        Begin
  31.          T^ := H^;
  32.          Inc(T);
  33.          If H^=aRp THen
  34.            Repeat
  35.             Inc(H);
  36.            Until (H^=#0)or(H^<>aRp)
  37.          else
  38.            Inc(H);
  39.        end;
  40.       SetLength(Str,Length(Str)-(H-T));
  41.     end;
  42.  
  43.  

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Small speedup for FExpand function
« Reply #13 on: October 02, 2020, 01:28:55 pm »
It looks like u didn't even run my code?
U are making a mountain out of a mole hill.

If u pull a string from a constant then u simply use uniquestring once on the string before entering the code.

There is no issue changing contents of a string like I did.
I actually tested the code before posting it.
 8-)
The only true wisdom is knowing you know nothing

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9791
  • Debugger - SynEdit - and more
    • wiki
Re: Small speedup for FExpand function
« Reply #14 on: October 02, 2020, 02:11:25 pm »
I actually tested the code before posting it.

So did I:

Code: Pascal  [Select][+][-]
  1. var
  2.       Dirs:String = 'b\a\\1\\2\\3\\4\5\6789';
  3.       Dirs2:String;
  4.  
  5. begin
  6.     dirs := copy(dirs,1,111);
  7.     Dirs2:=dirs;
  8.      RepeatToSingleChar(Dirs,DirectorySeparator);
  9.      writeln( Dirs);
  10.      writeln( Dirs2);
  11.      readln;
  12. end.
prints
Code: Pascal  [Select][+][-]
  1. b\a\1\2\3\4\5\6789
  2. b\a\1\2\3\4\5\67896789

But "b" should still have the "\\" in it.

(fpc 3.0.4 / but that should not matter)


Quote
If u pull a string from a constant then u simply use uniquestring once on the string before entering the code.
Well, that is what I wrote. => Only that needs to be in the subroutine.

Because then, this needs only happen, if there needs to be any modification at all.
So if the string has no dups, you save the time of copying it around in memory (in case it had more than one ref)


As for mountain vs "mole hill" => I did not classify it.
I simply stated what happens

I posted my own suggestion, independent on it (and yes, it is based on your code.)
« Last Edit: October 02, 2020, 02:15:39 pm by Martin_fr »

 

TinyPortal © 2005-2018