Recent

Author Topic: Contest: fastest IsAnagram function  (Read 9485 times)

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #15 on: October 24, 2024, 10:49:45 am »
sort(a) = sort(b) seems to be the fastest. Given or $20. Can't see what can be faster.
Of course you will prove me wrong.... O:-)
The sort itself does not matter too much on strings below 255. Provided you test length first.
« Last Edit: October 24, 2024, 11:04:01 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #16 on: October 24, 2024, 11:11:10 am »
Using C'sms must be slower, I will have a go.. The sort is good, but even a bubble sort would beat it on short strings...
Aarg, first more coffee.(why do you need the moves?)
Since Bart wanted cross-platform, and no fancy stuff (like calling external functions in  Libs), i remembered, that i did a direct "1:1"-Port of StrSpn to FreePascal.
And StrSpn is, i believe, one of the fastest ways to check for illegal characters.

sort(a) = sort(b) seems to be the fastest. Given or $20. Can't see what can be faster.
Of course you will prove me wrong.... O:-)
The sort itself does not matter too much on strings below 255. Provided you test length first.
Yes, i do test the Length of both Strings first (Line 104)

As i said: Just fooling around. Could admittedly be improved. (Instead of creating the legal Chars in a loop making a Constant etc.)
« Last Edit: October 24, 2024, 11:14:54 am by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #17 on: October 24, 2024, 11:12:57 am »
Zvoni, what do you think I am doing.. 8-)
Mind my very old school trick...
« Last Edit: October 24, 2024, 11:14:44 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #18 on: October 24, 2024, 11:15:35 am »
Zvoni, what do you think I am doing.. 8-)
Mind my very old school trick...
Uhmm....you lost me....
And i admit: WTF is $20? 20 Dollars?
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #19 on: October 24, 2024, 12:09:16 pm »
Orring ascii codes with #32 or $20 end up with the same value: that trick is probably older than me....
You are simply not old enough.. ;) ;)
I thought everybody knew that.
I learned that trick from K&R.
( can also test for bit 6 in an 8 bit char )
« Last Edit: October 24, 2024, 12:20:50 pm by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #20 on: October 24, 2024, 12:30:10 pm »
Orring ascii codes with #32 or $20 end up with the same value: that trick is probably older than me....
You are simply not old enough.. ;) ;)
I thought everybody knew that.
I learned that trick from K&R.
( can also test for bit 6 in an 8 bit char )
Ahhh..... Hex $20....got you.
Duh!
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #21 on: October 24, 2024, 12:34:30 pm »
Zvoni, speed it up, otherwise I really have a go. :-[ %)
If I smell bad code it usually is bad code and that includes my own code.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 10553
  • Debugger - SynEdit - and more
    • wiki
Re: Contest: fastest IsAnagram function
« Reply #22 on: October 24, 2024, 12:45:11 pm »
Is there some test data?

Well, I would say the official test data should not be published, so all implementations have to be generic.

- But there should be some sample tests (so everyone can give there times)
- There should be some parameters. E.g the final test data should have
  - long and short test strings (max length = ???? / maybe > 1000 or even > 10k or 100k ?)
  - differ just in one char / differ in many char


And what compiler will be used for final testing / what compiler options ?
« Last Edit: October 24, 2024, 12:52:20 pm by Martin_fr »

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #23 on: October 24, 2024, 12:48:39 pm »
Martin do we have a debugger option for bit 6?
If I smell bad code it usually is bad code and that includes my own code.

Fibonacci

  • Hero Member
  • *****
  • Posts: 604
  • Internal Error Hunter
Re: Contest: fastest IsAnagram function
« Reply #24 on: October 24, 2024, 12:49:48 pm »
@Zvoni: I cant run your code. Could someone check it? It crashes on both trunk and stable 3.2.2.

Code: Text  [Select][+][-]
  1. Project tests raised exception class 'External: ACCESS VIOLATION' with message:
  2. Access violation writing to address $000000010001B0C0.
  3.  
  4.  In file 'tests.lpr' at line 167:
  5. AI[Lo] := AI[Hi];

Or with 32 bits:

Code: Text  [Select][+][-]
  1. Project tests raised exception class 'RunError(204)' with message:
  2. Invalid pointer operation
  3.  
  4.  At address 40983A

Ran the entire code you pasted as is, no modifications; it still crashes.

Code: Text  [Select][+][-]
  1. Project project1 raised exception class 'External: ACCESS VIOLATION' with message:
  2. Access violation reading from address $0000000FFFFFFFF5.
  3.  
  4.  In file 'astrings.inc' at line 154:
  5. If p^.ref<0 then exit;

ALLIGATOR

  • New Member
  • *
  • Posts: 25
Re: Contest: fastest IsAnagram function
« Reply #25 on: October 24, 2024, 01:03:03 pm »
My version:
Code: Pascal  [Select][+][-]
  1. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  2. type
  3.   TFreq = array [32..127] of Int32;
  4. const
  5.   FZero: TFreq = (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  6.                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  7.                   0,0,0,0,0,0,0,0,0,0,0,0);
  8. var
  9.   i: SizeInt;
  10.   F1: TFreq;
  11.   Ch: Byte;
  12. begin
  13.   Result := False;
  14.   FillChar(F1, SizeOf(F1), 0);
  15.  
  16.   i:=0;
  17.   while i<Length(S1) do
  18.   begin
  19.     inc(i);
  20.     Ch:=ord(S1[i]);
  21.     case Ch of
  22.       32..64, 91..122: Inc(F1[Ch]);
  23.       65..90: Inc(F1[Ch or $20]);
  24.       else
  25.         if ExceptionOnError then Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  26.     end;
  27.   end;
  28.  
  29.   i:=0;
  30.   while i<Length(S2) do
  31.   begin
  32.     inc(i);
  33.     Ch:=ord(S2[i]);
  34.     case Ch of
  35.       32..64, 91..122: Dec(F1[Ch]);
  36.       65..90: Dec(F1[Ch or $20]);
  37.       else
  38.         if ExceptionOnError then Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  39.     end;
  40.   end;
  41.  
  42.   if IgnoreSpaces then
  43.   begin
  44.     Result := CompareMem(@F1[Low(F1)+1], @FZero, SizeOf(TFreq)-SizeOf(TFreq[Low(TFreq)]));
  45.   end else
  46.   begin
  47.     Result := CompareMem(@F1, @FZero, SizeOf(TFreq));
  48.   end;
  49. end;
  50.  

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 10553
  • Debugger - SynEdit - and more
    • wiki
Re: Contest: fastest IsAnagram function
« Reply #26 on: October 24, 2024, 01:06:57 pm »
Martin do we have a debugger option for bit 6?

FpDebug should take:

byte(x) and $20 = $20

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #27 on: October 24, 2024, 01:24:23 pm »
@Zvoni: I cant run your code. Could someone check it? It crashes on both trunk and stable 3.2.2.
Huh?
Didin't notice.

Agreed, i got an invalid Pointer-Operation. Probably trying to free managed type or something
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Fibonacci

  • Hero Member
  • *****
  • Posts: 604
  • Internal Error Hunter
Re: Contest: fastest IsAnagram function
« Reply #28 on: October 24, 2024, 01:31:04 pm »
Current results, without @Zvoni as his code is producing an AV :D

EDIT: Iterations up to 25 mln

Code: Text  [Select][+][-]
  1. *** ROUND 1 ***
  2. s1 = St a    te
  3. s2 = tas t e
  4.  
  5.                 Bart IgnoreSpaces ExceptionOnError | 1547 ms | result 25000000
  6.            Fibonacci IgnoreSpaces ExceptionOnError |  765 ms | result 25000000
  7.               ASerge IgnoreSpaces ExceptionOnError |  954 ms | result 25000000
  8.            ALLIGATOR IgnoreSpaces ExceptionOnError |  765 ms | result 25000000
  9.                                  Bart IgnoreSpaces | 1485 ms | result 25000000
  10.                             Fibonacci IgnoreSpaces | 1156 ms | result 25000000
  11.                                ASerge IgnoreSpaces | 1140 ms | result 25000000
  12.                             ALLIGATOR IgnoreSpaces | 1469 ms | result 25000000
  13.                                               Bart | 1172 ms | result 0
  14.                                          Fibonacci |  641 ms | result 0
  15.                                             ASerge |  750 ms | result 0
  16.                                          ALLIGATOR |  562 ms | result 0
  17.  
  18. *** ROUND 2 ***
  19. s1 = night
  20. s2 = THING
  21.  
  22.                 Bart IgnoreSpaces ExceptionOnError | 1281 ms | result 25000000
  23.            Fibonacci IgnoreSpaces ExceptionOnError |  891 ms | result 25000000
  24.               ASerge IgnoreSpaces ExceptionOnError | 1047 ms | result 25000000
  25.            ALLIGATOR IgnoreSpaces ExceptionOnError | 1000 ms | result 25000000
  26.                                  Bart IgnoreSpaces | 1672 ms | result 25000000
  27.                             Fibonacci IgnoreSpaces |  547 ms | result 25000000
  28.                                ASerge IgnoreSpaces |  828 ms | result 25000000
  29.                             ALLIGATOR IgnoreSpaces |  625 ms | result 25000000
  30.                                               Bart | 1203 ms | result 25000000
  31.                                          Fibonacci |  547 ms | result 25000000
  32.                                             ASerge |  828 ms | result 25000000
  33.                                          ALLIGATOR | 1016 ms | result 25000000
  34.  
  35. *** ROUND 3: Invalid chars ***
  36. s1 = Invalid
  37. s2 = Diff length
  38.  
  39.                 Bart IgnoreSpaces ExceptionOnError | 2437 ms | result 0
  40.            Fibonacci IgnoreSpaces ExceptionOnError | 1547 ms | result 0
  41.               ASerge IgnoreSpaces ExceptionOnError | 1000 ms | result 0
  42.            ALLIGATOR IgnoreSpaces ExceptionOnError |  656 ms | result 0
  43.                                  Bart IgnoreSpaces | 1390 ms | result 0
  44.                             Fibonacci IgnoreSpaces | 1282 ms | result 0
  45.                                ASerge IgnoreSpaces | 1484 ms | result 0
  46.                             ALLIGATOR IgnoreSpaces | 1516 ms | result 0
  47.                                               Bart | 2172 ms | result 0
  48.                                          Fibonacci |  718 ms | result 0
  49.                                             ASerge |  844 ms | result 0
  50.                                          ALLIGATOR |  656 ms | result 0

Code: Pascal  [Select][+][-]
  1.   procedure TestAll;
  2.   begin
  3.     Test('Bart', @IsAnagram, true, true);
  4.     Test('Fibonacci', @IsAnagram_fibo, true, true);
  5.     Test('ASerge', @IsAnagramASerge, true, true);
  6.     //Test('Zvoni', @IsAnagram_Zvoni, true, true);
  7.     Test('ALLIGATOR', @IsAnagram_ALLIGATOR, true, true);
  8.  
  9.     Test('Bart', @IsAnagram, true, false);
  10.     Test('Fibonacci', @IsAnagram_fibo, true, false);
  11.     Test('ASerge', @IsAnagramASerge, true, false);
  12.     //Test('Zvoni', @IsAnagram_Zvoni, true, false);
  13.     Test('ALLIGATOR', @IsAnagram_ALLIGATOR, true, false);
  14.  
  15.     Test('Bart', @IsAnagram, false, false);
  16.     Test('Fibonacci', @IsAnagram_fibo, false, false);
  17.     Test('ASerge', @IsAnagramASerge, false, false);
  18.     //Test('Zvoni', @IsAnagram_Zvoni, false, false);
  19.     Test('ALLIGATOR', @IsAnagram_ALLIGATOR, false, false);
  20.   end;
« Last Edit: October 24, 2024, 01:37:38 pm by Fibonacci »

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #29 on: October 24, 2024, 01:36:03 pm »
Found it. Something went wrong with the 2 Move-Calls (No Idea)

Adjusted.

You could probably nest the two additinal Procedures/Functions
Code: Pascal  [Select][+][-]
  1. program project1;
  2. {$mode objfpc}{$H+}
  3. Uses Sysutils;
  4.  
  5. Var
  6.   os1,os2:String;
  7.   b:Boolean;
  8.  
  9. procedure QuickSort(var AI: array of Char; ALo, AHi: Integer);
  10. var
  11.   Pivot,T: Char;
  12.   Lo, Hi:Integer;
  13. begin
  14.   Lo := ALo;
  15.   Hi := AHi;
  16.   Pivot := AI[(Lo + Hi) div 2];
  17.   repeat
  18.     while AI[Lo] < Pivot do
  19.       Inc(Lo) ;
  20.     while AI[Hi] > Pivot do
  21.       Dec(Hi) ;
  22.     if Lo <= Hi then
  23.     begin
  24.       T := AI[Lo];
  25.       AI[Lo] := AI[Hi];
  26.       AI[Hi] := T;
  27.       Inc(Lo) ;
  28.       Dec(Hi) ;
  29.     end;
  30.   until Lo > Hi;
  31.   if Hi > ALo then
  32.     QuickSort(AI, ALo, Hi) ;
  33.   if Lo < AHi then
  34.     QuickSort(AI, Lo, AHi) ;
  35. end;
  36. Function StrSpn(Const str:PChar;Const Accept:PChar):Integer;
  37. Var
  38.   a:PChar;
  39.   table:Array[0..255] Of Byte;
  40.   p:PByte;
  41.   c0,c1,c2,c3:ByteBool;
  42.   s:PChar;
  43.   Count:Integer;
  44. Begin
  45.   If Accept[0]=#0 Then Exit(0);
  46.   If Accept[1]=#0 Then
  47.     Begin
  48.       a:=str;
  49.       While a^=accept^ Do Inc(a);
  50.       Exit(a-str);
  51.     end;
  52.   FillChar(table,64,0);
  53.   p:=@table[0];
  54.   FillChar(table[64],64,0);
  55.   FillChar(table[128],64,0);
  56.   FillChar(table[192],64,0);
  57.   s:=accept;
  58.   While s^<>#0 Do
  59.     Begin
  60.       p[Byte(s^)]:=1;
  61.       Inc(s);
  62.     end;
  63.   s:=str;
  64.   If Not ByteBool(p[Byte(s[0])]) Then Exit(0);
  65.   If Not ByteBool(p[Byte(s[1])]) Then Exit(1);
  66.   If Not ByteBool(p[Byte(s[2])]) Then Exit(2);
  67.   If Not ByteBool(p[Byte(s[3])]) Then Exit(3);
  68.   Repeat
  69.     Inc(s,4);
  70.     c0:=ByteBool(p[Byte(s[0])]);
  71.     c1:=ByteBool(p[Byte(s[1])]);
  72.     c2:=byteBool(p[Byte(s[2])]);
  73.     c3:=ByteBool(p[Byte(s[3])]);
  74.   until Not (c0 And C1 And C2 And C3);
  75.   Count:=s-str;
  76.   If Not (c0 And c1) Then
  77.     Result:=count+Byte(c0)
  78.   Else
  79.     Result:=Count+Byte(c2)+2;
  80. End;
  81.  
  82. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  83. Var
  84.   l1,l2:Integer;
  85.   ps1,ps2:Array Of AnsiChar;
  86.   i:Integer;
  87.   p1,p2:PChar;
  88.   by:Byte;
  89.   c:Array[32..127] Of Char;    //Legal Characters
  90. Begin
  91.   Result:=False;
  92.   For by:=32 To 127 Do c[by]:=Char(by);
  93.   If IgnoreSpaces Then
  94.     Begin
  95.       p1:=PChar(StringReplace(S1,' ','',[rfReplaceAll]));
  96.       p2:=PChar(StringReplace(S2,' ','',[rfReplaceAll]));
  97.     end
  98.   Else
  99.     Begin
  100.       p1:=PChar(S1);
  101.       p2:=PChar(S2);
  102.     end;
  103.   l1:=Length(strpas(p1));
  104.   l2:=Length(strpas(p2));
  105.   If l1<>l2 Then Exit;  //unequal Length.
  106.   //We only step into this code if l1=l2
  107.   i:=StrSpn(p1,PChar(@c[32]));
  108.   If i<>l1 Then Exit; //Illegal char
  109.   i:=StrSpn(p2,PChar(@c[32]));
  110.   If i<>l2 Then Exit; //Illegal char
  111.   SetLength(ps1,l1);
  112.   SetLength(ps2,l2);
  113.   For i:=0 To l1-1 Do
  114.     Begin
  115.       ps1[i]:=p1^;
  116.       Inc(p1);
  117.       ps2[i]:=p2^;
  118.       Inc(p2);
  119.     end;
  120.   QuickSort(ps1,Low(ps1),High(ps1));
  121.   QuickSort(ps2,Low(ps2),High(ps2));
  122.   Result:=CompareMem(@ps1[0],@ps2[0],Length(ps1));
  123. End;
  124.  
  125. begin
  126.   os1:='TOM MARVOLO RIDDLE';
  127.   os2:='I AM LORD VOLDEMORT';
  128.   b:=IsAnagram(os1,os2);
  129.   Writeln(b);
  130. end.
« Last Edit: October 24, 2024, 01:43:08 pm by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

 

TinyPortal © 2005-2018