Recent

Author Topic: Contest: fastest IsAnagram function  (Read 9113 times)

avk

  • Hero Member
  • *****
  • Posts: 769
Re: Contest: fastest IsAnagram function
« Reply #165 on: October 31, 2024, 11:20:34 am »
...
Code: [Select]
Testing speed
Bart           :   375
Bart2          :   344
Warfly         :   312
Fibonacci      : Failed validity test
ASerge         :   250
Zvoni          : Failed validity test
Zvoni2         : Failed validity test
Alligator      : Failed validity test
SilverCoder    : Failed validity test
AVK            :   109
Paweld         :   266
BrunoK         :   281
Delphius       :   234
Benibela       :   188
Josh           :   390
Nek            : Failed validity test
Martin         : Failed validity test
Dummy          : Failed validity test

Bart

Seems like at least the latest(#124) silvercoder70's version should pass the validity check.

Also funny test case(invalid anagram):
Code: Pascal  [Select][+][-]
  1.   S1 := S1.Create('a', 256);
  2.   S2 := S2.Create('b', 256);
  3.  


Bart

  • Hero Member
  • *****
  • Posts: 5465
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #166 on: October 31, 2024, 06:03:28 pm »
Seems like at least the latest(#124) silvercoder70's version should pass the validity check.

Indeed.

Code: [Select]
Testing speed
Bart           :   375
Bart2          :   313
Warfly         :   297
Fibonacci      : Failed validity test
ASerge         :   234
Zvoni          : Failed validity test
Zvoni2         : Failed validity test
Alligator      : Failed validity test
SilverCoder    :   265
AVK            :   125
Paweld         :   250
BrunoK         :   235
Delphius       :   234
Benibela       :   156
Josh           :   391
Nek            : Failed validity test
Martin         : Failed validity test
Dummy          : Failed validity test

Bart

ALLIGATOR

  • New Member
  • *
  • Posts: 24
Re: Contest: fastest IsAnagram function
« Reply #167 on: November 01, 2024, 03:56:14 am »
Just updating my code so it doesn't fail tests:

Code: Pascal  [Select][+][-]
  1. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  2. type
  3.   TFreq = array [32..127+1] of Int16;
  4. const
  5.   FZero: TFreq = (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  6.                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  7.                   0,0,0,0,0,0,0,0,0,0,0,0,0
  8.                   );
  9. var
  10.   i: SizeInt;
  11.   freq: TFreq;
  12.   ch: uint8;
  13. begin
  14.   Result := False;
  15.   FillQWord(freq, SizeOf(freq) div 8, 0);
  16.  
  17.   i:=0;
  18.   while i<Length(S1) do
  19.   begin
  20.     inc(i);
  21.     ch:=ord(S1[i]);
  22.  
  23.     case ch of
  24.       32..64, 91..127: Inc(freq[ch]);
  25.       65..90: Inc(freq[ch or $20]);
  26.       else
  27.         if ExceptionOnError then Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  28.     end;
  29.   end;
  30.  
  31.   i:=0;
  32.   while i<Length(S2) do
  33.   begin
  34.     inc(i);
  35.     ch:=ord(S2[i]);
  36.  
  37.     case ch of
  38.       32..64, 91..127: Dec(freq[ch]);
  39.       65..90: Dec(freq[ch or $20]);
  40.       else
  41.         if ExceptionOnError then Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  42.     end;
  43.   end;
  44.  
  45.   if IgnoreSpaces then
  46.   begin
  47.     Result := CompareDWord(freq[Low(freq)+1], FZero[Low(FZero)], SizeOf(TFreq) div 4)=0;
  48.   end else
  49.   begin
  50.     Result := CompareDWord(freq[Low(freq)], FZero[Low(FZero)], SizeOf(TFreq) div 4)=0;
  51.   end;
  52. end;
  53.  

Bart

  • Hero Member
  • *****
  • Posts: 5465
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #168 on: November 01, 2024, 06:12:52 pm »
Just updating my code so it doesn't fail tests:

Alas:
Code: [Select]
TestValidity for Alligator
FAIL: valid anagram rejected (with IgnoreSpaces=TRUE):
S1: "1234567890"
S2: "0 1 2 3 4 5 6 7 8 9 "
Validitycheck FAIL for Alligator

Bart

ALLIGATOR

  • New Member
  • *
  • Posts: 24
Re: Contest: fastest IsAnagram function
« Reply #169 on: November 01, 2024, 08:08:22 pm »
Hmmm... please take a look, I took the test from some of the first pages of the topic and I'm testing on it....
I also took the avk algorithm and it seems that on the test data you provided - the results are similar...

FPC trunk, Win64

Code: Pascal  [Select][+][-]
  1. program app;
  2.  
  3. // on WINDOWS use QueryPerformanceCounter
  4. {$ifdef WINDOWS}
  5. uses SysUtils, Windows, math;
  6.  
  7. function GetTickCount64: QWord;
  8. const
  9.   freq: int64 = 0;
  10. begin
  11.   if freq = 0 then QueryPerformanceFrequency(@freq);
  12.   QueryPerformanceCounter(@result);
  13.   result := trunc((result / freq) * 1000);
  14. end;
  15. {$else}
  16. uses SysUtils;
  17. {$endif}
  18.  
  19. function IsAnagramALLIGATOR(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  20. type
  21.   TFreq = array [32..127+1] of Int16;
  22. const
  23.   FZero: TFreq = (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  24.                   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  25.                   0,0,0,0,0,0,0,0,0,0,0,0,0
  26.                   );
  27. var
  28.   i: SizeInt;
  29.   freq: TFreq;
  30.   ch: uint8;
  31. begin
  32.   Result := False;
  33.   FillQWord(freq, SizeOf(freq) div 8, 0);
  34.  
  35.   i:=0;
  36.   while i<Length(S1) do
  37.   begin
  38.     inc(i);
  39.     ch:=ord(S1[i]);
  40.  
  41.     case ch of
  42.       32..64, 91..127: Inc(freq[ch]);
  43.       65..90: Inc(freq[ch or $20]);
  44.       else
  45.         if ExceptionOnError then Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  46.     end;
  47.   end;
  48.  
  49.   i:=0;
  50.   while i<Length(S2) do
  51.   begin
  52.     inc(i);
  53.     ch:=ord(S2[i]);
  54.  
  55.     case ch of
  56.       32..64, 91..127: Dec(freq[ch]);
  57.       65..90: Dec(freq[ch or $20]);
  58.       else
  59.         if ExceptionOnError then Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  60.     end;
  61.   end;
  62.  
  63.   if IgnoreSpaces then
  64.   begin
  65.     Result := CompareDWord(freq[Low(freq)+1], FZero[Low(FZero)], SizeOf(TFreq) div 4)=0;
  66.   end else
  67.   begin
  68.     Result := CompareDWord(freq[Low(freq)], FZero[Low(FZero)], SizeOf(TFreq) div 4)=0;
  69.   end;
  70. end;
  71.  
  72. function IsAnagram_avk(const s1, s2: string; aIgnoreSpaces: Boolean = True; aExceptionOnError: Boolean = False): Boolean;
  73. type
  74. {$IFDEF CPU64}
  75.   TChunk = QWord;
  76. {$ELSE}
  77.   TChunk = DWord;
  78. {$ENDIF}
  79.   PChunk = ^TChunk;
  80. const
  81. {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  82.   {$IFDEF CPU64}
  83.   MASK = 7;
  84.   {$ELSE}
  85.   MASK = 3;
  86.   {$ENDIF}
  87. {$ENDIF}
  88. {$IFDEF CPU64}
  89.   BITS5 = QWord($2020202020202020);
  90.   BITS6 = QWord($4040404040404040);
  91.   BITS7 = QWord($8080808080808080);
  92. {$ELSE}
  93.   BITS5 = DWord($20202020);
  94.   BITS6 = DWord($40404040);
  95.   BITS7 = DWord($80808080);
  96. {$ENDIF}
  97.   ERR_FMT = 'Illegal character in %s, position %d(#%d)';
  98.  
  99.   function Invalid(aOfs: Integer; aFirstArg: Boolean): Boolean;
  100.   begin
  101.     Invalid := False;
  102.     if aExceptionOnError then
  103.       if aFirstArg then
  104.         raise ERangeError.CreateFmt(ERR_FMT, ['s1', aOfs, Ord(s1[aOfs])])
  105.       else
  106.         raise ERangeError.CreateFmt(ERR_FMT, ['s2', aOfs, Ord(s2[aOfs])])
  107.   end;
  108.  
  109.   function InvalidUp(aFlags: TChunk; aOfs: Integer; aFirstArg: Boolean): Boolean;
  110.   begin
  111.     {$IFDEF ENDIAN_BIG}aFlags := SwapEndian(aFlags);{$ENDIF}
  112.     Inc(aOfs,{$IFDEF CPU64}BsfQWord{$ELSE}BsfDWord{$ENDIF}(aFlags) div 8);
  113.     Result := Invalid(aOfs, aFirstArg);
  114.   end;
  115.  
  116.   function InvalidLo(aFlags: TChunk; aOfs: Integer; aFirstArg: Boolean): Boolean;
  117.   begin
  118.     {$IFDEF ENDIAN_BIG}aFlags := SwapEndian(aFlags);{$ENDIF}
  119.     Inc(aOfs,{$IFDEF CPU64}BsfQWord{$ELSE}BsfDWord{$ENDIF}(aFlags xor BITS5) div 8);
  120.     Result := Invalid(aOfs, aFirstArg);
  121.   end;
  122.  
  123. type
  124. {$IFDEF CPU64}
  125.   TShortBuffer = array[0..8] of QWord;
  126. {$ELSE}
  127.   TShortBuffer = array[0..17] of DWord;
  128. {$ENDIF}
  129. const
  130. {$PUSH}{$J-}
  131. {$IFDEF CPU64}
  132.   ZERO_BUF: TShortBuffer = (0,0,0,0,0,0,0,0,0);
  133. {$ELSE}
  134.   ZERO_BUF: TShortBuffer = (0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);
  135. {$ENDIF}
  136.   CI_MAP: array[32..127] of Byte = (
  137.     32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
  138.     52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
  139.     72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91,
  140.     92, 93, 94, 95, 96, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
  141.     80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 97, 98, 99, 100, 101
  142.   );
  143. {$POP}
  144.   function TestShort(const s1, s2: string): Boolean;
  145.   var
  146.     Buf: TShortBuffer;
  147.     Counter: array[32..101] of ShortInt absolute Buf;
  148.     p, pEnd: PByte;
  149.   begin
  150.     Buf := ZERO_BUF;
  151.  
  152.     p := PByte(s1);
  153.     PEnd := p + Length(s1);
  154.   {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  155.     while (p < pEnd) and (SizeUInt(p) and MASK <> 0) do begin
  156.       if p^ in [32..127] then Inc(Counter[CI_MAP[p^]])
  157.       else exit(Invalid(Succ(p - PByte(s1)), True));
  158.       Inc(p);
  159.     end;
  160.   {$ENDIF}
  161.     while p < pEnd - SizeOf(TChunk) do begin
  162.       if PChunk(p)^ and BITS7 <> 0 then
  163.         exit(InvalidUp(PChunk(p)^ and BITS7, p-PByte(s1)+1, True));
  164.       if (PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1) < BITS5 then
  165.         exit(InvalidLo((PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1), p-PByte(s1)+1, True));
  166.       Inc(Counter[CI_MAP[p[0]]]); Inc(Counter[CI_MAP[p[1]]]);
  167.       Inc(Counter[CI_MAP[p[2]]]); Inc(Counter[CI_MAP[p[3]]]);
  168.     {$IFDEF CPU64}
  169.       Inc(Counter[CI_MAP[p[4]]]); Inc(Counter[CI_MAP[p[5]]]);
  170.       Inc(Counter[CI_MAP[p[6]]]); Inc(Counter[CI_MAP[p[7]]]);
  171.     {$ENDIF}
  172.       Inc(p, SizeOf(TChunk));
  173.     end;
  174.     while p < pEnd do begin
  175.       if p^ in [32..127] then Inc(Counter[CI_MAP[p^]])
  176.       else exit(Invalid(Succ(p - PByte(s1)), True));
  177.       Inc(p);
  178.     end;
  179.  
  180.     p := PByte(s2);
  181.     PEnd := p + Length(s2);
  182.   {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  183.     while (p < pEnd) and (SizeUInt(p) and MASK <> 0) do begin
  184.       if p^ in [32..127] then Dec(Counter[CI_MAP[p^]])
  185.       else exit(Invalid(Succ(p - PByte(s2)), False));
  186.       Inc(p);
  187.     end;
  188.   {$ENDIF}
  189.     while p < pEnd - SizeOf(TChunk) do begin
  190.       if PChunk(p)^ and BITS7 <> 0 then
  191.         exit(InvalidUp(PChunk(p)^ and BITS7, p-PByte(s2)+1, False));
  192.       if (PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1) < BITS5 then
  193.         exit(InvalidLo((PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1), p-PByte(s2)+1, False));
  194.       Dec(Counter[CI_MAP[p[0]]]); Dec(Counter[CI_MAP[p[1]]]);
  195.       Dec(Counter[CI_MAP[p[2]]]); Dec(Counter[CI_MAP[p[3]]]);
  196.     {$IFDEF CPU64}
  197.       Dec(Counter[CI_MAP[p[4]]]); Dec(Counter[CI_MAP[p[5]]]);
  198.       Dec(Counter[CI_MAP[p[6]]]); Dec(Counter[CI_MAP[p[7]]]);
  199.     {$ENDIF}
  200.       Inc(p, SizeOf(TChunk));
  201.     end;
  202.     while p < pEnd do begin
  203.       if p^ in [32..127] then Dec(Counter[CI_MAP[p^]])
  204.       else exit(Invalid(Succ(p - PByte(s2)), False));
  205.       Inc(p);
  206.     end;
  207.  
  208.     if aIgnoreSpaces then Counter[32] := 0;
  209.   {$IFDEF CPU64}
  210.     if Buf[0] or Buf[1] or Buf[2] or Buf[3] or Buf[4] or
  211.        Buf[5] or Buf[6] or Buf[7] or Buf[8] <> 0 then exit(False);
  212.   {$ELSE}
  213.     if Buf[0] or Buf[1] or Buf[2] or Buf[3] or Buf[4] or Buf[5] or Buf[6] or Buf[7] or
  214.        Buf[8] or Buf[9] or Buf[10] or Buf[11] or Buf[12] or Buf[13] or Buf[14] or
  215.        Buf[15] or Buf[16] or Buf[17] <> 0 then exit(False);
  216.   {$ENDIF}
  217.     Result := True;
  218.   end;
  219.  
  220. const
  221.   SHORT  = 126;
  222. var
  223.   Counter: array[32..127] of Integer;
  224.   p, pEnd: PByte;
  225.   I: Integer;
  226. begin
  227.   if (s1 = '') or (s2 = '') then
  228.     exit(False);
  229.   if not aIgnoreSpaces and not aExceptionOnError and (Length(s1) <> Length(s2)) then
  230.     exit(False);
  231.   if Math.Max(Length(s1), Length(s2)) <= SHORT then
  232.     exit(TestShort(s1, s2));
  233.  
  234.   FillChar(Counter, SizeOf(Counter), 0);
  235.  
  236.   p := PByte(s1);
  237.   PEnd := p + Length(s1);
  238. {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  239.   while (p < pEnd) and (SizeUInt(p) and MASK <> 0) do begin
  240.     if p^ in [32..127] then Inc(Counter[p^])
  241.     else exit(Invalid(Succ(p - PByte(s1)), True));
  242.     Inc(p);
  243.   end;
  244. {$ENDIF}
  245.   while p < pEnd - SizeOf(TChunk) do begin
  246.     if PChunk(p)^ and BITS7 <> 0 then
  247.       exit(InvalidUp(PChunk(p)^ and BITS7, p-PByte(s1)+1, True));
  248.     if (PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1) < BITS5 then
  249.       exit(InvalidLo((PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1), p-PByte(s1)+1, True));
  250.     Inc(Counter[p[0]]); Inc(Counter[p[1]]);
  251.     Inc(Counter[p[2]]); Inc(Counter[p[3]]);
  252.   {$IFDEF CPU64}
  253.     Inc(Counter[p[4]]); Inc(Counter[p[5]]);
  254.     Inc(Counter[p[6]]); Inc(Counter[p[7]]);
  255.   {$ENDIF}
  256.     Inc(p, SizeOf(TChunk));
  257.   end;
  258.   while p < pEnd do begin
  259.     if p^ in [32..127] then Inc(Counter[p^])
  260.     else exit(Invalid(Succ(p - PByte(s1)), True));
  261.     Inc(p);
  262.   end;
  263.   for I := 97 to 122 do
  264.     if Counter[I] <> 0 then begin
  265.       Inc(Counter[I-32], Counter[I]);
  266.       Counter[I] := 0;
  267.     end;
  268.  
  269.   p := PByte(s2);
  270.   PEnd := p + Length(s2);
  271. {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  272.   while (p < pEnd) and (SizeUInt(p) and MASK <> 0) do begin
  273.     if p^ in [32..127] then Dec(Counter[p^])
  274.     else exit(Invalid(Succ(p - PByte(s2)), False));
  275.     Inc(p);
  276.   end;
  277. {$ENDIF}
  278.   while p < pEnd - SizeOf(TChunk) do begin
  279.     if PChunk(p)^ and BITS7 <> 0 then
  280.       exit(InvalidUp(PChunk(p)^ and BITS7, p-PByte(s2)+1, False));
  281.     if (PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1) < BITS5 then
  282.       exit(InvalidLo((PChunk(p)^ and BITS5) or ((PChunk(p)^ and BITS6) shr 1), p-PByte(s2)+1, False));
  283.     Dec(Counter[p[0]]); Dec(Counter[p[1]]);
  284.     Dec(Counter[p[2]]); Dec(Counter[p[3]]);
  285.   {$IFDEF CPU64}
  286.     Dec(Counter[p[4]]); Dec(Counter[p[5]]);
  287.     Dec(Counter[p[6]]); Dec(Counter[p[7]]);
  288.   {$ENDIF}
  289.     Inc(p, SizeOf(TChunk));
  290.   end;
  291.   while p < pEnd do begin
  292.     if p^ in [32..127] then Dec(Counter[p^])
  293.     else exit(Invalid(Succ(p - PByte(s2)), False));
  294.     Inc(p);
  295.   end;
  296.   for I := 97 to 122 do
  297.     if Counter[I] <> 0 then
  298.       Inc(Counter[I-32], Counter[I]);
  299.  
  300.   if aIgnoreSpaces then Counter[32] := 0;
  301.   I := 32;
  302.   while I < 95 do begin
  303.     if Counter[I] or Counter[I+1] or Counter[I+2] or Counter[I+3] <> 0 then
  304.       exit(False);
  305.     Inc(I, 4);
  306.   end;
  307.   if Counter[96] or Counter[123] or Counter[124] or Counter[125] or Counter[126] or Counter[127] <> 0 then
  308.     exit(False);
  309.   Result := True;
  310. end;
  311.  
  312. procedure main;
  313. const
  314.   ITERATIONS = 1000*1000*5;
  315. var
  316.   i, c: integer;
  317.   s, d: string;
  318.   u: ptruint;
  319. begin
  320. //{$IFDEF kaka}
  321.  
  322.   s:='1234567890';
  323.   d:='0 1 2 3 4 5 6 7 8 9 ';
  324.  
  325.   //s := 'St a    te';
  326.   //d := 'tas t e';
  327.   writeln('*** ROUND 1 ***');
  328.   writeln('s1 = ', s);
  329.   writeln('s2 = ', d);
  330.   writeln;
  331.  
  332.   // test with invalid characters
  333.   //s += #1; d += #2;
  334.  
  335.   // -------------------------------------
  336.   write('ALLIGATOR: IgnoreSpaces + ExceptionOnError':50);
  337.   c := 0;
  338.   u := GetTickCount64;
  339.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, true, true) then c += 1;
  340.   write(' | ', (GetTickCount64-u):4, ' ms');
  341.   write(' | result ', c);
  342.   writeln;
  343.   // -------------------------------------
  344.   write('avk: IgnoreSpaces + ExceptionOnError':50);
  345.   c := 0;
  346.   u := GetTickCount64;
  347.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, true, true) then c += 1;
  348.   write(' | ', (GetTickCount64-u):4, ' ms');
  349.   write(' | result ', c);
  350.   writeln;
  351.   // -------------------------------------
  352.   write('ALLIGATOR: IgnoreSpaces':50);
  353.   c := 0;
  354.   u := GetTickCount64;
  355.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, true, false) then c += 1;
  356.   write(' | ', (GetTickCount64-u):4, ' ms');
  357.   write(' | result ', c);
  358.   writeln;
  359.   // -------------------------------------
  360.   write('avk: IgnoreSpaces':50);
  361.   c := 0;
  362.   u := GetTickCount64;
  363.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, true, false) then c += 1;
  364.   write(' | ', (GetTickCount64-u):4, ' ms');
  365.   write(' | result ', c);
  366.   writeln;
  367.   // -------------------------------------
  368.   write('ALLIGATOR':50);
  369.   c := 0;
  370.   u := GetTickCount64;
  371.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, false, false) then c += 1;
  372.   write(' | ', (GetTickCount64-u):4, ' ms');
  373.   write(' | result ', c);
  374.   writeln;
  375.   // -------------------------------------
  376.   write('avk':50);
  377.   c := 0;
  378.   u := GetTickCount64;
  379.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, false, false) then c += 1;
  380.   write(' | ', (GetTickCount64-u):4, ' ms');
  381.   write(' | result ', c);
  382.   writeln;
  383.   // -------------------------------------
  384.  
  385.   writeln;
  386.  
  387.   //s := 'night';
  388.   //d := 'THING';
  389.   writeln('*** ROUND 2 ***');
  390.   writeln('s1 = ', s);
  391.   writeln('s2 = ', d);
  392.   writeln;
  393.  
  394.   // -------------------------------------
  395.   write('ALLIGATOR: IgnoreSpaces + ExceptionOnError':50);
  396.   c := 0;
  397.   u := GetTickCount64;
  398.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, true, true) then c += 1;
  399.   write(' | ', (GetTickCount64-u):4, ' ms');
  400.   write(' | result ', c);
  401.   writeln;
  402.   // -------------------------------------
  403.   write('avk: IgnoreSpaces + ExceptionOnError':50);
  404.   c := 0;
  405.   u := GetTickCount64;
  406.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, true, true) then c += 1;
  407.   write(' | ', (GetTickCount64-u):4, ' ms');
  408.   write(' | result ', c);
  409.   writeln;
  410.   // -------------------------------------
  411.   write('ALLIGATOR: IgnoreSpaces':50);
  412.   c := 0;
  413.   u := GetTickCount64;
  414.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, true, false) then c += 1;
  415.   write(' | ', (GetTickCount64-u):4, ' ms');
  416.   write(' | result ', c);
  417.   writeln;
  418.   // -------------------------------------
  419.   write('avk: IgnoreSpaces':50);
  420.   c := 0;
  421.   u := GetTickCount64;
  422.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, true, false) then c += 1;
  423.   write(' | ', (GetTickCount64-u):4, ' ms');
  424.   write(' | result ', c);
  425.   writeln;
  426.   // -------------------------------------
  427.   write('ALLIGATOR':50);
  428.   c := 0;
  429.   u := GetTickCount64;
  430.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, false, false) then c += 1;
  431.   write(' | ', (GetTickCount64-u):4, ' ms');
  432.   write(' | result ', c);
  433.   writeln;
  434.   // -------------------------------------
  435.   write('avk':50);
  436.   c := 0;
  437.   u := GetTickCount64;
  438.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, false, false) then c += 1;
  439.   write(' | ', (GetTickCount64-u):4, ' ms');
  440.   write(' | result ', c);
  441.   writeln;
  442.   // -------------------------------------
  443.  
  444.   writeln;
  445. //{$ENDIF}
  446.  
  447.   //s := 'Invalid';
  448.   //d := 'Diff length';
  449.   writeln('*** ROUND 3: Invalid chars ***');
  450.   writeln('s1 = ', s);
  451.   writeln('s2 = ', d);
  452.   writeln;
  453.  
  454.   // -------------------------------------
  455.   write('ALLIGATOR: IgnoreSpaces + ExceptionOnError':50);
  456.   c := 0;
  457.   u := GetTickCount64;
  458.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, true, true) then c += 1;
  459.   write(' | ', (GetTickCount64-u):4, ' ms');
  460.   write(' | result ', c);
  461.   writeln;
  462.   // -------------------------------------
  463.   write('avk: IgnoreSpaces + ExceptionOnError':50);
  464.   c := 0;
  465.   u := GetTickCount64;
  466.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, true, true) then c += 1;
  467.   write(' | ', (GetTickCount64-u):4, ' ms');
  468.   write(' | result ', c);
  469.   writeln;
  470.   // -------------------------------------
  471.   write('ALLIGATOR: IgnoreSpaces':50);
  472.   c := 0;
  473.   u := GetTickCount64;
  474.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, true, false) then c += 1;
  475.   write(' | ', (GetTickCount64-u):4, ' ms');
  476.   write(' | result ', c);
  477.   writeln;
  478.   // -------------------------------------
  479.   write('avk: IgnoreSpaces':50);
  480.   c := 0;
  481.   u := GetTickCount64;
  482.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, true, false) then c += 1;
  483.   write(' | ', (GetTickCount64-u):4, ' ms');
  484.   write(' | result ', c);
  485.   writeln;
  486.   // -------------------------------------
  487.   write('ALLIGATOR':50);
  488.   c := 0;
  489.   u := GetTickCount64;
  490.   for i := 1 to ITERATIONS do if IsAnagramALLIGATOR(s, d, false, false) then c += 1;
  491.   write(' | ', (GetTickCount64-u):4, ' ms');
  492.   write(' | result ', c);
  493.   writeln;
  494.   // -------------------------------------
  495.   write('avk':50);
  496.   c := 0;
  497.   u := GetTickCount64;
  498.   for i := 1 to ITERATIONS do if IsAnagram_avk(s, d, false, false) then c += 1;
  499.   write(' | ', (GetTickCount64-u):4, ' ms');
  500.   write(' | result ', c);
  501.   writeln;
  502.   // -------------------------------------
  503.  
  504.   readln;
  505. end;
  506.  
  507. begin
  508.   main;
  509. end.
  510.  

avk

  • Hero Member
  • *****
  • Posts: 769
Re: Contest: fastest IsAnagram function
« Reply #170 on: November 01, 2024, 08:56:09 pm »
Somewhere closer to the beginning of the thread there are examples of test anagrams and code used for validity check.

But even without them you can see:
Line 33: you zero out 96 elements of the freq array and are left with garbage(if any) in 97.
Line 45,59: if ExceptionOnError is True, an exception is raised. If not, execution just continues, ignoring the illegal character.
« Last Edit: November 01, 2024, 08:59:17 pm by avk »

BrunoK

  • Hero Member
  • *****
  • Posts: 623
  • Retired programmer
Re: Contest: fastest IsAnagram function
« Reply #171 on: November 01, 2024, 09:11:50 pm »
Hmmm... please take a look, I took the test from some of the first pages of the topic and I'm testing on it....
I also took the avk algorithm and it seems that on the test data you provided - the results are similar...

In my tests,
Code: Text  [Select][+][-]
  1. *** ROUND 5: Long invalid chars ***
  2. s1 = !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~âO'
  3. s2 = âO'~}|{zyxw vutsrqpon mlkjihgfe dcba`_^]\ [ZYXWVUTS RQPONMLKJ IHGFEDCBA @?>=<;:98 76543210/ .-,+*)('& %$#"!
  4.                                  Bart IgnoreSpaces |  12.054 ms | result 0
  5.                                ASerge IgnoreSpaces |   7.857 ms | result 0
  6.                                BrunoK IgnoreSpaces |   5.744 ms | result 0
  7.                               BrunoKb IgnoreSpaces |   5.031 ms | result 0
  8.                               BrunoK2 IgnoreSpaces |   9.424 ms | result 0
  9.                              Delphius IgnoreSpaces |   7.968 ms | result 0
  10.                                PawelD IgnoreSpaces |   6.883 ms | result 0
  11.                            Fibonacci2 IgnoreSpaces |   5.164 ms | result 0
  12.                            Fibonacci4 IgnoreSpaces |  11.451 ms | result 0
  13.                            Benibela_K IgnoreSpaces |   4.692 ms | result 0
  14.                              Benibela IgnoreSpaces |   6.911 ms | result 0
  15.                             ALLIGATOR IgnoreSpaces |  12.768 ms | result 50000
  16.                        Silvercoder70B IgnoreSpaces |   8.732 ms | result 0
  17.                                   Avk IgnoreSpaces |   4.781 ms | result 0
  18.                                  Avk2 IgnoreSpaces |   5.070 ms | result 0
  19.                                  Josh IgnoreSpaces |  40.224 ms | result 50000

Bart

  • Hero Member
  • *****
  • Posts: 5465
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #172 on: November 01, 2024, 11:30:11 pm »
Hmmm... please take a look, I took the test from some of the first pages of the topic and I'm testing on it....

Just ran a single test on your code:
Code: Pascal  [Select][+][-]
  1. begin
  2.   //
  3.   InitValidAnagrams;
  4.   InitInvalidAnagrams;
  5.   InitFuncs;
  6.   InitUserCode;
  7.  
  8.   writeln('IsAnagram_Alligator(''1234567890'',''0 1 2 3 4 5 6 7 8 9 '',True, False) -> ',IsAnagram_Alligator('1234567890','0 1 2 3 4 5 6 7 8 9 ',True, False)); exit;
  9. ...
Output:
Code: [Select]
C:\Users\Bart\LazarusProjecten\bugs\forum\anagram>anagram
IsAnagram_Alligator('1234567890','0 1 2 3 4 5 6 7 8 9 ',True, False) -> FALSE

Bart

alpine

  • Hero Member
  • *****
  • Posts: 1289
Re: Contest: fastest IsAnagram function
« Reply #173 on: November 02, 2024, 04:19:31 am »
Here is mine:
Code: Pascal  [Select][+][-]
  1. function IsAnagram_alpine(const S1, S2: String; IgnoreSpaces: Boolean = True;
  2.   ExceptionOnError: Boolean = False): Boolean;
  3. const
  4.   M: array[0..127] of SmallInt = (
  5.     0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  6.     0, 0, 0, 0, 0, 0, 0, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
  7.     46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64,
  8.     65{A}, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82,
  9.     83, 84, 85, 86, 87, 88, 89, 90{Z}, 91, 92, 93, 94, 95, 96, 65{a}, 66, 67,
  10.     68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
  11.     87, 88, 89, 90{z}, 97, 98, 99, 100, 101);
  12. var
  13.   I, L1, L2: SizeInt;
  14.   F: array[0..104] of SmallInt;
  15.   DW: array[0..36] of DWord absolute F[32];
  16.   ZW: array[0..15] of DWord absolute F;
  17. begin
  18.   ZW[ 0] := 0;
  19.   DW[ 0] := 0; DW[ 1] := 0; DW[ 2] := 0; DW[ 3] := 0; DW[ 4] := 0; DW[ 5] := 0;
  20.   DW[ 6] := 0; DW[ 7] := 0; DW[ 8] := 0; DW[ 9] := 0; DW[10] := 0; DW[11] := 0;
  21.   DW[12] := 0; DW[13] := 0; DW[14] := 0; DW[15] := 0; DW[16] := 0; DW[17] := 0;
  22.   DW[18] := 0; DW[19] := 0; DW[20] := 0; DW[21] := 0; DW[22] := 0; DW[23] := 0;
  23.   DW[24] := 0; DW[25] := 0; DW[26] := 0; DW[27] := 0; DW[28] := 0; DW[29] := 0;
  24.   DW[30] := 0; DW[31] := 0; DW[32] := 0; DW[33] := 0; DW[34] := 0; DW[35] := 0;
  25.   DW[36] := 0;
  26.  
  27.   I := 1;
  28.   L1 := Length(S1);
  29.   L2 := Length(S2);
  30.  
  31.   if L1 < L2 then
  32.   begin
  33.     while I <= L1 do
  34.     begin
  35.       Inc(F[M[Ord(S1[I])]]);
  36.       Dec(F[M[Ord(S2[I])]]);
  37.       Inc(I);
  38.     end;
  39.     while I <= L2 do
  40.     begin
  41.       Dec(F[M[Ord(S2[I])]]);
  42.       Inc(I);
  43.     end;
  44.   end
  45.   else
  46.   begin
  47.     while I <= L2 do
  48.     begin
  49.       Inc(F[M[Ord(S2[I])]]);
  50.       Dec(F[M[Ord(S1[I])]]);
  51.       Inc(I);
  52.     end;
  53.     while I <= L1 do
  54.     begin
  55.       Dec(F[M[Ord(S1[I])]]);
  56.       Inc(I);
  57.     end;
  58.   end;
  59.  
  60.   if ExceptionOnError and (F[0] <> 0) then
  61.     raise ERangeError.Create('Not an anagram.');
  62.  
  63.   if IgnoreSpaces then
  64.     F[32] := 0;
  65.  
  66.   if (DW[ 0] <> 0) or (DW[ 1] <> 0) or (DW[ 2] <> 0) or (DW[ 3] <> 0) or
  67.      (DW[ 4] <> 0) or (DW[ 5] <> 0) or (DW[ 6] <> 0) or (DW[ 7] <> 0) or
  68.      (DW[ 8] <> 0) or (DW[ 9] <> 0) or (DW[10] <> 0) or (DW[11] <> 0) or
  69.      (DW[12] <> 0) or (DW[13] <> 0) or (DW[14] <> 0) or (DW[15] <> 0) or
  70.      (DW[16] <> 0) or (DW[17] <> 0) or (DW[18] <> 0) or (DW[19] <> 0) or
  71.      (DW[20] <> 0) or (DW[21] <> 0) or (DW[22] <> 0) or (DW[23] <> 0) or
  72.      (DW[24] <> 0) or (DW[25] <> 0) or (DW[26] <> 0) or (DW[27] <> 0) or
  73.      (DW[28] <> 0) or (DW[29] <> 0) or (DW[30] <> 0) or (DW[31] <> 0) or
  74.      (DW[32] <> 0) or (DW[33] <> 0) or (DW[34] <> 0) or (DW[35] <> 0) or
  75.      (DW[36] <> 0)
  76.   then
  77.     Exit(False);
  78.  
  79.   Result := True;
  80. end;
  81.  

With a slight borrowing of BeniBela's up-casing trick. :-[
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

Josh

  • Hero Member
  • *****
  • Posts: 1344
Re: Contest: fastest IsAnagram function
« Reply #174 on: November 02, 2024, 04:38:39 am »
Hi Bart

Version 2 of my routine, should be faster, removed all the exits, so it flows better, maybe not the slowest

Code: Pascal  [Select][+][-]
  1. function IsAnAnagramJosh(const s1, s2: AnsiString; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): boolean;  inline;
  2. var
  3.   //DEclaring initial values speeds up
  4.   k:int32=0;
  5.   Len_S1:int32=0;
  6.   Len_S2:Int32=0;
  7.   v:byte=0;
  8.  
  9. begin
  10.   {$define UseInc}
  11.  // {$define UseOrd}
  12.   Len_S1:=Length(s1);
  13.   Len_S2:=Length(s2);
  14.   Result:=Not ((Not IgnoreSpaces and ((Len_S1<>Len_S2))) or (Len_S1 = 0) or (Len_S2 = 0));
  15.   If Result Then
  16.   begin
  17.     for k:=1 to Len_S1 do
  18.     begin
  19.       v:={$ifdef useinc}Ord{$Else}Byte{$Endif}(S1[k]);
  20.       if v in [josh_Arr_Start..josh_Arr_End] then {$ifdef useinc}inc(josh_FreqArray[josh_Map[v]]){$else}josh_FreqArray[josh_Map[v]]:=josh_FreqArray[josh_Map[v]]+1{$endif}
  21.       else
  22.       begin
  23.         Result:=False;
  24.         If ExceptionOnError then raise Exception.CreateFmt(josh_mess_s1,[k]);
  25.         break;
  26.       end;
  27.     end;
  28.     If Result Then
  29.     Begin
  30.       for k:=1 to Len_S2 do
  31.       begin
  32.         v:={$ifdef useinc}Ord{$Else}Byte{$Endif}(S2[k]);
  33.         if v in [josh_Arr_Start..josh_Arr_End] then {$ifdef useinc}dec(josh_FreqArray[josh_Map[v]]){$else}josh_FreqArray[josh_Map[v]]:=josh_FreqArray[josh_Map[v]]-1{$endif}
  34.         else
  35.         begin
  36.           Result:=False;
  37.           If ExceptionOnError then raise Exception.CreateFmt(josh_mess_s2,[k]);
  38.           break;
  39.         end;
  40.       end;
  41.       If Result then
  42.       begin
  43.         if IgnoreSpaces then josh_FreqArray[josh_Arr_Start]:=0;
  44.         for k:=josh_Arr_Start to $60 do     // space to Z
  45.         begin
  46.           if josh_FreqArray[k]<>0 then
  47.           begin
  48.             Result:=False;
  49.             josh_FreqArray[k]:=0;
  50.           end;
  51.         end;
  52.         if result then
  53.         begin
  54.           for k:=$7b to josh_Arr_End do      //{-⌂
  55.           begin
  56.             if josh_FreqArray[k]<>0 then
  57.             begin
  58.               Result:=False;
  59.               josh_FreqArray[k]:=0;
  60.             end;
  61.           end;
  62.         end;
  63.       end;
  64.     end;
  65.   end;
  66. end;        
  67.  
The best way to get accurate information on the forum is to post something wrong and wait for corrections.

BeniBela

  • Hero Member
  • *****
  • Posts: 920
    • homepage
Re: Contest: fastest IsAnagram function
« Reply #175 on: November 02, 2024, 03:26:29 pm »

Indeed.

Code: [Select]
Testing speed
Bart           :   375
Bart2          :   313
Warfly         :   297
Fibonacci      : Failed validity test
ASerge         :   234
Zvoni          : Failed validity test
Zvoni2         : Failed validity test
Alligator      : Failed validity test
SilverCoder    :   265
AVK            :   125
Paweld         :   250
BrunoK         :   235
Delphius       :   234
Benibela       :   156
Josh           :   391
Nek            : Failed validity test
Martin         : Failed validity test
Dummy          : Failed validity test

Bart


you have posted this three times, and each time all entries become faster?


Code: Text  [Select][+][-]
  1.                            Benibela_K IgnoreSpaces |   4.692 ms | result 0
  2.                              Benibela IgnoreSpaces |   6.911 ms | result 0

what is a Benibela_K?

BrunoK

  • Hero Member
  • *****
  • Posts: 623
  • Retired programmer
Re: Contest: fastest IsAnagram function
« Reply #176 on: November 02, 2024, 05:13:19 pm »


what is a Benibela_K?
Infantile modification with
Code: Pascal  [Select][+][-]
  1. if (length(s1) < 255) and (length(s2) < 255) then begin
  2.     countLittle(F1l, S1);
  3.     if not isinvalid then
  4.       countLittle(F2l, S2);

Bart

  • Hero Member
  • *****
  • Posts: 5465
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #177 on: November 02, 2024, 05:46:08 pm »
you have posted this three times, and each time all entries become faster?

As explained before speed runs can vary up to 15% (maybe even more), most likely depending on the stat of my computer when I run the test.
Some entries which are close to eachother mat swap places (when ranked).

Bart

Bart

  • Hero Member
  • *****
  • Posts: 5465
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #178 on: November 02, 2024, 05:49:54 pm »
Tested with new code from Alpine and updated code from Josh.

Code: [Select]
Testing speed
Bart           :   359
Bart2          :   297
Warfly         :   297
Fibonacci      : Failed validity test
ASerge         :   235
Zvoni          : Failed validity test
Zvoni2         : Failed validity test
Alligator      : Failed validity test
SilverCoder    :   297
AVK            :    94
Paweld         :   250
BrunoK         :   250
Delphius       :   234
Benibela       :   141
Josh           :   281
Nek            : Failed validity test
Alpine         : Failed validity test
Martin         : Failed validity test
Dummy          : Failed validity test

@alpine: sorry, but your code fails.
Code: [Select]
TestValidity for Alpine
FAIL to detect invalid character #0
Validitycheck FAIL for Alpine

Bart

alpine

  • Hero Member
  • *****
  • Posts: 1289
Re: Contest: fastest IsAnagram function
« Reply #179 on: November 02, 2024, 06:11:36 pm »
@Bart
I will fix it, of course, but since I've joined a bit late, can't I have a test bench source in some form. It is hard to orient in such a big topic... thanks!
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

 

TinyPortal © 2005-2018