Recent

Author Topic: Contest: fastest IsAnagram function  (Read 9481 times)

Bart

  • Hero Member
  • *****
  • Posts: 5467
    • Bart en Mariska's Webstek
Contest: fastest IsAnagram function
« on: October 23, 2024, 11:54:54 pm »
Hi,

I wrote a naïve function to check wether 2 strings are anagrams of eachother.
Of course this can be done faster.
Who will code the fastest one?

Prerequisites:
  • Strings are only allowed to contain lower ASCII >= #32
  • If this condition isn't met either the function returns False, or it raises an ERangeError, depending on the value of the ExceptionOnError parameter
  • Case is insensitive
  • Ignoring spaces will be optional (defaulting to True)
  • Solution must be cross-platorm
  • No assembly is allowed

The prototype of the function will be:
Code: Pascal  [Select][+][-]
  1. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;

My first attempt:
Code: Pascal  [Select][+][-]
  1. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  2. type
  3.   TFreq = array[#32..#127] of Integer;
  4. const
  5.   AllowedChars = [#32..#127];
  6. var
  7.   i,SpaceCnt: Integer;
  8.   F1, F2: TFreq;
  9.   Ch: Char;
  10. begin
  11.   Result := False;
  12.   F1 := Default(TFreq);
  13.   SpaceCnt := 0;
  14.   for i := 1 to Length(S1) do
  15.   begin
  16.     Ch := LowerCase(S1[i]);
  17.     if (Ch in AllowedChars) then
  18.     begin
  19.       if (Ch = #32) then
  20.         Inc(SpaceCnt)
  21.       else
  22.         Inc(F1[Ch]);
  23.     end
  24.     else
  25.     begin
  26.       if ExceptionOnError then
  27.         Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  28.       Exit;
  29.     end;
  30.   end;
  31.   F2 := Default(TFreq);
  32.   for i := 1 to Length(S2) do
  33.   begin
  34.     Ch := LowerCase(S2[i]);
  35.     if (Ch in AllowedChars) then
  36.     begin
  37.       if (Ch = #32) then
  38.         Dec(SpaceCnt)
  39.       else
  40.         Inc(F2[Ch]);
  41.     end
  42.     else
  43.     begin
  44.       if ExceptionOnError then
  45.         Raise ERangeError.CreateFmt('IsAnagram: illegal character in S2 at position %d',[i]);
  46.       Exit;
  47.     end;
  48.   end;
  49.   Result := IgnoreSpaces or (SpaceCnt = 0);
  50.   if Result then
  51.     Result := CompareMem(@F1, @F2, SizeOf(TFreq));
  52. end;

Who will win?
(Well, who will beat Martin? He won all my previous contest...)

Bart
« Last Edit: October 27, 2024, 12:24:48 am by Bart »

Fibonacci

  • Hero Member
  • *****
  • Posts: 604
  • Internal Error Hunter
Re: Contest: fastest IsAnagram function
« Reply #1 on: October 24, 2024, 01:11:49 am »
EDIT: 1 mln iterations wasnt enough; changed to 5 mln

Code: Pascal  [Select][+][-]
  1. *** ROUND 1 ***
  2. s1 = St a    te
  3. s2 = tas t e
  4.  
  5.              Bart: IgnoreSpaces + ExceptionOnError |  277 ms | result 5000000
  6.         Fibonacci: IgnoreSpaces + ExceptionOnError |  136 ms | result 5000000
  7.                                 Bart: IgnoreSpaces |  279 ms | result 5000000
  8.                            Fibonacci: IgnoreSpaces |  139 ms | result 5000000
  9.                                               Bart |  236 ms | result 0
  10.                                          Fibonacci |  127 ms | result 0
  11.  
  12. *** ROUND 2 ***
  13. s1 = night
  14. s2 = THING
  15.  
  16.              Bart: IgnoreSpaces + ExceptionOnError |  225 ms | result 5000000
  17.         Fibonacci: IgnoreSpaces + ExceptionOnError |  105 ms | result 5000000
  18.                                 Bart: IgnoreSpaces |  238 ms | result 5000000
  19.                            Fibonacci: IgnoreSpaces |  106 ms | result 5000000
  20.                                               Bart |  240 ms | result 5000000
  21.                                          Fibonacci |  105 ms | result 5000000
  22.  
  23. *** ROUND 3: Invalid chars ***
  24. s1 = Invalid
  25. s2 = Diff length
  26.  
  27.              Bart: IgnoreSpaces + ExceptionOnError |  289 ms | result 0
  28.         Fibonacci: IgnoreSpaces + ExceptionOnError |  157 ms | result 0
  29.                                 Bart: IgnoreSpaces |  285 ms | result 0
  30.                            Fibonacci: IgnoreSpaces |  176 ms | result 0
  31.                                               Bart |  256 ms | result 0
  32.                                          Fibonacci |  154 ms | result 0

Code: Pascal  [Select][+][-]
  1. program app;
  2.  
  3. // on WINDOWS use QueryPerformanceCounter
  4. {$ifdef WINDOWS}
  5. uses SysUtils, Windows;
  6.  
  7. function GetTickCount64: QWord;
  8. const
  9.   freq: int64 = 0;
  10. begin
  11.   if freq = 0 then QueryPerformanceFrequency(@freq);
  12.   QueryPerformanceCounter(@result);
  13.   result := trunc((result / freq) * 1000);
  14. end;
  15. {$else}
  16. uses SysUtils;
  17. {$endif}
  18.  
  19. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  20. type
  21.   TFreq = array[#32..#127] of Integer;
  22. const
  23.   AllowedChars = [#32..#127];
  24. var
  25.   i,SpaceCnt: Integer;
  26.   F1, F2: TFreq;
  27.   Ch: Char;
  28. begin
  29.   Result := False;
  30.   F1 := Default(TFreq);
  31.   SpaceCnt := 0;
  32.   for i := 1 to Length(S1) do
  33.   begin
  34.     Ch := LowerCase(S1[i]);
  35.     if (Ch in AllowedChars) then
  36.     begin
  37.       if (Ch = #32) then
  38.         Inc(SpaceCnt)
  39.       else
  40.         Inc(F1[Ch]);
  41.     end
  42.     else
  43.     begin
  44.       if ExceptionOnError then
  45.         Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  46.       Exit;
  47.     end;
  48.   end;
  49.   F2 := Default(TFreq);
  50.   for i := 1 to Length(S2) do
  51.   begin
  52.     Ch := LowerCase(S2[i]);
  53.     if (Ch in AllowedChars) then
  54.     begin
  55.       if (Ch = #32) then
  56.         Dec(SpaceCnt)
  57.       else
  58.         Inc(F2[Ch]);
  59.     end
  60.     else
  61.     begin
  62.       if ExceptionOnError then
  63.         Raise ERangeError.CreateFmt('IsAnagram: illegal character in S2 at position %d',[i]);
  64.       Exit;
  65.     end;
  66.   end;
  67.   Result := IgnoreSpaces or (SpaceCnt = 0);
  68.   if Result then
  69.     Result := CompareMem(@F1, @F2, SizeOf(TFreq));
  70. end;
  71.  
  72. function IsAnagram_fibo(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  73. var
  74.   F1, F2: array[97..122] of Byte;
  75.   i, SpaceCnt: Integer;
  76.   Ch1, Ch2: Byte;
  77. begin
  78.   FillChar(F1, SizeOf(F1), 0);
  79.   FillChar(F2, SizeOf(F2), 0);
  80.   SpaceCnt := 0;
  81.   result := True;
  82.   for i := 1 to Length(S1) do begin
  83.     Ch1 := Ord(S1[i]);
  84.     if (Ch1 < 32) or (Ch1 > 127) then if ExceptionOnError then
  85.       Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d', [i])
  86.     else exit;
  87.     if Ch1 = 32 then begin
  88.       Inc(SpaceCnt);
  89.       continue;
  90.     end;
  91.     if Ch1 >= 65 then if Ch1 <= 90 then Inc(F1[Ch1 or $20]) else if Ch1 >= 97 then if Ch1 <= 122 then Inc(F1[Ch1]);
  92.   end;
  93.   for i := 1 to Length(S2) do begin
  94.     Ch2 := Ord(S2[i]);
  95.     if (Ch2 < 32) or (Ch2 > 127) then if ExceptionOnError then
  96.       Raise ERangeError.CreateFmt('IsAnagram: illegal character in S2 at position %d', [i])
  97.     else exit;
  98.     if Ch2 = 32 then begin
  99.       Dec(SpaceCnt);
  100.       continue;
  101.     end;
  102.     if Ch2 >= 65 then if Ch2 <= 90 then Inc(F2[Ch2 or $20]) else if Ch2 >= 97 then if Ch2 <= 122 then Inc(F2[Ch2]);
  103.   end;
  104.   result := (IgnoreSpaces or (SpaceCnt = 0)) and CompareMem(@F1, @F2, SizeOf(F1));
  105. end;
  106.  
  107. procedure main;
  108. const
  109.   ITERATIONS = 1000*1000*5;
  110. var
  111.   i, c: integer;
  112.   s, d: string;
  113.   u: ptruint;
  114. begin
  115.   s := 'St a    te';
  116.   d := 'tas t e';
  117.   writeln('*** ROUND 1 ***');
  118.   writeln('s1 = ', s);
  119.   writeln('s2 = ', d);
  120.   writeln;
  121.  
  122.   // test with invalid characters
  123.   //s += #1; d += #2;
  124.  
  125.   // -------------------------------------
  126.   write('Bart: IgnoreSpaces + ExceptionOnError':50);
  127.   c := 0;
  128.   u := GetTickCount64;
  129.   for i := 1 to ITERATIONS do if IsAnagram(s, d, true, true) then c += 1;
  130.   write(' | ', (GetTickCount64-u):4, ' ms');
  131.   write(' | result ', c);
  132.   writeln;
  133.   // -------------------------------------
  134.   write('Fibonacci: IgnoreSpaces + ExceptionOnError':50);
  135.   c := 0;
  136.   u := GetTickCount64;
  137.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, true, true) then c += 1;
  138.   write(' | ', (GetTickCount64-u):4, ' ms');
  139.   write(' | result ', c);
  140.   writeln;
  141.   // -------------------------------------
  142.   write('Bart: IgnoreSpaces':50);
  143.   c := 0;
  144.   u := GetTickCount64;
  145.   for i := 1 to ITERATIONS do if IsAnagram(s, d, true, false) then c += 1;
  146.   write(' | ', (GetTickCount64-u):4, ' ms');
  147.   write(' | result ', c);
  148.   writeln;  
  149.   // -------------------------------------  
  150.   write('Fibonacci: IgnoreSpaces':50);
  151.   c := 0;
  152.   u := GetTickCount64;
  153.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, true, false) then c += 1;
  154.   write(' | ', (GetTickCount64-u):4, ' ms');
  155.   write(' | result ', c);
  156.   writeln;
  157.   // -------------------------------------
  158.   write('Bart':50);
  159.   c := 0;
  160.   u := GetTickCount64;
  161.   for i := 1 to ITERATIONS do if IsAnagram(s, d, false, false) then c += 1;
  162.   write(' | ', (GetTickCount64-u):4, ' ms');
  163.   write(' | result ', c);
  164.   writeln;
  165.   // -------------------------------------
  166.   write('Fibonacci':50);
  167.   c := 0;
  168.   u := GetTickCount64;
  169.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, false, false) then c += 1;
  170.   write(' | ', (GetTickCount64-u):4, ' ms');
  171.   write(' | result ', c);
  172.   writeln;
  173.   // -------------------------------------
  174.  
  175.   writeln;
  176.  
  177.   s := 'night';
  178.   d := 'THING';
  179.   writeln('*** ROUND 2 ***');
  180.   writeln('s1 = ', s);
  181.   writeln('s2 = ', d);
  182.   writeln;
  183.  
  184.   // -------------------------------------
  185.   write('Bart: IgnoreSpaces + ExceptionOnError':50);
  186.   c := 0;
  187.   u := GetTickCount64;
  188.   for i := 1 to ITERATIONS do if IsAnagram(s, d, true, true) then c += 1;
  189.   write(' | ', (GetTickCount64-u):4, ' ms');
  190.   write(' | result ', c);
  191.   writeln;
  192.   // -------------------------------------
  193.   write('Fibonacci: IgnoreSpaces + ExceptionOnError':50);
  194.   c := 0;
  195.   u := GetTickCount64;
  196.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, true, true) then c += 1;
  197.   write(' | ', (GetTickCount64-u):4, ' ms');
  198.   write(' | result ', c);
  199.   writeln;
  200.   // -------------------------------------
  201.   write('Bart: IgnoreSpaces':50);
  202.   c := 0;
  203.   u := GetTickCount64;
  204.   for i := 1 to ITERATIONS do if IsAnagram(s, d, true, false) then c += 1;
  205.   write(' | ', (GetTickCount64-u):4, ' ms');
  206.   write(' | result ', c);
  207.   writeln;
  208.   // -------------------------------------
  209.   write('Fibonacci: IgnoreSpaces':50);
  210.   c := 0;
  211.   u := GetTickCount64;
  212.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, true, false) then c += 1;
  213.   write(' | ', (GetTickCount64-u):4, ' ms');
  214.   write(' | result ', c);
  215.   writeln;
  216.   // -------------------------------------
  217.   write('Bart':50);
  218.   c := 0;
  219.   u := GetTickCount64;
  220.   for i := 1 to ITERATIONS do if IsAnagram(s, d, false, false) then c += 1;
  221.   write(' | ', (GetTickCount64-u):4, ' ms');
  222.   write(' | result ', c);
  223.   writeln;
  224.   // -------------------------------------
  225.   write('Fibonacci':50);
  226.   c := 0;
  227.   u := GetTickCount64;
  228.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, false, false) then c += 1;
  229.   write(' | ', (GetTickCount64-u):4, ' ms');
  230.   write(' | result ', c);
  231.   writeln;
  232.   // -------------------------------------
  233.  
  234.   writeln;
  235.  
  236.   s := 'Invalid';
  237.   d := 'Diff length';
  238.   writeln('*** ROUND 3: Invalid chars ***');
  239.   writeln('s1 = ', s);
  240.   writeln('s2 = ', d);
  241.   writeln;
  242.  
  243.   // -------------------------------------
  244.   write('Bart: IgnoreSpaces + ExceptionOnError':50);
  245.   c := 0;
  246.   u := GetTickCount64;
  247.   for i := 1 to ITERATIONS do if IsAnagram(s, d, true, true) then c += 1;
  248.   write(' | ', (GetTickCount64-u):4, ' ms');
  249.   write(' | result ', c);
  250.   writeln;
  251.   // -------------------------------------
  252.   write('Fibonacci: IgnoreSpaces + ExceptionOnError':50);
  253.   c := 0;
  254.   u := GetTickCount64;
  255.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, true, true) then c += 1;
  256.   write(' | ', (GetTickCount64-u):4, ' ms');
  257.   write(' | result ', c);
  258.   writeln;
  259.   // -------------------------------------
  260.   write('Bart: IgnoreSpaces':50);
  261.   c := 0;
  262.   u := GetTickCount64;
  263.   for i := 1 to ITERATIONS do if IsAnagram(s, d, true, false) then c += 1;
  264.   write(' | ', (GetTickCount64-u):4, ' ms');
  265.   write(' | result ', c);
  266.   writeln;
  267.   // -------------------------------------
  268.   write('Fibonacci: IgnoreSpaces':50);
  269.   c := 0;
  270.   u := GetTickCount64;
  271.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, true, false) then c += 1;
  272.   write(' | ', (GetTickCount64-u):4, ' ms');
  273.   write(' | result ', c);
  274.   writeln;
  275.   // -------------------------------------
  276.   write('Bart':50);
  277.   c := 0;
  278.   u := GetTickCount64;
  279.   for i := 1 to ITERATIONS do if IsAnagram(s, d, false, false) then c += 1;
  280.   write(' | ', (GetTickCount64-u):4, ' ms');
  281.   write(' | result ', c);
  282.   writeln;
  283.   // -------------------------------------
  284.   write('Fibonacci':50);
  285.   c := 0;
  286.   u := GetTickCount64;
  287.   for i := 1 to ITERATIONS do if IsAnagram_fibo(s, d, false, false) then c += 1;
  288.   write(' | ', (GetTickCount64-u):4, ' ms');
  289.   write(' | result ', c);
  290.   writeln;
  291.   // -------------------------------------
  292.  
  293.   readln;
  294. end;
  295.  
  296. begin
  297.   main;
  298. end.
« Last Edit: October 24, 2024, 03:05:50 am by Fibonacci »

ASerge

  • Hero Member
  • *****
  • Posts: 2337
Re: Contest: fastest IsAnagram function
« Reply #2 on: October 24, 2024, 02:34:59 am »
An obvious mistake in IsAnagram_fibo:
Code: Pascal  [Select][+][-]
  1. var
  2.   F1, F2: array[97..122] of Byte; // Start from 97
  3. ...
  4.   if Ch1 = 32 then begin
  5.     if not IgnoreSpaces then Inc(F1[Ch1]); // 32 < 97. Out of index!
  6.  

Fibonacci

  • Hero Member
  • *****
  • Posts: 604
  • Internal Error Hunter
Re: Contest: fastest IsAnagram function
« Reply #3 on: October 24, 2024, 02:42:59 am »
Thanks, fixed ;D

Also updated the code to use the more accurate QueryPerformanceCounter on Windows.
« Last Edit: October 24, 2024, 02:52:35 am by Fibonacci »

ASerge

  • Hero Member
  • *****
  • Posts: 2337
Re: Contest: fastest IsAnagram function
« Reply #4 on: October 24, 2024, 03:43:04 am »
Thanks, fixed ;D
Your code still does not fulfill the condition: "Strings are only allowed to contain lower ASCII >= #32" for chars in range #33..'A'.

My version:
Code: Pascal  [Select][+][-]
  1. function IsAnagramASerge(const S1, S2: string; IgnoreSpaces: Boolean = True;
  2.   ExceptionOnError: Boolean = False): Boolean;
  3.  
  4.   procedure Error(const Where: string; AtPos: SizeInt); noreturn;
  5.   begin
  6.     raise ERangeError.CreateFmt(
  7.       'IsAnagram: illegal character in %s at position %d', [Where, AtPos]);
  8.   end;
  9.  
  10. type
  11.   TFreq = array[33..127] of Integer;
  12.  
  13.   function FillOk(const S: string; out Data: TFreq; out SpaceCnt, ErrPos: Integer): Boolean;
  14.   var
  15.     i: Integer;
  16.     B: Byte;
  17.   begin
  18.     FillChar(Data, SizeOf(Data), 0);
  19.     SpaceCnt := 0;
  20.     for i := 1 to Length(S) do
  21.     begin
  22.       B := Ord(S[i]);
  23.       case B of
  24.         32: Inc(SpaceCnt);
  25.         33..Pred(Ord('A')): Inc(Data[B]);
  26.         Ord('A')..Ord('Z'): Inc(Data[B or $20]);
  27.         Ord(Succ('Z'))..127: Inc(Data[B]);
  28.       else
  29.         ErrPos := i;
  30.         Exit(False);
  31.       end;
  32.     end;
  33.     ErrPos := 0;
  34.     Result := True;
  35.   end;
  36.  
  37. var
  38.   F1, F2: TFreq;
  39.   SpaceCnt1, SpaceCnt2, ErrPos: Integer;
  40. begin
  41.   if not FillOk(S1, F1, SpaceCnt1, ErrPos) then
  42.     if ExceptionOnError then
  43.       Error('S1', ErrPos)
  44.     else
  45.       Exit(False);
  46.   if not FillOk(S2, F2, SpaceCnt2, ErrPos) then
  47.     if ExceptionOnError then
  48.       Error('S2', ErrPos)
  49.     else
  50.       Exit(False);
  51.   Result := IgnoreSpaces or (SpaceCnt1 = SpaceCnt2);
  52.   if Result then
  53.     Result := CompareMem(@F1, @F2, SizeOf(TFreq));
  54. end;

My CPU is not that powerful, so the result is ten times fewer iterations:
Code: Text  [Select][+][-]
  1. *** ROUND 1 ***
  2. s1 = St a    te
  3. s2 = tas t e
  4.  
  5.              Bart: IgnoreSpaces + ExceptionOnError |  284 ms | result 500000
  6.            ASerge: IgnoreSpaces + ExceptionOnError |  189 ms | result 500000
  7.                                 Bart: IgnoreSpaces |  270 ms | result 500000
  8.                               ASerge: IgnoreSpaces |  186 ms | result 500000
  9.                                               Bart |   76 ms | result 0
  10.                                             ASerge |   48 ms | result 0
  11.  
  12. *** ROUND 2 ***
  13. s1 = night
  14. s2 = THING
  15.  
  16.              Bart: IgnoreSpaces + ExceptionOnError |  256 ms | result 500000
  17.            ASerge: IgnoreSpaces + ExceptionOnError |  181 ms | result 500000
  18.                                 Bart: IgnoreSpaces |  257 ms | result 500000
  19.                               ASerge: IgnoreSpaces |  179 ms | result 500000
  20.                                               Bart |  262 ms | result 500000
  21.                                             ASerge |  179 ms | result 500000
  22. *** ROUND 3: Invalid chars ***
  23. s1 = Invalid
  24. s2 = Diff length
  25.  
  26.              Bart: IgnoreSpaces + ExceptionOnError |  223 ms | result 0
  27.            ASerge: IgnoreSpaces + ExceptionOnError |  152 ms | result 0
  28.                                 Bart: IgnoreSpaces |  228 ms | result 0
  29.                               ASerge: IgnoreSpaces |  148 ms | result 0
  30.                                               Bart |   93 ms | result 0
  31.                                             ASerge |   56 ms | result 0
  32.  

Fibonacci

  • Hero Member
  • *****
  • Posts: 604
  • Internal Error Hunter
Re: Contest: fastest IsAnagram function
« Reply #5 on: October 24, 2024, 04:36:16 am »
Last try. If its still incorrect I think Ill need to see examples of valid and invalid strings.

Code: Pascal  [Select][+][-]
  1. function IsAnagram_fibo(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  2. var
  3.   F1, F2: array[33..122] of Byte;
  4.   i, SpaceCnt: Integer;
  5.   Ch1, Ch2: Byte;
  6. begin
  7.   FillChar(F1, SizeOf(F1), 0);
  8.   FillChar(F2, SizeOf(F2), 0);
  9.   SpaceCnt := 0;
  10.   result := true;
  11.  
  12.   for i := 1 to Length(S1) do begin
  13.     Ch1 := Ord(S1[i]);
  14.     if (Ch1 < 33) or (Ch1 > 122) then if (Ch1 <> 32) and IgnoreSpaces then if ExceptionOnError then
  15.       Raise ERangeError.CreateFmt('IsAnagram: illegal character in S1 at position %d', [i])
  16.     else exit;
  17.     if Ch1 = 32 then Inc(SpaceCnt)
  18.     else if Ch1 >= 33 then if Ch1 <= 122 then Inc(F1[Ch1 or $20]) else Inc(F1[Ch1]);
  19.   end;
  20.  
  21.   for i := 1 to Length(S2) do begin
  22.     Ch2 := Ord(S2[i]);
  23.     if (Ch2 < 33) or (Ch2 > 122) then if (Ch2 <> 32) and IgnoreSpaces then if ExceptionOnError then
  24.       Raise ERangeError.CreateFmt('IsAnagram: illegal character in S2 at position %d', [i])
  25.     else exit;
  26.     if Ch2 = 32 then Dec(SpaceCnt)
  27.     else if Ch2 >= 33 then if Ch2 <= 122 then Inc(F2[Ch2 or $20]) else Inc(F2[Ch2]);
  28.   end;
  29.  
  30.   result := (IgnoreSpaces or (SpaceCnt = 0)) and CompareMem(@F1, @F2, SizeOf(F1));
  31. end;

Code: Text  [Select][+][-]
  1. *** ROUND 1 ***
  2. s1 = St a    te
  3. s2 = tas t e
  4.  
  5.              Bart: IgnoreSpaces + ExceptionOnError |  274 ms | result 5000000
  6.         Fibonacci: IgnoreSpaces + ExceptionOnError |  152 ms | result 5000000
  7.            ASerge: IgnoreSpaces + ExceptionOnError |  185 ms | result 5000000
  8.                                 Bart: IgnoreSpaces |  279 ms | result 5000000
  9.                            Fibonacci: IgnoreSpaces |  152 ms | result 5000000
  10.                               ASerge: IgnoreSpaces |  187 ms | result 5000000
  11.                                               Bart |  239 ms | result 0
  12.                                          Fibonacci |  136 ms | result 0
  13.                                             ASerge |  140 ms | result 0

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #6 on: October 24, 2024, 06:19:13 am »
Code: Pascal  [Select][+][-]
  1.   if freq = 0 then QueryPerformanceFrequency(@freq);
  2.   QueryPerformanceCounter(@result);
and
Quote
Solution must be cross platform
Someone needs to be disqualified.
If I smell bad code it usually is bad code and that includes my own code.

Khrys

  • Full Member
  • ***
  • Posts: 105
Re: Contest: fastest IsAnagram function
« Reply #7 on: October 24, 2024, 06:51:55 am »
Code: Pascal  [Select][+][-]
  1.   if freq = 0 then QueryPerformanceFrequency(@freq);
  2.   QueryPerformanceCounter(@result);
and
Quote
Solution must be cross platform
Someone needs to be disqualified.

The profiling mechanism isn't part of the solution

BrunoK

  • Hero Member
  • *****
  • Posts: 623
  • Retired programmer
Re: Contest: fastest IsAnagram function
« Reply #8 on: October 24, 2024, 07:25:10 am »
Suggest to clarify :
  • Strings are only allowed to contain lower ASCII >= #32 and <=127
  • Alphabetic characters are case insensitive
See lines 16 and 34.

Ah,yes, who will beat Martin ?

Roland57

  • Sr. Member
  • ****
  • Posts: 475
    • msegui.net
Re: Contest: fastest IsAnagram function
« Reply #9 on: October 24, 2024, 07:41:40 am »
Nice and interesting code.

My contribution for a more compact test program.

Code: Pascal  [Select][+][-]
  1. uses
  2.   SysUtils;
  3.  
  4. type
  5.   IsAnagramFunc = function (const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  6.  
  7. // ...
  8.  
  9. procedure main;
  10. var
  11.   s, d: string;
  12.  
  13.   procedure Test(const AFunctionName: string; const AFunc: IsAnagramFunc; const IgnoreSpaces, ExceptionOnError: Boolean);
  14.   const
  15.     ITERATIONS = 1000*1000*5;
  16.     IgnoreSpacesStr: array[boolean] of string = ('', ' IgnoreSpaces');
  17.     ExceptionOnErrorStr: array[boolean] of string = ('', ' ExceptionOnError');
  18.   var
  19.     i, c: integer;
  20.     u: ptruint;
  21.   begin
  22.     write(Concat(AFunctionName, IgnoreSpacesStr[IgnoreSpaces], ExceptionOnErrorStr[ExceptionOnError]):50);
  23.     c := 0;
  24.     u := GetTickCount64;
  25.     for i := 1 to ITERATIONS do if AFunc(s, d, IgnoreSpaces, ExceptionOnError) then c += 1;
  26.     write(' | ', (GetTickCount64-u):4, ' ms');
  27.     write(' | result ', c);
  28.     writeln;
  29.   end;
  30.  
  31.   procedure TestAll;
  32.   begin
  33.     Test('Bart', @IsAnagram, true, true);
  34.     Test('Fibonacci', @IsAnagram_fibo, true, true);
  35.     Test('ASerge', @IsAnagramASerge, true, true);
  36.    
  37.     Test('Bart', @IsAnagram, true, false);
  38.     Test('Fibonacci', @IsAnagram_fibo, true, false);
  39.     Test('ASerge', @IsAnagramASerge, true, false);
  40.    
  41.     Test('Bart', @IsAnagram, false, false);
  42.     Test('Fibonacci', @IsAnagram_fibo, false, false);
  43.     Test('ASerge', @IsAnagramASerge, false, false);
  44.   end;
  45.  
  46. begin
  47.   s := 'St a    te';
  48.   d := 'tas t e';
  49.   writeln('*** ROUND 1 ***');
  50.   writeln('s1 = ', s);
  51.   writeln('s2 = ', d);
  52.   writeln;
  53.  
  54.   TestAll;
  55.   writeln;
  56.  
  57.   s := 'night';
  58.   d := 'THING';
  59.   writeln('*** ROUND 2 ***');
  60.   writeln('s1 = ', s);
  61.   writeln('s2 = ', d);
  62.   writeln;
  63.  
  64.   TestAll;
  65.   writeln;
  66.  
  67.   s := 'Invalid';
  68.   d := 'Diff length';
  69.   writeln('*** ROUND 3: Invalid chars ***');
  70.   writeln('s1 = ', s);
  71.   writeln('s2 = ', d);
  72.   writeln;
  73.  
  74.   TestAll;
  75. end;
  76.  
  77. begin
  78.   main;
  79. end.
  80.  
« Last Edit: October 24, 2024, 09:03:39 am by Roland57 »
My projects are on Gitlab and on Codeberg.

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #10 on: October 24, 2024, 08:51:48 am »
Given the range, or $20 might speed things up, since it limits the strings to lowercase and orring is fast.
(only valid for the specified range!)
Also test length first and use shortstring. (s[0] comparison)
« Last Edit: October 24, 2024, 08:58:47 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #11 on: October 24, 2024, 08:51:53 am »
I'm missing a "CaseInsensitive True/False"-Option....
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #12 on: October 24, 2024, 09:11:07 am »
Does it pass this test?
https://www.anagrammy.com/literary/rb/poems-rb16.html
Or this one: Todd Hayden King, which is my favorite... ;D ;D
« Last Edit: October 24, 2024, 09:15:08 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

Zvoni

  • Hero Member
  • *****
  • Posts: 2741
Re: Contest: fastest IsAnagram function
« Reply #13 on: October 24, 2024, 09:44:54 am »
Fooling around.... Just PoC of my Algorithm

And no Exception-Raising-stuff
Code: Pascal  [Select][+][-]
  1. program Project1;
  2. {$mode objfpc}{$H+}
  3. Uses Sysutils;
  4.  
  5. Var
  6.   os1,os2:String;
  7.   b:Boolean;
  8.  
  9.  
  10. procedure QuickSort(var AI: array of Char; ALo, AHi: Integer);
  11. var
  12.   Pivot,T: Char;
  13.   Lo, Hi:Integer;
  14. begin
  15.   Lo := ALo;
  16.   Hi := AHi;
  17.   Pivot := AI[(Lo + Hi) div 2];
  18.   repeat
  19.     while AI[Lo] < Pivot do
  20.       Inc(Lo) ;
  21.     while AI[Hi] > Pivot do
  22.       Dec(Hi) ;
  23.     if Lo <= Hi then
  24.     begin
  25.       T := AI[Lo];
  26.       AI[Lo] := AI[Hi];
  27.       AI[Hi] := T;
  28.       Inc(Lo) ;
  29.       Dec(Hi) ;
  30.     end;
  31.   until Lo > Hi;
  32.   if Hi > ALo then
  33.     QuickSort(AI, ALo, Hi) ;
  34.   if Lo < AHi then
  35.     QuickSort(AI, Lo, AHi) ;
  36. end;
  37. Function StrSpn(Const str:PChar;Const Accept:PChar):Integer;
  38. Var
  39.   a:PChar;
  40.   table:Array[0..255] Of Byte;
  41.   p:PByte;
  42.   c0,c1,c2,c3:ByteBool;
  43.   s:PChar;
  44.   Count:Integer;
  45. Begin
  46.   If Accept[0]=#0 Then Exit(0);
  47.   If Accept[1]=#0 Then
  48.     Begin
  49.       a:=str;
  50.       While a^=accept^ Do Inc(a);
  51.       Exit(a-str);
  52.     end;
  53.   FillChar(table,64,0);
  54.   p:=@table[0];
  55.   FillChar(table[64],64,0);
  56.   FillChar(table[128],64,0);
  57.   FillChar(table[192],64,0);
  58.   s:=accept;
  59.   While s^<>#0 Do
  60.     Begin
  61.       p[Byte(s^)]:=1;
  62.       Inc(s);
  63.     end;
  64.   s:=str;
  65.   If Not ByteBool(p[Byte(s[0])]) Then Exit(0);
  66.   If Not ByteBool(p[Byte(s[1])]) Then Exit(1);
  67.   If Not ByteBool(p[Byte(s[2])]) Then Exit(2);
  68.   If Not ByteBool(p[Byte(s[3])]) Then Exit(3);
  69.   Repeat
  70.     Inc(s,4);
  71.     c0:=ByteBool(p[Byte(s[0])]);
  72.     c1:=ByteBool(p[Byte(s[1])]);
  73.     c2:=byteBool(p[Byte(s[2])]);
  74.     c3:=ByteBool(p[Byte(s[3])]);
  75.   until Not (c0 And C1 And C2 And C3);
  76.   Count:=s-str;
  77.   If Not (c0 And c1) Then
  78.     Result:=count+Byte(c0)
  79.   Else
  80.     Result:=Count+Byte(c2)+2;
  81. End;
  82.  
  83. function IsAnagram(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  84. Var
  85.   ls1,ls2:String;
  86.   ps1,ps2:Array Of AnsiChar;
  87.   i:Integer;
  88.   p1,p2:PChar;
  89.   by:Byte;
  90.   c:Array[32..127] Of Char;    //Legal Characters
  91. Begin
  92.   Result:=False;
  93.   For by:=32 To 127 Do c[by]:=Char(by);
  94.   If IgnoreSpaces Then
  95.     Begin
  96.       ls1:=StringReplace(S1,' ','',[rfReplaceAll]);
  97.       ls2:=StringReplace(S2,' ','',[rfReplaceAll]);
  98.     end
  99.   Else
  100.     Begin
  101.       ls1:=S1;
  102.       ls2:=S2;
  103.     End;
  104.   If Length(ls1)<>Length(ls2) Then Exit;  //unequal Length.
  105.   p1:=PChar(ls1);
  106.   p2:=PChar(ls2);
  107.   //Writeln(Length(ls1));
  108.   i:=StrSpn(p1,PChar(@c[32]));
  109.   If i<>Length(ls1) Then Exit; //Illegal char
  110.   i:=StrSpn(p2,PChar(@c[32]));
  111.   If i<>Length(ls2) Then Exit; //Illegal char
  112.   SetLength(ps1,Length(ls1));
  113.   SetLength(ps2,Length(ls2));
  114.   Move(p1,ps1,Length(ps1));
  115.   Move(p2,ps2,Length(ps2));
  116.   QuickSort(ps1,Low(ps1),High(ps1));
  117.   QuickSort(ps2,Low(ps2),High(ps2));
  118.   Result:=CompareMem(@ps1[0],@ps2[0],Length(ps1));
  119. End;
  120.  
  121. begin
  122.   os1:='TOM MARVOLO RIDDLE';
  123.   os2:='I AM LORD VOLDEMORT';
  124.   b:=IsAnagram(os1,os2);
  125.   Writeln(b);
  126. end.
« Last Edit: October 24, 2024, 10:12:15 am by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Thaddy

  • Hero Member
  • *****
  • Posts: 16177
  • Censorship about opinions does not belong here.
Re: Contest: fastest IsAnagram function
« Reply #14 on: October 24, 2024, 10:16:01 am »
Using C'sms must be slower, I will have a go.. The sort is good, but even a bubble sort would beat it on short strings...
Aarg, first more coffee.(why do you need the moves?)
« Last Edit: October 24, 2024, 10:20:32 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

 

TinyPortal © 2005-2018