Recent

Author Topic: Contest: fastest IsAnagram function  (Read 18302 times)

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Contest: fastest IsAnagram function
« Reply #75 on: October 25, 2024, 06:57:53 am »
The last part can be simply:
Code: Pascal  [Select][+][-]
  1. function ToLowerCase(const c: Char): Char;inline;
  2. begin
  3.   Result := Chr(Ord(c) or $20)
  4. end;
That is, given the range. Resolves to 2 instructions.( I had hoped for one)
My take, not yet optimized:
Code: Pascal  [Select][+][-]
  1. program ana;
  2. {$ifdef fpc}{$mode objfpc}{$endif}{$H-}
  3.  
  4. procedure InsertionSort(var s: shortstring);inline;
  5. var
  6.   i, j: Integer;
  7.   key: Char;
  8. begin
  9.   for i := 2 to ord(s[0]) do
  10.   begin
  11.     key := s[i];
  12.     j := i - 1;
  13.     while (j > 0) and (s[j] > key) do
  14.     begin
  15.       s[j + 1] := s[j];
  16.       j := j - 1;
  17.     end;
  18.     s[j + 1] := key;
  19.   end;
  20. end;
  21.  
  22.  
  23. function LowerCase(const s:shortstring ): shortstring;inline;
  24. var
  25.   i:integer;
  26. begin
  27.   result[0]:=s[0];
  28.   for i := 1 to ord(s[0]) do
  29.   Result[i] := Chr(Ord(s[i]) or $20)
  30. end;
  31.  
  32. function IsAnagram(const a,b:shortstring):boolean;inline;
  33. var
  34.   c,d:shortstring;
  35. begin
  36.   result := false;
  37.   if a[0] = b[0] then
  38.   begin
  39.     c:=lowercase(a);
  40.     d:=lowercase(b);
  41.     InsertionSort(c);
  42.     InsertionSort(d);
  43.     if c = d then result := true;
  44.   end;  
  45. end;
  46.  
  47. begin
  48.   writeln(IsAnagram('Thaddy de Koning','Todd Hayden King'));
  49. end.
« Last Edit: October 25, 2024, 07:58:16 am by Thaddy »
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

Fibonacci

  • Hero Member
  • *****
  • Posts: 788
  • Internal Error Hunter
Re: Contest: fastest IsAnagram function
« Reply #76 on: October 25, 2024, 07:06:13 am »
The last part can be simply:
Code: Pascal  [Select][+][-]
  1. function ToLowerCase(const c: Char): Char;inline;
  2. begin
  3.   Result := Chr(Ord(c) or $20)
  4. end;
That is, given the range. Resolves to 2 instructions.

Code: Pascal  [Select][+][-]
  1. function ToLowerCase(const c: Char): Char;inline;
  2. begin
  3.   Result := Chr(Ord(c) or $20)
  4. end;
  5.  
  6. procedure main;
  7. var
  8.   s: string;
  9.   i: integer;
  10. begin
  11.   s := 'Not_that_FAST';
  12.   for i := 1 to high(s) do s[i] := ToLowerCase(s[i]);
  13.   writeln(s);
  14.   readln;
  15. end;
  16.  
  17. begin
  18.   main;
  19. end.

Is "_" in the allowed chars? I dont know anymore. My sleep time coming ;)
« Last Edit: October 25, 2024, 07:08:24 am by Fibonacci »

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Contest: fastest IsAnagram function
« Reply #77 on: October 25, 2024, 07:41:05 am »
I added my take, posts crossed. I used a simple insertion sort because it has good performance on shorter strings.
My solution should outperform most others and is short. (but I am too lazy to time it)
One optimization is to do the casing inside the sort.
« Last Edit: October 25, 2024, 07:53:52 am by Thaddy »
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

BrunoK

  • Hero Member
  • *****
  • Posts: 766
  • Retired programmer
Re: Contest: fastest IsAnagram function
« Reply #78 on: October 25, 2024, 07:54:35 am »
@bart

Nos does my entry check with pour tests ?

Bart

  • Hero Member
  • *****
  • Posts: 5691
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #79 on: October 25, 2024, 08:30:29 am »
I added my take, posts crossed.

As I said (and proofed) before, your ToLowerCase function is NOT valid over the entire allowed range of characters.
It will therefor fail the validity test.

It may be fast, but it is wrong.

Bart

Bart

  • Hero Member
  • *****
  • Posts: 5691
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #80 on: October 25, 2024, 08:32:02 am »
@bart
Nos does my entry check with pour tests ?

It's in post nr ? (at page ?)

I'll check later, am @work now, no fpc available here...

Bart

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Contest: fastest IsAnagram function
« Reply #81 on: October 25, 2024, 08:37:55 am »
I added my take, posts crossed.

As I said (and proofed) before, your ToLowerCase function is NOT valid over the entire allowed range of characters.
It will therefor fail the validity test.

It may be fast, but it is wrong.

Bart
You specified Strings are only allowed to contain lower ASCII >= #32 and ASCII ends at 127. which always resolves valid with or $20.
I know my solution fails, because I did not add the option to ignore spaces, but it is the fastest yet, even in its current state,
Do not change the rules during the game.
And I should get bonus points for not relying on anything else than system. :-* :-X
« Last Edit: October 25, 2024, 08:47:20 am by Thaddy »
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

BrunoK

  • Hero Member
  • *****
  • Posts: 766
  • Retired programmer
Re: Contest: fastest IsAnagram function
« Reply #82 on: October 25, 2024, 09:13:41 am »
@bart
Nos does my entry check with pour tests ?

It's in post nr ? (at page ?)

I'll check later, am @work now, no fpc available here...

Bart
Complicated but fairly competitive if compile -O2 or above.
Code: Pascal  [Select][+][-]
  1. { Pretty fast  when compiled with > O2 compiler switch }
  2. function IsAnagram_BrunoK(const S1, S2: String; IgnoreSpaces: Boolean = True;
  3.   ExceptionOnError: Boolean = False): Boolean;
  4. const
  5.   cCold: boolean = True;          // WarmUp done
  6.   cArIndexStr: shortstring = '';  // Holder for indexes by character
  7.  
  8.   cFlagArraySize = ((2 + 128 - 32 - (Ord('z') - Ord('a') + 1)) *
  9.     SizeOf(SmallInt) + (SizeOf(SizeInt) - 1) div
  10.     (SizeOf(SizeInt) div 2));
  11.   cFlagArraySizeInt = (cFlagArraySize * 2) div SizeOf(SizeInt);
  12. var
  13.   vcArIndex: array[0..High(Byte)] of byte absolute cArIndexStr;
  14.  
  15.   procedure WarmUp;
  16.   var
  17.     i, j: integer;
  18.   begin
  19.     FillDWord(vcArIndex, SizeOf(vcArIndex) div SizeOf(DWord), 0); // Init array
  20.     vcArIndex[0] := 0;   // Invalid character counter
  21.     vcArIndex[1] := 1;   // Dead characters
  22.     j := 2;
  23.     for i := Ord(' ') to 128 - 1 do
  24.       if (i < Ord('a')) or (i > Ord('z')) then begin
  25.         vcArIndex[i] := j;
  26.         Inc(j);
  27.       end
  28.       else
  29.         vcArIndex[i] := vcArIndex[i - (Ord('a') - Ord('A'))];
  30.     cCold := False;
  31.   end;
  32.  
  33.   procedure Error(const Where: string; AtPos: SizeInt); noreturn;
  34.   begin
  35.     raise ERangeError.CreateFmt('IsAnagram: illegal character in %s at position %d',
  36.       [Where, AtPos]);
  37.   end;
  38.  
  39. var
  40.   F1: array[0..cFlagArraySizeInt - 1] of SizeInt;
  41.   F1SI: array[0..cFlagArraySize - 1] of SmallInt absolute F1;
  42.   i: integer;
  43.   j: integer;
  44.   vpIndex: PSmallInt;
  45. label
  46.   NextInS1;
  47. begin
  48.   if cCold then
  49.     WarmUp;
  50.   for i := 0 to cFlagArraySizeInt - 1 do   // Init array
  51.     F1[i] := 0;
  52.  
  53.   vcArIndex[Ord(' ')] := 2;
  54.   if IgnoreSpaces then
  55.     vcArIndex[Ord(' ')] := 1; // Send them to dead char
  56.  
  57.   { Increase counts for S1 }
  58.   for i := 1 to Length(S1) do begin
  59.     j := Ord(S1[i]);
  60.     vpIndex := @F1SI[vcArIndex[j]];
  61.     Inc(vpIndex^);
  62.     if j >= 1 then
  63.       Continue;
  64.     { Invalid charater }
  65.     Result := False;
  66.     if ExceptionOnError then
  67.       Error('S1', i);
  68.     Exit;
  69.   end;
  70.   F1SI[1] := High(SmallInt); // Do not fail due to deadchar's
  71.   { Decrease counts for S2 }
  72.   for i := 1 to Length(S2) do begin
  73.     j := byte(S2[i]);
  74.     vpIndex := @F1SI[vcArIndex[j]];
  75.     Dec(vpIndex^);
  76.     if vpIndex^ >= 0 then // All is well, processs next character
  77.       Continue;
  78.     { Invalid character or more counter become negative }
  79.     Result := False;
  80.     if ExceptionOnError and (j = 0) then
  81.       Error('S2', i);
  82.     Exit;
  83.   end;
  84.   F1SI[1] := 0;   // Ignore dead characters
  85.   for i := 0 to cFlagArraySizeInt - 1 do
  86.     if F1[i] <> 0 then
  87.       Exit(False);
  88.   Result := True;
  89. end;
IsAnagram_fibo passes
Quote
s1 = St a    te @
s2 = tas t e `
where ohters correctly reject it.

Bart

  • Hero Member
  • *****
  • Posts: 5691
    • Bart en Mariska's Webstek
Re: Contest: fastest IsAnagram function
« Reply #83 on: October 25, 2024, 10:06:11 am »
You specified Strings are only allowed to contain lower ASCII >= #32 and ASCII ends at 127. which always resolves valid with or $20.
I know my solution fails, because I did not add the option to ignore spaces, but it is the fastest yet, even in its current state,

If you use your or-ing with $20 then e.g. both ~ (tilde) and ^ (caret) will turn into ~.
So, your algo therefor assumes that ~~ is an angram of ^^.

Of course, I may be mistaken and my test (invertigating this) was wrong, in which case: I apologize.

Bart

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Contest: fastest IsAnagram function
« Reply #84 on: October 25, 2024, 10:37:08 am »
Accepted. will try to improve. It is a nice contest.
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

BrunoK

  • Hero Member
  • *****
  • Posts: 766
  • Retired programmer
Re: Contest: fastest IsAnagram function
« Reply #85 on: October 25, 2024, 11:54:34 am »
Corrected routine. Wrong invalid character detection on first string at line 61.
Replaced
 
Code: Pascal  [Select][+][-]
  1.    if j >= 1 then
  2.       Continue;
with
Code: Pascal  [Select][+][-]
  1.     if vpIndex<>@F1SI[0] then
  2.       Continue;

EDIT : Correction at 06:00 pm 25.10.24 related to same error in processing S2. Line 85
   

Code: Pascal  [Select][+][-]
  1. function IsAnagram_BrunoK(const S1, S2: String; IgnoreSpaces: Boolean = True;
  2.   ExceptionOnError: Boolean = False): Boolean;
  3.  
  4. const
  5.   cCold: boolean = True;          // WarmUp done -> cCold = False
  6.   cArIndexStr: shortstring = '';  // Holder for indexes by character
  7.  
  8.   cFlagArraySize = ((2 + 128 - 32 - (Ord('z') - Ord('a') + 1)) *
  9.     SizeOf(SmallInt) + (SizeOf(SizeInt) - 1) div
  10.     (SizeOf(SizeInt) div 2));
  11.   cFlagArraySizeInt = (cFlagArraySize * 2) div SizeOf(SizeInt);
  12. var
  13.   vcArIndex: array[0..High(Byte)] of byte absolute cArIndexStr;
  14.  
  15.   procedure WarmUp;
  16.   var
  17.     i, j: integer;
  18.   begin
  19.     FillDWord(vcArIndex, SizeOf(vcArIndex) div SizeOf(DWord), 0); // Init array
  20.     vcArIndex[0] := 0;   // Invalid character counter
  21.     vcArIndex[1] := 1;   // Dead characters
  22.     j := 2;
  23.     for i := Ord(' ') to 128 - 1 do
  24.       if (i < Ord('a')) or (i > Ord('z')) then begin
  25.         vcArIndex[i] := j;
  26.         Inc(j);
  27.       end
  28.       else
  29.         vcArIndex[i] := vcArIndex[i - (Ord('a') - Ord('A'))];
  30.     cCold := False;
  31.   end;
  32.  
  33.   procedure Error(const Where: string; AtPos: SizeInt); noreturn;
  34.   begin
  35.     raise ERangeError.CreateFmt('IsAnagram_bk: illegal character in %s at position %d',
  36.       [Where, AtPos]);
  37.   end;
  38. type
  39.   TFreq = array[0..cFlagArraySizeInt - 1] of SizeInt; // Trick from bart
  40.  
  41. var
  42.   F1: array[0..cFlagArraySizeInt - 1] of SizeInt;
  43.   F1SI: array[0..cFlagArraySize - 1] of SmallInt absolute F1;
  44.   i: integer;
  45.   j: integer;
  46.   vpIndex: PSmallInt;
  47.   pb, pe: PByte;
  48. begin
  49.   if cCold then
  50.     WarmUp;
  51.   F1 := Default(TFreq);  // Init array  // Sligth speed improvement
  52.   vcArIndex[Ord(' ')] := 2;
  53.   if IgnoreSpaces then
  54.     vcArIndex[Ord(' ')] := 1; // Send them to dead char
  55.  
  56.   { Increase counts for S1 }
  57.   { for i := 1 to Length(S1) do begin replace with while and pointers }
  58.   pb := PByte(S1);
  59.   pe := pb+Length(S1);
  60.   while pb<pe do begin   // Tiny improvement with pointers
  61.     // j := Ord(S1[i]);
  62.     j := pb^;
  63.     vpIndex := @F1SI[vcArIndex[j]];
  64.     Inc(vpIndex^);
  65.     if vpIndex<>@F1SI[0] then begin
  66.       inc(pb);
  67.       Continue;
  68.     end;
  69.     { Invalid charater }
  70.     Result := False;
  71.     if ExceptionOnError then
  72.       Error('S1', i);
  73.     Exit;
  74.   end;
  75.   F1SI[1] := High(SmallInt); // Do not fail due to deadchar's
  76.   { Decrease counts for S2 }
  77.   for i := 1 to Length(S2) do begin // No improvement with pointers
  78.     j := byte(S2[i]);
  79.     vpIndex := @F1SI[vcArIndex[j]];
  80.     Dec(vpIndex^);
  81.     if vpIndex^ >= 0 then  // All is well, processs next character
  82.       Continue;
  83.     { Invalid character or more counter become negative }
  84.     Result := False;
  85.     if ExceptionOnError and (vpIndex = @F1SI[0]) then
  86.       Error('S2', i);
  87.     Exit;
  88.   end;
  89.   F1SI[1] := 0;   // Ignore dead characters
  90.   for i := 0 to cFlagArraySizeInt - 1 do
  91.     if F1[i] <> 0 then
  92.       Exit(False);
  93.   Result := True;
  94. end;
                           
« Last Edit: October 25, 2024, 06:28:39 pm by BrunoK »

MathMan

  • Sr. Member
  • ****
  • Posts: 472
Re: Contest: fastest IsAnagram function
« Reply #86 on: October 25, 2024, 12:36:36 pm »
I fiddled a bit with my own solution and was about to post the new version. Now i see that it is nearly equivalent to the solution of BrunoK - only difference is that i precalculated the vcArIndex and put the positive/negative histogram into sub-functions.

So i withdraw my entry.

Cheers,
MathMan

silvercoder70

  • Full Member
  • ***
  • Posts: 200
    • Tim Coates
Re: Contest: fastest IsAnagram function
« Reply #87 on: October 25, 2024, 01:20:22 pm »
Take 2...
Code: Pascal  [Select][+][-]
  1. function IsAnagram_silvercoder70B(const S1, S2: String;
  2.                    IgnoreSpaces: Boolean = True;
  3.                    ExceptionOnError: Boolean = False): Boolean;
  4.  
  5.   function CharToByte(C: Char): Byte; inline;
  6.   const
  7.     UpperCase_A = Byte('A');
  8.     LowerCase_A = Byte('a');
  9.   begin
  10.     case C of
  11.        'A'..'Z':
  12.          Result := Byte(C) - UpperCase_A + LowerCase_A;
  13.      else
  14.        Result := Byte(C);
  15.      end;
  16.   end;
  17.  
  18. var
  19.   CharCount: array[Byte] of Integer;
  20.   i, j: Integer;
  21.   ByteIdx: Byte;
  22. begin
  23.   Result := False;
  24.   FillChar(CharCount, SizeOf(CharCount), #0);
  25.  
  26.   // update array based on S1...
  27.   for i := 1 to Length(S1) do
  28.   begin
  29.     if Byte(S1[i]) < 32 then
  30.     begin
  31.       if ExceptionOnError then
  32.         raise Exception.Create('Invalid character in S1: ' + S1[i]);
  33.       Exit;
  34.     end;
  35.     if (not IgnoreSpaces) or (S1[i] <> ' ') then
  36.       Inc(CharCount[CharToByte(S1[i])]);
  37.   end;
  38.  
  39.   // now checked elements in S2...
  40.   for j := 1 to Length(S2) do
  41.   begin
  42.     if Byte(S2[j]) < 32 then
  43.     begin
  44.       if ExceptionOnError then
  45.         raise Exception.Create('Invalid character in S2: ' + S2[j]);
  46.       Exit;
  47.     end;
  48.  
  49.     // Convert to lowercase and decrement count
  50.     if (not IgnoreSpaces) or (S2[j] <> ' ') then
  51.     begin
  52.       ByteIdx := CharToByte(S2[j]);
  53.       if CharCount[ByteIdx] = 0 then
  54.         Exit;
  55.       Dec(CharCount[ByteIdx]);
  56.     end;
  57.   end;
  58.  
  59.   for i := Low(Byte) to High(Byte) do
  60.   begin
  61.     if CharCount[i] > 0 then
  62.       Exit;
  63.   end;
  64.  
  65.   Result := True;
  66. end;              
  67.  
🔥 Pascal Isn’t Dead -> See What It Can Do: @silvercoder70 on YouTube

paweld

  • Hero Member
  • *****
  • Posts: 1568
Re: Contest: fastest IsAnagram function
« Reply #88 on: October 25, 2024, 02:06:57 pm »
Code: Pascal  [Select][+][-]
  1. function IsAnagram_paweld(const S1, S2: String; IgnoreSpaces: Boolean = True; ExceptionOnError: Boolean = False): Boolean;
  2. var
  3.   i: Integer;
  4.   carr: array [32..127] of Byte;
  5.   b: Byte;
  6. begin
  7.   Result := False;
  8.   FillChar(carr, SizeOf(carr), 0);
  9.  
  10.   for i := 1 to Length(S1) do
  11.   begin
  12.     b := ord(S1[i]);
  13.     case b of
  14.       32..64, 91..127: Inc(carr[b]);
  15.       65..90: Inc(carr[b + 32]);
  16.       else
  17.         if ExceptionOnError then
  18.           raise Exception.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  19.     end;
  20.   end;
  21.  
  22.   for i := 1 to Length(S2) do
  23.   begin
  24.     b := ord(S2[i]);
  25.     case b of
  26.       32..64, 91..127: Dec(carr[b]);
  27.       65..90: Dec(carr[b + 32]);
  28.       else
  29.         if ExceptionOnError then
  30.           raise Exception.CreateFmt('IsAnagram: illegal character in S1 at position %d',[i]);
  31.     end;
  32.   end;
  33.  
  34.   if IgnoreSpaces then
  35.     carr[32] := 0;
  36.  
  37.   for i := Low(carr) to High(carr) do
  38.   begin
  39.     if carr[i] <> 0 then
  40.       exit;
  41.   end;
  42.  
  43.   Result := True;
  44. end;  
Best regards / Pozdrawiam
paweld

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Contest: fastest IsAnagram function
« Reply #89 on: October 25, 2024, 02:07:23 pm »
Still slower here, why? I can see why your code should be faster but it isn't.
(and I didn't even try, yet)
« Last Edit: October 25, 2024, 02:23:56 pm by Thaddy »
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

 

TinyPortal © 2005-2018