Recent

Author Topic: DivMod optimization  (Read 2577 times)

gidesa

  • Full Member
  • ***
  • Posts: 156
DivMod optimization
« on: January 04, 2025, 12:26:46 pm »
Hello,
seems that DivMod is not optimized against a constant divisor, on contrary separate Div and Mod are optimized.
See the following code (from this interesting site: https://github.com/jabbalaci/SpeedTests).
Using DivMod on my machine performance is around 3 times slower than using separate Div and Mod (around 6 sec).

EDIT: indeed the difference is on FPC 3.2.2 and 3.3.1 using UInt32; using Integer there is no difference
Code: Pascal  [Select][+][-]
  1. program testMunchausen;
  2. uses math;
  3.  
  4. procedure CalcCache(var cache: array of Uint32);
  5. var
  6.   i: Uint32;
  7. begin
  8.   cache[0] := 0;
  9.   for i := 1 to 9 do
  10.   begin
  11.     cache[i] := i**i;
  12.   end;
  13.  
  14. end;
  15.  
  16. function IsMunchausen(constref cache: array of Uint32; constref num: Uint32): Boolean;
  17. var
  18.   n, total, digit: Uint32;
  19. begin
  20.   n := num;
  21.   total := 0;
  22.   while n > 0 do
  23.     begin
  24.       //DivMod(n, 10, n, digit);
  25.       // splitted in 2 separate operations
  26.       digit := n MOD 10;
  27.       n:= n DIV 10;
  28.       total += cache[digit];
  29.       if total > num then
  30.         break;
  31.     end;
  32.   IsMunchausen := num = total;
  33. end;
  34.  
  35. var
  36.   cache: array[0..9] of Uint32;
  37.   i: Uint32;
  38. const
  39.   MAX = 440000000;
  40. begin
  41.   CalcCache(cache);
  42.   for i := 0 to MAX do
  43.     if IsMunchausen(cache, i) then
  44.       WriteLn(i);
  45. end.                                                        
  46.  
« Last Edit: January 04, 2025, 01:28:26 pm by gidesa »

ALLIGATOR

  • New Member
  • *
  • Posts: 49
Re: DivMod optimization
« Reply #1 on: January 04, 2025, 02:22:11 pm »
Try this, it's how I was able to maximize optimization

Code: Pascal  [Select][+][-]
  1. function IsMunchausen(const cache: array of LongInt; num: UInt32): Boolean;
  2. {$CODEALIGN JUMP=128}
  3. var
  4.   n, tn, total, digit: UInt32;
  5. begin
  6.   n := num;
  7.   total := 0;
  8.  
  9.   while n > 0 do
  10.     begin
  11.       tn:=n div 10;
  12.       digit:=n-tn*10;
  13.       n:=tn;
  14.  
  15.       total += cache[digit];
  16.       if total > num then
  17.         break;
  18.     end;
  19.   IsMunchausen := num = total;
  20. end;

gidesa

  • Full Member
  • ***
  • Posts: 156
Re: DivMod optimization
« Reply #2 on: January 04, 2025, 02:45:53 pm »
Thank you. With your function the program is slightly faster, around 5,5 sec (Win 11, AMD Ryzen 7 2700).
I use your suggestion to modify  original program.
Now all variables are type Integer, and in function I don't use MOD:
Code: Pascal  [Select][+][-]
  1. function IsMunchausen(constref cache: array of integer; constref num: integer): Boolean;
  2. var
  3.   n, total, digit: integer;
  4.   tn: integer;
  5. begin
  6.   n := num;
  7.   total := 0;
  8.   while n > 0 do
  9.     begin
  10. //      DivMod(n, 10, n, digit);  
  11. //      digit := n MOD 10;
  12.       tn:= n DIV 10;
  13.       digit:=n-tn*10;
  14.       n:=tn;
  15.       total += cache[digit];
  16.       if total > num then
  17.         break;
  18. //      n:= n DIV 10;
  19.     end;
  20.   IsMunchausen := num = total;
  21. end;                                
  22.  

Now time is 6,5 -7 sec, much better than original version with Integer (17 sec), although a bit worst that UInt32 version.
So seems that DivMod and Mod are not always optimized.

ALLIGATOR

  • New Member
  • *
  • Posts: 49
Re: DivMod optimization
« Reply #3 on: January 05, 2025, 08:23:34 am »
Note the results for the Go language - there are two variants shown there with Int32 and UInt32

gidesa

  • Full Member
  • ***
  • Posts: 156
Re: DivMod optimization
« Reply #4 on: January 05, 2025, 02:27:14 pm »
I think the initial idea for that benchmark is to compare languages using the most simple and straightforward implementation, with no special tricks, using the C source as reference. And using different compilers, if avalaible.
But you are right, for Go, Haskell, and others, there are multiple source version.
I have open an issue, asking to update the Freepascal original source. Or add a second version.
Your suggestions are very useful, the new source is more than 3 times faster.
Sadly remains the fact that DivMod and Mod are not very optimized. Here others compilers do a better work.

ALLIGATOR

  • New Member
  • *
  • Posts: 49
Re: DivMod optimization
« Reply #5 on: January 05, 2025, 02:39:27 pm »
You are not quite right - div and mod are optimized, but separately, and DivMod is a normal function, it is optimized compared to two separate actions of dividing and finding the remainder from dividing by a value in a variable (not a constant)

Also you can always do it like this:
 :)
Code: Pascal  [Select][+][-]
  1. program test;
  2. {$mode objfpc}
  3.  
  4. uses math;
  5.  
  6. procedure DivMod_(Dividend: LongInt; Divisor: LongInt; var Result, Remainder: LongInt); inline;
  7. begin
  8.   if IsConstValue(Divisor) then
  9.   begin
  10.     Result:=Dividend div Divisor;
  11.     Remainder:=Dividend mod Divisor;
  12.   end else
  13.   begin
  14.     DivMod(Dividend, Divisor, Result, Remainder);
  15.   end;
  16. end;
  17.  
  18. var
  19.   a,b,c: LongInt;
  20.  
  21. begin
  22.   a:=333+Random(0);
  23.   DivMod_(a, 33, b, c);
  24.   WriteLn(a,',',b);
  25.   ReadLn;
  26. end.

Laksen

  • Hero Member
  • *****
  • Posts: 787
    • J-Software
Re: DivMod optimization
« Reply #6 on: January 05, 2025, 03:03:11 pm »
As a microoptimization you can remove constref from num

gidesa

  • Full Member
  • ***
  • Posts: 156
Re: DivMod optimization
« Reply #7 on: January 05, 2025, 03:26:20 pm »
You are not quite right - div and mod are optimized, but separately, and DivMod is a normal function, it is optimized compared to two separate actions of dividing and finding the remainder from dividing by a value in a variable (not a constant)

Indeed if we simply don't use DivMod and Mod, as in my IsMunchausen() last version, performance are 3 times better!  So Mod seems not optimized in respect of dividing for a constant, where Div is optimized. For DivMod really I don't know was the developers target.
In my opinion, in this simple test, we are only comparing "native" language/compiler optimization performances.
Of course it's always possible to rewrite functions, add tricks specific of a language, etc., for every compared language/compiler.

gidesa

  • Full Member
  • ***
  • Posts: 156
Re: DivMod optimization
« Reply #8 on: January 05, 2025, 03:39:50 pm »
As a microoptimization you can remove constref from num

Thank you, there is another significative improvement.

 

TinyPortal © 2005-2018