Recent

Author Topic: Proposal to improve IsLeapYear and EndOfAMonth-Functions  (Read 3735 times)

rvk

  • Hero Member
  • *****
  • Posts: 6208
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #30 on: February 28, 2024, 10:09:44 am »
rvk, yeah, i finally got the gist of how it works, because you enter the "mod 16" branch only on an even "century" (1600, 1700, 1800 etc.)
and mod 16 can be "translated" to a bitwise And
btw: What's the final conclusion?
At the least... mod 16 should be slightly (very slightly) faster than mod 400.
But I'm not sure if mod 16 will be optimized by the compiler.

and since i don't have access to the trunk/main source-code: could someone check how IsLeapyear is implemented there?
Because to me it's not making any sense, why with FPC3.2.2 "IsLeapYearV1/2" is faster than with the original in trunk
As stated in the last test project on this page:
Code: Pascal  [Select][+][-]
  1. function IsLeapYear(Year: Word): boolean;
  2. begin
  3.   Result := (Year mod 4 = 0) and ((Year mod 100 <> 0) or (Year mod 400 = 0));
  4. end;

BUT... that test project seems to have some big flaws.
It has a loop to high(testyears) but inside the loop does Func(testyears+3).
That's going to hit an invalid array index which throws off everything.

Making a correct test project is very important.
It shouldn't be dependent on random per function (so the last project does use an array to stuff the random years).

Josh

  • Hero Member
  • *****
  • Posts: 1305
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #31 on: February 28, 2024, 10:20:13 am »
Hi
Code: Pascal  [Select][+][-]
  1. Func(testyears[i]+3);
  2.     Func(testyears[i]-1);

just does the same test with 3 added to year,and again with -1 subtracted from year. does not change the array index
The best way to get accurate information on the forum is to post something wrong and wait for corrections.

rvk

  • Hero Member
  • *****
  • Posts: 6208
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #32 on: February 28, 2024, 10:22:15 am »
Hi
Code: Pascal  [Select][+][-]
  1. Func(testyears[i]+3);
  2.     Func(testyears[i]-1);

just does the same test with 3 added to year,and again with -1 subtracted from year. does not change the array index
Woops, sorry. I mist a bracket  :-[
(Thought the +3 was inside ;) )

I was thrown because you began the loop with 1 instead of low or 0.
« Last Edit: February 28, 2024, 10:24:21 am by rvk »

Josh

  • Hero Member
  • *****
  • Posts: 1305
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #33 on: February 28, 2024, 10:52:47 am »
oops should have done that.

do you get similar results using moded test prog to what i got, ie little to no difference on a 600 Million test?
The best way to get accurate information on the forum is to post something wrong and wait for corrections.

rvk

  • Hero Member
  • *****
  • Posts: 6208
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #34 on: February 28, 2024, 11:22:18 am »
do you get similar results using moded test prog to what i got, ie little to no difference on a 600 Million test?
My trunk is in the shop for a complete rebuild  ;)

While running that in the background I did a test with FPC 3.2.2.
I also did a test after that rebuild.

Free Pascal Compiler version 3.2.2 [2023/12/17] for x86_64

first try with running something on background:
Quote
LEAP YEAR TESTS...
  Loop only: 1.453 s
  FPC:       10.812 s, corrected: 9.359 s (100%)
  Variant 1: 13.719 s, corrected: 12.266 s (131%)
  Variant 2: 13.630 s, corrected: 12.177 s (130%)
second try without background process:
Quote
LEAP YEAR TESTS...
  Loop only: 1.158 s
  FPC:       9.280 s, corrected: 8.122 s (100%)
  Variant 1: 11.141 s, corrected: 9.983 s (123%)
  Variant 2: 11.358 s, corrected: 10.200 s (126%)

I find that very strange... variant 2 is slower without background noise (and faster with).

Then... trunk... and woopsie. @Thaddy... I got it again.
(see image)
Now to see how I report this and then make an exclusion.

Edit: Ok. Trunk with exclusion from virusscanner:
Quote
LEAP YEAR TESTS...
  Loop only: 1.250 s
  FPC:       3.074 s, corrected: 1.824 s (100%)
  Variant 1: 2.820 s, corrected: 1.570 s ( 86%)
  Variant 2: 2.812 s, corrected: 1.562 s ( 86%)
FPC 3.2.2 again with exclusion from virusscanner:
Quote
LEAP YEAR TESTS...
  Loop only: 1.196 s
  FPC:       9.819 s, corrected: 8.623 s (100%)
  Variant 1: 11.737 s, corrected: 10.541 s (122%)
  Variant 2: 12.010 s, corrected: 10.814 s (125%)
Not sure if with default distributed FPC 3.2.2 there is something different with optimization.
(I compiled both without any special options)
« Last Edit: February 28, 2024, 11:28:05 am by rvk »

rvk

  • Hero Member
  • *****
  • Posts: 6208
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #35 on: February 28, 2024, 11:41:12 am »
With -O4 FPC 3.2.2 gives:

Quote
LEAP YEAR TESTS...
  Loop only: 0.862 s
  FPC:       8.303 s, corrected: 7.441 s (100%)
  Variant 1: 12.405 s, corrected: 11.543 s (155%)
  Variant 2: 10.538 s, corrected: 9.676 s (130%)
 
My trunk with -O4:

Quote
LEAP YEAR TESTS...
  Loop only: 0.849 s
  FPC:       2.008 s, corrected: 1.159 s (100%)
  Variant 1: 1.011 s, corrected: 0.162 s ( 14%)
  Variant 2: 0.954 s, corrected: 0.105 s (  9%)

Over 9 seconds vs 0,105 seconds... wow. There must be something else at play here :)

Or has FPC gone though a major speed improvement ;)

cdbc

  • Hero Member
  • *****
  • Posts: 1172
    • http://www.cdbc.dk
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #36 on: February 28, 2024, 11:46:42 am »
Hi
Quote
Or has FPC gone though a major speed improvement ;)
I noticed the same in my post in this thread...#7  %)
The factor is _big_!
ps.: my tests was compiled on the commandline with just: "fpc project1.pp", for both trunk & 3.2.2
Regards Benny
« Last Edit: February 28, 2024, 11:50:15 am by cdbc »
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 2.2.6 up until Jan 2024 from then on it's: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 3.0

Zvoni

  • Hero Member
  • *****
  • Posts: 2374
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #37 on: February 28, 2024, 11:56:07 am »
With -O4 FPC 3.2.2 gives:

Quote
LEAP YEAR TESTS...
  Loop only: 0.862 s
  FPC:       8.303 s, corrected: 7.441 s (100%)
  Variant 1: 12.405 s, corrected: 11.543 s (155%)
  Variant 2: 10.538 s, corrected: 9.676 s (130%)
 
My trunk with -O4:

Quote
LEAP YEAR TESTS...
  Loop only: 0.849 s
  FPC:       2.008 s, corrected: 1.159 s (100%)
  Variant 1: 1.011 s, corrected: 0.162 s ( 14%)
  Variant 2: 0.954 s, corrected: 0.105 s (  9%)

Over 9 seconds vs 0,105 seconds... wow. There must be something else at play here :)

Or has FPC gone though a major speed improvement ;)

I think we need more experienced people here (maybe PascalDragon) :D

The thing is: I did a speedtest on Excel/VBA 64-Bit (to satisfy my curiosity), and yeah, i know: not compiled code
Same algorithm ("classic" and "improved") with 200M iterations
and the "improved" version came out on top (though not with that big a difference)
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Thaddy

  • Hero Member
  • *****
  • Posts: 14657
  • Sensorship about opinions does not belong here.
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #38 on: February 28, 2024, 12:08:18 pm »
Note year mod 4 can be optimized to year and 3 since it is a power of two. Same year mod 16 = year and 15
(Maybe the compiler already does that)
« Last Edit: February 28, 2024, 12:13:50 pm by Thaddy »
bitrate is always calculated like this:sample rate * bitdepth * number of channels.

cdbc

  • Hero Member
  • *****
  • Posts: 1172
    • http://www.cdbc.dk
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #39 on: February 28, 2024, 12:55:37 pm »
Hi
@Thaddy:
Quote
(Maybe the compiler already does that)
Would that account for the huge speed improvement I/We see between 3.2.2 and trunk 3.3.1?!?
My trunk was compiled on 29.01.2024 and it's absolutely staying  :D
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 2.2.6 up until Jan 2024 from then on it's: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 3.0

Thaddy

  • Hero Member
  • *****
  • Posts: 14657
  • Sensorship about opinions does not belong here.
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #40 on: February 28, 2024, 01:16:05 pm »
In trunk there are a lot of such optimizations, but I believe the mod > and optimization is much older.
I just looked and trunk indeed substitutes mod with and for modulo power of 2.

And such subsitution is indeed much faster than mod on most cpu's.
« Last Edit: February 28, 2024, 01:18:50 pm by Thaddy »
bitrate is always calculated like this:sample rate * bitdepth * number of channels.

Zvoni

  • Hero Member
  • *****
  • Posts: 2374
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #41 on: February 28, 2024, 01:57:20 pm »
In trunk there are a lot of such optimizations, but I believe the mod > and optimization is much older.
I just looked and trunk indeed substitutes mod with and for modulo power of 2.

And such subsitution is indeed much faster than mod on most cpu's.
So, we're all agreed, the "rewrite" of IsLeapYear to first check mod 100 and then branch to mod 4 resp. mod 16 is actually an improvement?
Besides the fact, that we cannot prove it consistently?

Because as i said earlier: The Performance of the new "MonthDays" doesn't make sense, since it is using IsLeapYear.

Or is the algorithm of the new MonthDays so fast, it compensates the (still not understood) lower performance of the new IsLeapYear?
« Last Edit: February 28, 2024, 02:27:44 pm by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Zvoni

  • Hero Member
  • *****
  • Posts: 2374
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #42 on: February 28, 2024, 03:32:40 pm »
Hmmm.....i was just reading up on Branch prediction (also as mentioned in the video - see link in post 1)

Let's take a Timespan of 400 consecutive years (1600-1999)

In the "original" FPC-Version, in 75% of executions (300 execs) the left operand of the "and" short-circuits the expression (right hand side doesn't get executed)
Meaning: in 25% of executions (100 execs), the CPU has to rollback the pipeline, because it took a wrong turn.
And within those 25% there is another Branch the CPU is going to predict correct in 96% cases, but i think that one is negligible
(the left operand of the "or" returns 96 times "True" within a 400-year timespan, because it only gets checked 100 times)

How "expensive" is such a rollback? I'm aware it depends on the CPU and its capabilities, but more in general terms?

Because (irrespective of the "huge" differences in the testresults above), with the "improved" IsLeapYear the CPU takes a wrong turn only in 1% of cases (99% of times the CPU always predicts the correct branch without a lingering second branch nested there)

Question to those who have tested here: Would it be possible for you to test on different machines/architectures (ARM, MAC, different OS etc.)?
Of course, only if possible
« Last Edit: February 28, 2024, 03:42:10 pm by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

BrunoK

  • Sr. Member
  • ****
  • Posts: 452
  • Retired programmer
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #43 on: February 28, 2024, 05:49:58 pm »
I have the impression that the benchmark is biased due to the fact that IsLeapYear used is called in the PPU's that were compiled with FPC and, hence, the code generated and timings for the benchmark base IsLeapYear does not match the optimization of the revised alternatives.

Also having the RANDOM generation inside the benchmark loop is likely to have some influence (how large ?) on the timing results.

I tested :
Code: Pascal  [Select][+][-]
  1. function IsLeapYear3(Year: word): boolean;
  2. begin
  3.   If Year mod 100<>0 Then
  4.      Result:=(Year and 3)=0
  5.   Else
  6.      Result:=(Year and 15)=0;
  7. end;
that seems to be faster.

Attached in zip wp's program with pre initialization of Random values and copy of IsLeapYear to put the leap year functions on the same compiler optimization level.

Thaddy

  • Hero Member
  • *****
  • Posts: 14657
  • Sensorship about opinions does not belong here.
Re: Proposal to improve IsLeapYear and EndOfAMonth-Functions
« Reply #44 on: February 28, 2024, 06:08:16 pm »
Also having the RANDOM generation inside the benchmark loop is likely to have some influence (how large ?) on the timing results.
Did they really do that? I missed that gotspe.
Also nobody is testing the compiler, just Intel cpu targeted code..... which is even worse, Test the compiler, and for multiple cpu's. Not just Intel/AMD64
« Last Edit: February 28, 2024, 06:16:09 pm by Thaddy »
bitrate is always calculated like this:sample rate * bitdepth * number of channels.

 

TinyPortal © 2005-2018