Lazarus

Programming => General => Topic started by: Akira1364 on March 08, 2019, 12:32:41 am

Title: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Akira1364 on March 08, 2019, 12:32:41 am
https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html

I made a modified version of the multi-threaded version originally posted in this thread by Nitorami:
http://forum.lazarus.freepascal.org/index.php?topic=39935.0

and submitted it after clarifying with them that it would be ok to do so.

Edit: as I've stated in another comment below, I came up with a further-modified version after this one and sent it in, and FPC is now in first place!
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: dbannon on March 08, 2019, 02:12:39 am
Way, way up indeed!

Nice one!

 :D

Davo
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Thaddy on March 08, 2019, 09:26:02 am
Looking at the cores, there is still a bit of room to improve further. Good job. Have you tried installing another memory manager as well? Also note there are some nice x86_64 optimizations in trunk.
I have a feeling that many more "competition" examples for FPC are less than optimal.
Again: good job!
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: silvestre on March 08, 2019, 11:15:00 am
¡Freepascal power ;) !

It would be great to add the latest optimizations in compilation... The first position is very near! Great work and good press for this community.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: marcov on March 08, 2019, 11:21:51 am
Quote
Looking at the cores, there is still a bit of room to improve further. Good job. Have you tried installing another memory manager as well? Also note there are some nice x86_64 optimizations in trunk.
I have a feeling that many more "competition" examples for FPC are less than optimal.
Again: good job!

Most of the work was, afaik, done by e.g. neli and some others before they switched it to allow explicit threading.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: BrunoK on March 08, 2019, 11:43:16 am
I tried to run the code as published at https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/binarytrees-fpascal-3.html

It gives me:
Code: Pascal  [Select]
Project BinaryTrees raised exception class 'External: SIGSEGV'.

In file 'BinaryTrees.pas' at line 66:
MakeTree(Depth, MP), MP),

Linux:
  DISTRIB_DESCRIPTION="Linux Mint 19.1 Tessa"

Windows gives the same error.

What could be wrong with my installation?
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Thaddy on March 08, 2019, 11:51:24 am
Mode?.... <that's a bit of a sigh; I may change this to grumpy mode >:D when that is the case....>
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: BrunoK on March 08, 2019, 12:14:12 pm
@Grumpy.

Did you actually compile and run the program?

If mode delphi is not specified, it doesn't compile because of the AdvancedRecords usage.

Still waiting for the FreeAndNil badness article.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Akira1364 on March 08, 2019, 02:54:08 pm
Quote
If mode delphi is not specified, it doesn't compile because of the AdvancedRecords usage.

I originally had

Code: Pascal  [Select]
{$mode ObjFPC}
{$modeswitch AdvancedRecords}

in there. The Benchmarks game maintainer guy doesn't like in-source compiler directives for some reason though, so he changed it to use

Code: [Select]
-Mdelphi
instead. The -Mdelphi switch is visible in the output at the bottom of the page, however.

What command line argument were you using with it to indicate

Code: [Select]
MaxDepth
though?

Quote
It would be great to add the latest optimizations in compilation...

I agree, but I doubt it would be possible to convince the maintainer to switch to an arbitrary FPC trunk build. So hopefully the next "stable" version comes out soonish...
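
Going back to the mode and MaxDepth questions above, here is a minimal sketch (not the submitted program) of how the in-source directives and the depth argument fit together. TNode, DepthFromArg and the fallback value of 10 are made up for illustration; only the two directives come from what I originally had.

```pascal
program ModeDemo;

{$mode ObjFPC}
{$modeswitch AdvancedRecords}
{$H+}

uses
  SysUtils;

type
  PNode = ^TNode;
  // A record with a method: in ObjFPC mode this needs the AdvancedRecords
  // modeswitch; compiling with -Mdelphi enables it without the directive.
  TNode = record
    Left, Right: PNode;
    function IsLeaf: Boolean;
  end;

function TNode.IsLeaf: Boolean;
begin
  Result := (Left = nil) and (Right = nil);
end;

// Hypothetical helper: turn the command line argument into a depth,
// using Fallback when the text is not a valid integer.
function DepthFromArg(const Arg: String; Fallback: Integer): Integer;
begin
  Result := StrToIntDef(Arg, Fallback);
end;

var
  MaxDepth: Integer;
  Leaf: TNode;
begin
  // Without an argument the fallback is used, so a run with no parameter
  // behaves differently from a run with e.g. 21.
  if ParamCount >= 1 then
    MaxDepth := DepthFromArg(ParamStr(1), 10)
  else
    MaxDepth := 10;

  Leaf.Left := nil;
  Leaf.Right := nil;
  WriteLn('MaxDepth = ', MaxDepth, ', is leaf = ', Leaf.IsLeaf);
end.
```

Running it as "ModeDemo 21" passes the depth explicitly; leaving the argument off silently uses the fallback, which is easy to miss.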
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: BrunoK on March 08, 2019, 04:08:32 pm
@Akira1364 02:54:08 pm
Thank you. I had missed the parameter 21 (ParamCount), so it used the default of 32.

Interesting "benchmark".

A case where Linux seems much faster than Windows.

As is generally the case, -O1 + -OoREGVAR seems to be a satisfactory optimization compromise.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Akira1364 on March 08, 2019, 04:17:26 pm
Quote
A case where Linux seems much faster than Windows.

I actually consistently get around 1.8 seconds or so with an input of 21 on both Linux and Windows, using the same hardware (Haswell i7-4770k CPU.) This is with native 64-bit FPC for both, to be clear.

Note also that the machine the benchmarks run on for the site is somewhat outdated... (Core 2 Q6600 CPU.)
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Nitorami on March 08, 2019, 05:38:32 pm
@Akira: Thanks for submitting this, I guess I would have dragged it on forever...

A note on the inflationary use of the inline modifier: it really only makes sense for very short leaf functions. Certainly not for MakeTree, which is used recursively and cannot be inlined in the first place.

The benefit of inlining on modern processors is dubious to me anyway. Sometimes it even makes code slower; this probably has to do with how the CPU optimises the program flow.

@Thaddy
Quote
I have a feeling that many more "competition" examples for FPC are less than optimal.

Not sure. I have tinkered with a few, and could not do a lot. An example is fannkuch-redux, where I might shave off 10-20%. The problem is that the effect of changes is often not predictable; it's trial and error, and the result may be different on different systems.
The core of fannkuch is the flip function. This is another example where you get faster code without inlining. Putting some local variables on the heap rather than the stack also results in a small but distinct improvement, but only on win64; on win32 the effect is exactly the opposite. On Linux, it may again be different (I cannot test this).
Thus, even after a lot of manual optimization, we cannot really tell how the code will perform on a different target system. The gcc compiler may simply be better at producing the best code on each specific system... That said, I am talking about really small effects here, 10-20% up or down, not too far from the performance of gcc, and irrelevant for a real-world program.

What's really poor is FPC's performance in the Mandelbrot benchmark: a factor of ten slower than gcc, and also much slower than many others: Rust, Swift, C#, .NET, Java, even LISP and Haskell. And it cannot be improved (at least I did not manage it), simply because FPC lacks support for vector processing, and gives away a factor of 8 in performance.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Akira1364 on March 08, 2019, 05:59:17 pm
Quote
@Akira: Thanks for submitting this, I guess I would have dragged it on forever...

No problem!

Quote
A note on the inflationary use of the inline modifier: it really only makes sense for very short leaf functions. Certainly not for MakeTree, which is used recursively and cannot be inlined in the first place.

People say this often, but I'm not sure I really agree in many cases. The thing to keep in mind is that inlining does not actually mean just literally copying and pasting the assembler generated for the function body into different places.

In my experience, a lot of the time (perhaps most of the time), an inlined function looks absolutely nothing like what was generated for the standalone body as it gets optimized in the particular context (i.e. combined with what surrounds it and such.)

I've found that this often not only improves speed but also results in smaller, not larger, executables overall.

That said, about the MakeTree thing, yeah, I noticed it was pointless after I submitted it but it doesn't really matter either way as it just does nothing.

Quote
The benefit of inlining on modern processors is dubious to me anyway.

I really don't think it is. IMO FPC is way too conservative about it, if you compare what it typically generates (or even attempts to generate) with regard to inlining to just about any other "native code" compiler.

Especially the fact that "compilerprocs" (even ones not written in assembly) are never ever inlined is rather less than ideal, in my mind.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Akira1364 on March 08, 2019, 09:18:31 pm
Ok, so, FPC is now number one! I came up with another revised version, and submitted it earlier today.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html

One thing I noticed when making this second version, also: recursive functions are in fact inlinable, and can even inline themselves.
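
To illustrate what I mean, here is a small sketch (not the benchmark code; Twice and CountNodes are made-up names). It only shows that the compiler accepts the inline modifier on a recursive function; whether any given call actually gets inlined is still up to the compiler and the optimization settings.

```pascal
program InlineDemo;

{$mode ObjFPC}
{$inline on}

// Short leaf function: the usual candidate for inlining.
function Twice(X: Integer): Integer; inline;
begin
  Result := X * 2;
end;

// Recursive function that is also marked inline. The modifier is a hint,
// so the compiler decides per call site whether to inline or emit a call.
// Returns the number of nodes in a full binary tree of the given depth.
function CountNodes(Depth: Integer): Integer; inline;
begin
  if Depth = 0 then
    Result := 1
  else
    Result := 1 + CountNodes(Depth - 1) + CountNodes(Depth - 1);
end;

begin
  WriteLn(Twice(21));     // 42
  WriteLn(CountNodes(3)); // 15 nodes for depth 3
end.
```

Comparing the assembler output (e.g. with -al) with and without the inline modifier is the only reliable way to see what the compiler actually did.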
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: howardpc on March 08, 2019, 09:35:41 pm
Kudos to Akira1364 and FPC developers!
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: marcov on March 08, 2019, 09:37:33 pm
Congratulations :_)
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: silvestre on March 08, 2019, 09:59:19 pm
Great merit. Congratulations to you and to all those who make Freepascal+lazarus possible! :o :o :o :)


Quote
Ok, so, FPC is now number one! I came up with another revised version, and submitted it earlier today.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: ASBzone on March 08, 2019, 10:52:26 pm
Quote
Kudos to Akira1364 and FPC developers!

 ;D
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Nitorami on March 09, 2019, 09:11:41 am
It moved to place 5. Did they re-run the test?
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Thaddy on March 09, 2019, 09:28:46 am
Quote
@Grumpy.

Did you actually compile and run the program?

Of course I did, hence I knew it was the mode..... Bad eyes?
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Thaddy on March 09, 2019, 09:33:58 am
Well, the changes did improve it, but the cores are less stressed now (compared to yesterday). So there's a lot of room for improvement: each core should reach close to 100%.
32% 82% 91% 92%? That's the lowest single-core load (32%) among the first 10 at least, not the average load... It means there's really a lot to gain.
Also: compared to yesterday the memory use has gone up, which suggests a memory manager (MM) change? Pity there's no way to see what caused this anomaly. Yesterday one of the cores ran at 100%, I believe.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Akira1364 on March 09, 2019, 05:48:52 pm
It was my fault, I was playing around with submitting some more alternate versions last night.

I had him put the original first-place FPC one back up now along with the other one, and they're both at the very top again. So we should be ok, haha.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html

I've learned from Bero (author of PasMP) that the particular way I'm using the ParallelFor callback is likely to be unstable though, so I'm going to try to make one that is still the fastest, but will also not fluctuate as much every time the benchmark is run. Probably not for a few days though.
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: Thaddy on March 10, 2019, 10:29:06 am
Nice to see it is currently the very top...
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: BeniBela on March 10, 2019, 11:48:53 am
Now optimize all the other scores (https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/fpascal-gpp.html)
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: silvestre on March 10, 2019, 01:32:23 pm
For example, Fasta is at position 52!

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/fasta.html

Quote
Now optimize all the other scores (https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/fpascal-gpp.html)
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: VTwin on March 11, 2019, 07:52:52 pm
Very cool!
Title: Re: FPC's Binary Trees score for "Benchmarks Game" just went way up! :)
Post by: igouy on March 14, 2019, 04:07:30 pm
Quote
Now optimize all the other scores (https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/fpascal-gpp.html)

That's something you can try to do yourself.