That is why I am asking someone to test it under Linux. Windows does not use unit threads.
Can you please move mtprocs to the end of the uses clause and try again?
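For reference, the ordering I mean would look roughly like the sketch below. The program name and the SysUtils/Classes entries are placeholders, since the actual uses clause is not shown in this thread, and I am assuming the threading unit in question is cthreads, which Free Pascal expects near the start of the uses clause on Unix targets.

```pascal
program uses_order_sketch; { hypothetical program name }

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}
  cthreads,          { Unix thread manager; skipped on Windows }
  {$ENDIF}
  SysUtils, Classes, { placeholders for whatever units the program already uses }
  mtprocs;           { moved to the end of the uses clause }

begin
  { Nothing is computed here; the sketch only shows the unit order. }
  WriteLn('units initialised');
end.
```

Compiling this needs the multithreadprocs package on the unit search path; whether the reordering changes the timing is exactly what needs testing.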
3968050
Pfannkuchen(12) = 65
Time : 9.239
64-bit Manjaro Linux, kernel 4.20.15, ASUS ROG GL503VD i7-7700HQ 16GB DDR4 2400MHz.
A few days ago, Akira1346 managed to bring fpc to the top of the scores in the Binary Trees benchmark, using the PasMP multiprocessing library. I might try it, but I doubt it would help for fannkuch, which merely runs a few static threads over the entire program runtime.
I don't think that this will help, because PasMP provides locking methods which are useful when several threads are working concurrently on the same object. But it seems this is not the case in your example...
So I guess the reason for the lower performance is ultimately fpc's code generation / optimization as such. It is probably a bit more on the conservative side than C.
I guess it's because OpenMP generates vector asm code and other things for the for-loops which are not supported in FPC, or at least not in FPC 3.0.4, the version they used in the benchmarks.