and it seems that you did not even read my post in full
I carefully read all posts including yours.
But in your posts I see only assurances of your alleged superpower to think better than the compiler.
Perhaps you are such an optimizing superhero, but you haven't presented any evidence yet.
I'm not an optimization superhero, just a regular programmer with some experience in C/C++, Java, C#, Delphi/Lazarus, x86 asm, and so on.
But unlike you, I took the trouble to download scimark2 once again, and I even found the Delphi version for you.
scimark2 (Java & C):
http://math.nist.gov/scimark
scimark2 (Delphi port):
http://code.google.com/p/scimark-delphi/
It's pretty simple code, with no object-oriented approach. Everything is extremely simple and C-style.
Again, the algorithms and data structures are nearly identical across all the languages, precisely so you can see how each compiler handles them.
I even took the trouble to add the "inline" specifier to most functions in the Delphi sources and to replace Integer with NativeInt in some places (in Delphi this can speed up even simple loops a bit; for Lazarus, I don't know).
The code compiled by VC++2022 with optimization options /O2 and /GL gave these results:
Composite Score: 4902.81
FFT Mflops: 3969.82 (N=1024)
SOR Mflops: 3315.19 (100 x 100)
MonteCarlo: Mflops: 1832.32
Sparse matmult Mflops: 5375.93 (N=1000, nz=5000)
LU Mflops: 10020.80 (M=100, N=100)
The code compiled by Lazarus 4.0RC2 with the maximum optimization level -O4 and all Checks and Assertions turned off gave these results:
Composite Score MFlops: 2694,44
FFT Mflops: 2301,68 (N=1024)
SOR Mflops: 2054,35 (100 x 100)
MonteCarlo: Mflops: 404,23
Sparse matmult Mflops: 4503,95 (N=1000, nz=5000)
LU Mflops: 4207,98 (M=100, N=100)
As everyone can see, VC++ produced code that runs almost twice as fast as the code produced by Lazarus.
And Lazarus shows its worst result on the MonteCarlo benchmark: only 404,23 vs 1832.32 (VC++), about 4.5 times slower.
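For anyone who wants to check the arithmetic, here is a small C sketch that computes the VC++/Lazarus ratio per kernel. The scores are copied from the two listings above; the names kernel_score, SCORES and speed_ratio are just my own illustration and are not part of scimark2.

```c
#include <assert.h>
#include <stddef.h>

/* Mflops scores copied from the two result listings in this post. */
typedef struct {
    const char *name;
    double vc;       /* VC++2022, /O2 /GL */
    double lazarus;  /* Lazarus 4.0RC2, -O4 */
} kernel_score;

static const kernel_score SCORES[] = {
    { "Composite",       4902.81, 2694.44 },
    { "FFT",             3969.82, 2301.68 },
    { "SOR",             3315.19, 2054.35 },
    { "MonteCarlo",      1832.32,  404.23 },
    { "Sparse matmult",  5375.93, 4503.95 },
    { "LU",             10020.80, 4207.98 },
};

enum { SCORE_COUNT = sizeof SCORES / sizeof SCORES[0] };

/* How many times faster the VC++ build ran for kernel i. */
double speed_ratio(size_t i)
{
    assert(i < (size_t)SCORE_COUNT);
    return SCORES[i].vc / SCORES[i].lazarus;
}
```

Calling speed_ratio for each index gives roughly 1.82 for the composite score, 4.53 for MonteCarlo, and only about 1.19 for Sparse matmult, so the gap is far from uniform across kernels.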
I was curious why that is, so I took the trouble to look at the assembly code for the MonteCarlo benchmark.
In short, VC++ inlined most of the procedures even though it was not asked to.
Lazarus, by contrast, left half of the procedures marked as inline as ordinary calls (apparently deciding by their size).
In addition, VC++ reordered the operations in some procedures (for example, in Random_nextDouble), while Lazarus clearly did everything exactly as written in the source.
As a result, for this particular algorithm we get code that runs 4.5 times faster from essentially the same sources.
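To show why inlining matters so much for this kind of kernel, here is a minimal C sketch of the loop shape. It is NOT the scimark2 source: toy_rng, toy_next_double and toy_monte_carlo_pi are hypothetical names, and the generator is a plain LCG I swapped in for brevity instead of scimark2's Random. The point is only that a tiny helper is called twice per sample, so call overhead can dominate the loop if the compiler leaves it as a real call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in generator; not the scimark2 Random code. */
typedef struct {
    uint32_t state;
} toy_rng;

/* Tiny helper called twice per sample: a natural inlining candidate.
   "inline" is only a hint; the compiler is free to ignore it. */
static inline double toy_next_double(toy_rng *r)
{
    /* Simple 32-bit LCG step; unsigned overflow wraps by definition. */
    r->state = r->state * 1664525u + 1013904223u;
    /* Keep the top 24 bits and scale them into [0, 1). */
    return (double)(r->state >> 8) * (1.0 / 16777216.0);
}

/* Estimate pi by counting random points inside the unit quarter circle. */
double toy_monte_carlo_pi(int num_samples, uint32_t seed)
{
    toy_rng rng = { seed };
    int under_curve = 0;

    assert(num_samples > 0);
    for (int i = 0; i < num_samples; i++) {
        double x = toy_next_double(&rng);
        double y = toy_next_double(&rng);
        if (x * x + y * y <= 1.0)
            under_curve++;
    }
    return 4.0 * (double)under_curve / (double)num_samples;
}
```

If the helper stays a real call, every sample pays for two calls and returns and the generator state goes through memory on each one; once it is inlined, the optimizer can keep the state in a register and schedule the arithmetic across both draws. Whether a given compiler actually does that for the real scimark2 code is something you have to check in the generated assembly, as I did above.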
I already expect that you will once again tell us a fairy tale about how you will rewrite all this yourself (probably even in assembly language) and it will run even faster.
But it is much easier for us "humble programmers" (not fairy-tale optimization superheroes like you) to entrust such time-critical algorithms to compilers like VC++.
And believe me, my opinion is based not only on money, as you wrote.
Although in our solutions, signal processing time can also mean money for the clients.