
Topic: Integer Performance Test: Delphi 10.2 Tokyo outperforms Visual C++ and Visual C#

srcstorm

As you know, Delphi 10.2 Tokyo was released recently. An author at codingforspeed.com had a pre-release version of Delphi and published the results of an integer performance comparison test some time ago:
https://codingforspeed.com/integer-performance-comparison-for-c-c-delphi

According to this test, the latest version of Delphi performs better than Visual C++ on the Win64 target. The test measures both single-thread and multi-thread performance, and Delphi has a significant lead in both.

Since then the official release has been announced, and I installed the Starter Edition. The source code of this test is on GitHub, so I decided to give it a try. Meanwhile, Visual Studio 2017 has also been released, so now we can see whether C++ 2017 brings any improvements.

Test setup:
RAD Studio 10.2 Starter, Delphi Version 25.0.26309.314
>> Supports only Win32 target.
Visual Studio Community 2017, Version 15.0.0+26228.9, .NET Framework Version 4.6.01586
>> Supports Win32 and Win64 targets for C++, only Any CPU for C#.
Lazarus 1.6.4 64-bit, SVN Revision 54278
>> Supports Win64 target, Win32 target needs add-on.

Desktop PC:
Windows 10 64-bit, Build 1607
AMD A10-7860K@3.8 GHz
16 GB 2133 MHz DDR3 RAM


I tried to use the latest versions of the development environments. On the particular CPU I am using, I got these results:

--- Any CPU ---
C# Serial 11344
C# Parallel 2921

--- Win32 ---
C++ Serial 11656
C++ Parallel 2937, 2984
Delphi Serial 11406
Delphi Parallel 3094, 2890, 2969
Lazarus Serial 11750
Lazarus Parallel 4656

--- Win64 ---
C++ Serial 11687
C++ Parallel 2922, 2969
Delphi Trial Serial 11203
Delphi Trial Parallel 3157, 2828, 2922
Lazarus Serial 11359
Lazarus Parallel 4250

Conclusion:
In Delphi, there were 3 different methods to implement concurrency for the tested calculation. In C++, the author came up with 2 methods. Although Delphi Starter doesn't support the Win64 target, the Win32 test alone showed a very promising result. Lazarus is also following Delphi closely. Parallel code didn't work in Lazarus, so I only did the serial test, and it is 2.8% faster than C++ on the Win64 target. I attached the Lazarus code I used.

You can also post your test results, especially if you have 64-bit Delphi, so we can get a better idea of the current situation. If you like, you can post results of other benchmarks such as SciMark too. It would be nice to have a broader perspective on how the latest compiler versions are performing.

When it comes to performance, Pascal compilers compete with each other. There is no other competition ;)

Edit:
Using the code suggested by ykot, multi-threading test for Lazarus was performed and results were updated. Compared to other products, the latest stable version of Lazarus has mediocre parallel processing speed.

Edit2:
I discovered that the RAD Studio 10.2 Trial includes the Delphi Win64 compiler, so I added results for Delphi Win64.

You can download Delphi 10.2 Tokyo Starter here. It has no time limit, but it supports only the Win32 target:
https://www.embarcadero.com/products/delphi/starter/promotional-download

The RAD Studio 10.2 Tokyo Architect Trial includes the Delphi Win64 compiler and all cross-compilers:
https://www.embarcadero.com/products/rad-studio/start-for-free

« Last Edit: March 30, 2017, 11:16:53 am by srcstorm »

Thaddy

What were your compiler settings?

FPC has more options and when I ran the test compiled with:

fpc -CX -XXs -Sv -CfSSE41 -CpATHLON64 -OpATHLON64 -Mobjfpc -OoFASTMATH -O4  benchint.lpr

It was another 4% faster compared to the default fpc -CX -XXs -Mobjfpc -O2 benchint.lpr

On my (very slow AMD E-2500) laptop FPC actually did better (23844) than Berlin (24719) 8-)

« Last Edit: March 27, 2017, 12:31:32 pm by Thaddy »
"Logically, no number of positive outcomes at the level of experimental testing can confirm a scientific theory, but a single counterexample is logically decisive."

srcstorm

@Thaddy,

This is not an FPC test. Only Lazarus is tested. Modern IDEs have Debug and Release profiles. We switch to Release mode on each IDE. No other setting is modified. So this is a test of what you get "out of the box".

Thaddy

@srcstorm

It is not a Lazarus program.
Lazarus uses the FPC compiler.
These compiler settings can all be set in Lazarus. What a stupid remark. >:D >:D

The performance comes from the compiler, not from Lazarus. You are testing performance, NOT an editor.
« Last Edit: March 27, 2017, 01:22:10 pm by Thaddy »

marcov

I never had a practical use for parallel for in ten years of programming (while my apps are generally multithreaded).

I always wonder a bit why people think it is so great. Does anybody have real-world examples?


HeavyUser

Quote from marcov:
"I never had a practical use for parallel for in ten years of programming (while my apps are generally multithreaded).

I always wonder a bit why people think it is so great. Does anybody have real-world examples?"

Google Search? The Human Genome Project? SETI?

Thaddy

Yeah, but you can donate an RPi for that and let it do its job....

ykot

srcstorm, you should really be testing Delphi's TParallel against C++11 threads with lambdas, since that's what it is, not against OpenMP or other similar extensions, much less running it through CLI.

Also, any chance of throwing a comparison with FreePascal version compiled with -O4 to the mix?
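
For reference, a minimal sketch of what "C++11 threads with lambdas" could look like for a prime-counting loop. This is not the benchmark's actual code: `is_prime`, `count_primes_parallel` and the contiguous range split are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Trial-division primality test (illustrative, not the benchmark's own IsPrime).
bool is_prime(std::uint32_t n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    for (std::uint64_t d = 3; d * d <= n; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

// Counts primes in [0, limit) with `thread_count` std::thread workers.
// Each worker gets one contiguous range. Later ranges hold larger candidates,
// so this split is simple rather than perfectly balanced.
std::uint32_t count_primes_parallel(std::uint32_t limit, unsigned thread_count) {
    assert(thread_count > 0);
    const std::uint64_t chunk =
        (static_cast<std::uint64_t>(limit) + thread_count - 1) / thread_count;
    std::vector<std::uint32_t> partial(thread_count, 0);
    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::uint64_t begin = std::min<std::uint64_t>(limit, t * chunk);
        const std::uint64_t end = std::min<std::uint64_t>(limit, begin + chunk);
        // The lambda is the thread body; it writes only to its own slot,
        // so no locking is needed.
        workers.emplace_back([&partial, begin, end, t] {
            std::uint32_t local = 0;
            for (std::uint64_t n = begin; n < end; ++n) {
                if (is_prime(static_cast<std::uint32_t>(n))) ++local;
            }
            partial[t] = local;
        });
    }
    for (auto& w : workers) w.join();
    std::uint32_t total = 0;
    for (auto c : partial) total += c;
    return total;
}
```

Calling `count_primes_parallel(10000, 4)` should give 1229, the number of primes below 10,000. Structurally this is close to what a thin thread wrapper does, which is the point of the comparison.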

Thaddy

Quote from ykot:
"Also, any chance of throwing a comparison with FreePascal version compiled with -O4 to the mix?"

No, because he thinks Lazarus IS a compiler. Well, we all know that that's not the case....
And YES, because I did... 8)

Leledumbo

Quote from marcov:
"I never had a practical use for parallel for in ten years of programming (while my apps are generally multithreaded).

I always wonder a bit why people think it is so great. Does anybody have real-world examples?"

It's more about ease of use, I guess. "Modern" programmers think that managing threads (or processes; they don't really care about the backend) manually is cumbersome and time-consuming, so if there is a built-in solution they will prefer it regardless of any overhead they can't control. At least that's what my CTO thinks.

ykot

Made a benchmark in a real hurry of Delphi 10.1 x64 vs Visual Studio 2015 x64. Visual Studio x64 is actually faster (running inside a VM with 8 processors enabled on a Linux host, Core i7 6700K, 2400 MHz DDR4 RAM, VMware Player).

(edit) Updated the post, taking the source code for calculating prime numbers from this GitHub page, which in turn was taken from the OP's article. This actually increases the gap between Delphi and MSVC.

(edit2) Added a FreePascal implementation using parallel procedures. The loop function, however, is not inlined, so I am not sure this is the optimal approach; there is likely a better way of doing it.

I've executed each sample application 4 times and averaged the times; the output is shown in the screenshot (if you're not logged in, attachments don't seem to show up).
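
The run-several-times-and-average procedure can be sketched in C++ roughly as follows. This is only an illustration that times a callable inside one process, whereas the figures here come from running each application separately; `mean` and `average_ms` are hypothetical helper names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean of the collected samples.
double mean(const std::vector<double>& samples) {
    assert(!samples.empty());
    return std::accumulate(samples.begin(), samples.end(), 0.0) /
           static_cast<double>(samples.size());
}

// Runs `work` `runs` times and returns the average wall-clock time in ms.
// steady_clock is used because it never jumps backwards.
template <typename Work>
double average_ms(Work&& work, int runs) {
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return mean(samples);
}
```

For example, `average_ms([] { /* benchmark body */ }, 4)` would mirror the 4 runs used here.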

Delphi project used "Release" (optimizations on), MSVC used /Ox /GL, FreePascal used -O4 -OoLoopUnroll -Sv.
Results so far:

Delphi 10.1 x64: 968.75 ms
FreePascal 3.1.1 (trunk): 2679.75 ms
Visual Studio 2015 x64: 721.25 ms

So MSVC seems to be around 34% faster than Delphi, whereas the FreePascal source seems to have some room for optimization (please feel free to adjust the source code).
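
Since "X% faster" can be read in two ways, here is a small sketch of both calculations. The function names are made up for illustration.

```cpp
#include <cassert>

// How much longer the slow run takes, as a percentage of the fast time.
double percent_longer(double slow_ms, double fast_ms) {
    assert(fast_ms > 0.0);
    return (slow_ms / fast_ms - 1.0) * 100.0;
}

// How much less time the fast run takes, as a percentage of the slow time.
double percent_less_time(double slow_ms, double fast_ms) {
    assert(slow_ms > 0.0);
    return (1.0 - fast_ms / slow_ms) * 100.0;
}
```

With the figures above, `percent_longer(968.75, 721.25)` comes out at about 34.3, which is the 34% quoted, while `percent_less_time(968.75, 721.25)` is about 25.5. Both describe the same gap from different baselines.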

I've updated the attachments, including the latest sources. The original post is quite surprising because Delphi's native compiler is getting rather old. I suppose that more complex benchmarks would increase the gap even further. My guess is that the OP has compared Delphi's TParallel test code (which is just a wrapper for native threads) against actual parallel languages, which provide much greater flexibility at the expense of some minimal overhead. This is not a fair comparison.
« Last Edit: March 28, 2017, 05:58:58 am by ykot »

srcstorm

Quote from ykot:
"Made a benchmark in a real hurry of Delphi 10.1 x64 vs Visual Studio 2015 x64."

Delphi 10.1 and Visual C++ 2015 are things of the past. Did you steal them from a museum? Still, I adopted your code and updated the first post.

The Lazarus project I used for the multi-threading test is in the attachment.


Blestan

things from the past????????
delphi 10.2 tokyo is 10 days old:))))
it's very important that you learn to read and understand the text before you start to post/write :)))
hahahahah
Speak postscript or die!
Translate to pdf and live!

Martin_fr

Quote from Thaddy:
"@srcstorm

It is not a Lazarus program.
Lazarus uses the FPC compiler.
These compiler settings can all be set in Lazarus. What a stupid remark. >:D >:D

The performance comes from the compiler, not from Lazarus. You are testing performance, NOT an editor."

You clearly misread his comment.

ykot

I have written a FreePascal version of the test that uses native threads directly, and recompiled the same application with Delphi. Test conditions, configuration and target are the same. I have attached updated sources for FreePascal, Delphi and MSVC. If you recompile each of them, please make sure to enable all optimization options: Release mode for Delphi, "-OoLoopUnroll -OoFastMath -Sv -CpCoreI -CfSSE42 -OpCoreI" for FreePascal and "-Ox -GL" for MSVC.

Still, I'm getting the following figures for x64 target:

FreePascal (native): ~2400 ms
FreePascal (MTProcs): ~2840 ms
Delphi (native): ~1030 ms
Delphi (TParallel class): ~1010 ms
MSVC: ~710 ms

Out of curiosity, for 32-bit target:

FreePascal (native): ~1030 ms
Delphi (TParallel class): ~1030 ms
MSVC: ~730 ms

Srcstorm, I'm not sure how exactly you are compiling the projects, but your benchmarks seem to be rather bogus: Delphi is roughly 50% slower than the corresponding Visual Studio compiled project on both Win32 and Win64 targets. Also, I doubt there have been any changes to the Win32/Win64 Delphi compilers in Delphi 10.2 (in fact, likely none since the release of Delphi XE2), so performance is likely the same for both Delphi 10.1 and 10.2.

However, I still don't understand why the FreePascal version is much slower for the x64 target, even when using threads directly via the "TThread" class. Perhaps the issue is actually in how the "IsPrime" function gets compiled?
« Last Edit: March 28, 2017, 08:31:43 pm by ykot »

 
