Author Topic: FPC for high-performance computing  (Read 6857 times)

gues1

Re: FPC for high-performance computing
« Reply #30 on: June 01, 2025, 10:09:03 pm »
Enter number of threads: 16
Total steps: 11112
Execution time: 1.67 seconds
Press Enter to exit...

With Lazarus, forcing the use of the P-cores (I have 8 P-cores => 16 threads) with:

Code: Pascal
    // Pin the process to logical processors 0..15 (bit mask $FFFF).
    SetProcessAffinityMask(GetCurrentProcess, $FFFF);

With Delphi the time is:
Execution time: 1.87 seconds

Thaddy

« Reply #31 on: June 02, 2025, 08:04:39 am »
Apart from the optimization settings I referred to, I also miss measurements with the FPC LLVM back-end. (Well, I tested that previously, and on average fpc-llvm is on a par with any LLVM-backed compiler.  :D Which is logical.)
Don't use the excuse that LLVM is not native: it would follow from that excuse that none of the LLVM-based compilers are native, maybe with the exception of clang. And many of the compilers in the speed tests above may in fact be LLVM-backed.
« Last Edit: June 02, 2025, 08:10:05 am by Thaddy »
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

Lenny33

« Reply #32 on: June 02, 2025, 12:50:07 pm »
Quote
You don't get it do you? Like most of the other idiots that "compare" speed.
It is ff'ing not the language, it is the compiler optimization and a lot of C++ compilers are slower than fpc.
If you do still not understand that, please clean your mouth. Such statements are silly naive.

Why so boorish to people who write objective things?
Yes, Delphi and Lazarus are not designed for writing high-performance computing code.
Code compiled with modern C/C++ compilers (e.g. VC++ or Intel C++) usually runs twice as fast as the same code compiled with Delphi or FPC.
Even code compiled with Embarcadero C++ Builder (from the same vendor as Delphi) usually runs one and a half times faster than Delphi code (though C++ Builder is obviously not the best C++ compiler).

In general, I think we want too much from an open-source project.
We can make such demands of Delphi as a commercial product and wonder why Embarcadero has not yet been able to hire specialists to optimize the generated code.
As for Lazarus, we are just grateful that it is what it is. At least its generated code is not slower than C#.

« Last Edit: June 02, 2025, 12:53:02 pm by Lenny33 »

Martin_fr

« Reply #33 on: June 02, 2025, 01:08:34 pm »
Quote
You don't get it do you? Like most of the other idiots that "compare" speed.
It is ff'ing not the language, it is the compiler optimization and a lot of C++ compilers are slower than fpc.
If you do still not understand that, please clean your mouth. Such statements are silly naive.
Why so boorish to people who write objective things?
Yes, Delphi and Lazarus are not designed for writing high-performance computing code.
The final code compiled on modern C/C++ compilers (f.ex. on VC++ or Intel C++) usually runs twice as fast as on Delphi or FPC.

There are a lot of inaccuracies (or at least potential misunderstandings) in there....

But first of all, yes, "Mr Devil smiley" (sorry Thaddy, that is you) could have been far more polite (or at least not aggressive or insulting in any way), and not just on this one occasion. No matter what the issue is, or how often it has been repeated.


At its most basic: when comparing two equivalent implementations, it is a comparison of compilers.

There isn't a "Pascal speed" for some code. There is a speed that you get with FPC, a speed for Delphi, and one for any other compiler compiling Pascal.

So that kind of comparison (one that ends in a measured time) is never about languages.
Though of course you can (if you spend the time collecting all existing compilers) state that, at a specific date, the fastest compiler got whatever results.


Quote
Delphi and Lazarus are not designed for writing high-performance computing code.

Ignoring that "Lazarus" is neither the language nor the compiler....

Talking about Pascal and other languages: a language can be designed for performance by making it easier to access performance-improving features such as parallel execution.

But then, in most cases, you can access those features (like threading) even if the language's design does not give you special constructs to make that easier.

And if you do access them, then it's back to how fast the code generated by the compiler is.

Martin_fr

« Reply #34 on: June 02, 2025, 01:15:54 pm »
Btw, I find the comparisons in this thread a bit lacking.

They may be valid (I have not checked), but they compare (at least) 2 features without providing details on how each individual feature performs.

If I read it correctly, all benchmarks were run multi-threaded. And the overall outcome is a valid comparison.

But, from that little can be said about the compiler.
It's a mix of compiler, rtl and libraries.

That is, it is unclear if the faster speed was a result of
- better thread management (the rtl/library)
- faster execution inside each thread (the compiler generated code)

I would think it should be interesting to also know how the benchmarks perform single threaded.



It would also be telling to know how the time comparisons go when changing the "size" of the benchmark.

If all tests are run with double, triple or 10 times the data => does the time scale by the same factor?

Or do some tests gain/lose performance in comparison to others?
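
A minimal sketch of such a size sweep, single-threaded. It assumes a hypothetical stand-in workload `Work` (a plain O(N) floating-point loop, not the benchmark from this thread), and `BaseN` and `Factors` are picked arbitrarily. It times the same routine at 1x, 2x and 10x the input and prints how the time scales relative to the base run:

```pascal
program ScaleSweep;
{$mode objfpc}

uses
  SysUtils; // GetTickCount64

// Hypothetical stand-in workload: partial harmonic sum, O(N) work.
function Work(N: Int64): Double;
var
  i: Int64;
begin
  Result := 0.0;
  for i := 1 to N do
    Result := Result + 1.0 / i;
end;

const
  BaseN: Int64 = 20000000;              // arbitrary base problem size
  Factors: array[0..2] of Integer = (1, 2, 10);

var
  k: Integer;
  T0, Ms, BaseMs: QWord;
  Chk: Double;
begin
  BaseMs := 0;
  for k := 0 to High(Factors) do
  begin
    T0 := GetTickCount64;
    Chk := Work(BaseN * Factors[k]);
    Ms := GetTickCount64 - T0;
    if k = 0 then
      BaseMs := Ms;
    // Print the checksum so the computed result is actually used.
    Write('size x', Factors[k], ': ', Ms, ' ms, checksum ', Chk:0:6);
    if BaseMs > 0 then
      WriteLn(', time x', Ms / BaseMs:0:2)
    else
      WriteLn(' (base run too short to compare)');
  end;
end.
```

Running the same sweep once with the real workload on one thread and once multi-threaded would separate the two effects listed above; the millisecond timer is coarse, so the base size should be large enough to run for at least a second or so.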

Lenny33

« Reply #35 on: June 02, 2025, 01:26:04 pm »
Quote
Ignoring that "Lazarus" isn't the language, and neither the Compiler....

Yes, you are right, but Lazarus doesn't work with anything other than FPC. So we all know what we're talking about.

Quote
Talking about Pascal and other language, then well a language can be designed for performance, by making it easier to access performance improving feature such as parallel execution.

Parallel computing is not the problem.
The problem is that FPC and Delphi, unlike the flagship C++ compilers, apparently have no intermediate layer that analyzes (semantically, not just syntactically) the sequence of operations between variables and memory and simply removes the unnecessary moves, i.e. one that really optimizes by analysis.
I noticed this just by comparing the generated assembly code for the same algorithms.
You can do such “optimizations” manually as well. But that is very bad for the readability and cleanliness of the code (for someone else who has to understand it later).
And it is quite obvious that such an analysis layer in the compiler is rather difficult to implement and costs a lot of money (man-hours).
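
To make the kind of manual "optimization" meant here concrete, a minimal sketch. The record `TStats` and both procedures are made up for illustration and are not taken from the benchmark in this thread. The first version updates a field through a `var` parameter on every iteration; the second accumulates in a local variable and stores once at the end:

```pascal
program ManualHoist;
{$mode objfpc}

type
  TStats = record
    Sum: Double;
  end;

// Straightforward version: as written, every iteration reads and
// writes S.Sum through the reference parameter.
procedure SumDirect(const A: array of Double; var S: TStats);
var
  i: Integer;
begin
  S.Sum := 0;
  for i := 0 to High(A) do
    S.Sum := S.Sum + A[i];
end;

// Hand-tuned version: accumulate in a local (a register candidate)
// and store to memory once at the end.
procedure SumLocal(const A: array of Double; var S: TStats);
var
  i: Integer;
  Acc: Double;
begin
  Acc := 0;
  for i := 0 to High(A) do
    Acc := Acc + A[i];
  S.Sum := Acc;
end;

var
  Data: array[0..3] of Double = (1.5, 2.5, 3.0, 4.0);
  R1, R2: TStats;
begin
  SumDirect(Data, R1);
  SumLocal(Data, R2);
  // Both versions compute the same sum.
  WriteLn(R1.Sum:0:2, ' ', R2.Sum:0:2);
end.
```

Both procedures give the same result. Whether a particular compiler, version and option set turns the first form into the second on its own is not something this sketch shows; that has to be checked in the generated assembly.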

Martin_fr

« Reply #36 on: June 02, 2025, 01:53:21 pm »
Quote
The problem is that FPC or Delphi, unlike the flagship C++ compilers, obviously do not have an intermediate layer which analytically, not syntactically, looks through the sequence of actions between variables and memory and simply removes unnecessary moves, so it really analytically optimizes actions.

There I misunderstood you, but then I did say the text was not that accurate.
Because you named "Lazarus" and may (as in maybe) have referred to language design, I responded to that.

Your post above makes clear that you meant the implementation in the compiler. Even with regard to the compiler, "design" can have many, many meanings.... But yes, certain techniques (like DFA, data-flow analysis) are not implemented, or not completely.

Thaddy

« Reply #37 on: June 02, 2025, 02:02:40 pm »
Quote
Parallel computing is not the problem.
The problem is that FPC or Delphi, unlike the flagship C++ compilers, obviously do not have an intermediate layer which analytically, not syntactically, looks through the sequence of actions between variables and memory and simply removes unnecessary moves, so it really analytically optimizes actions.

and what makes you think you are right?
Code: Text
fpc -io
REGVAR
STACKFRAME
PEEPHOLE
LOOPUNROLL
TAILREC
CSE
DFA
STRENGTH
USERBP
ORDERFIELDS
FASTMATH
REMOVEEMPTYPROCS
CONSTPROP
USELOADMODIFYSTORE
UNUSEDPARA
FORLOOP

missing any analytical optimizations?
oh, yes, probably whole program optimization: disappointingly to you  :o we have that too!!!! now, what?  :P
Furthermore there is an LLVM backend (same speed as all other LLVM-based compilers). If you use that you also have all linker optimizations, although the internal linker already optimizes and the compiler can control external linker options.
ok... a well deserved  >:D >:( for the obviously ill-informed. ;)
« Last Edit: June 02, 2025, 02:14:22 pm by Thaddy »

Martin_fr

« Reply #38 on: June 02, 2025, 02:09:20 pm »
Quote
and what makes you think you are right?

What makes you think that just because the compiler can list a feature by name, that feature is actually fully, mostly, or sufficiently implemented? Some may be; others may only be partly implemented.

Lenny33

« Reply #39 on: June 02, 2025, 02:09:29 pm »
Quote
Even in regards to the compiler "design" can have many many meanings.... But yes certain techniques (like DFA) are not or not completely implemented.

Whatever has not yet been implemented in FPC and Lazarus, overall this project can be considered a great success.
And, to argue by analogy, it is foolish to expect the speed of a sports car from an SUV.
A programmer should not fixate on one language and technology. You can write a C++ library for fast high-performance calculations. But it would not be too smart to write a website in that same C++.

Thaddy

« Reply #40 on: June 02, 2025, 02:19:39 pm »
Quote
What does make you think that just because the compiler can list a feature by name, that feature is actually fully/mostly/sufficiently-complete implemented? Some may, other may only be partly...

All of them are implemented. Only one or two have room for improvement. These are also related.
The point is that he talks about missing analytical functionality. That is plain wrong. You can't argue against that.
It is a troll.

Lenny33

« Reply #41 on: June 02, 2025, 02:19:45 pm »
Quote
oh, yes, probably whole program optimization: disappointingly to you  :o we have that too!!!! now, what?  :P

I would be happy to agree with you.
But whatever we did with the optimization parameters for FPC, the VC++ libraries on Windows still ran much faster.
And personally I don't see any problem with that.
Are you interested in nothing else in your life and profession except Pascal? Then your behavior amounts to fanaticism.
Studying and being able to use different technologies suitable for a particular task is also a part of professionalism.
Being devoted to a single language or technology is, IMHO, stubborn fanaticism.

LV

« Reply #42 on: June 02, 2025, 03:52:27 pm »
I'm a fan of the bike, though.  ;)
However, I've now made a sequential run of C++ (VS 2022, Release, maximum speed settings) and Pascal (FPC 3.2.2 -O3) programs. Please take a look at post 21 for the source code.
Hardware: laptop with an i7-12700H processor.
System: Windows 11.

Simulation time: 0.01; 0.1; 0.2; 0.3  :-[

C++

Code: Text
Total steps: 11112
Execution time: 2.653 seconds
Press Enter to exit...

Total steps: 111112
Execution time: 38.798 seconds
Press Enter to exit...

Total steps: 222223
Execution time: 118.275 seconds
Press Enter to exit...

Total steps: 333334
Execution time: 244.510 seconds
Press Enter to exit...

Pascal

Code: Text
Enter number of threads: 20
Total steps: 11112
Execution time: 2.31 seconds
Press Enter to exit...

Enter number of threads: 20
Total steps: 111112
Execution time: 36.20 seconds
Press Enter to exit...

Enter number of threads: 20
Total steps: 222223
Execution time: 107.42 seconds
Press Enter to exit...

Enter number of threads: 20
Total steps: 333334
Execution time: 201.41 seconds
Press Enter to exit...
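
For anyone who wants the scaling factors rather than the raw times, a small sketch that only does arithmetic on the figures listed above (taken at face value; no new measurements). For each run it prints the step count relative to the first run, each compiler's time relative to its first run, and the C++/FPC time ratio:

```pascal
program ScalingCheck;
{$mode objfpc}

const
  // Values copied from the two result listings above.
  Steps: array[0..3] of Integer = (11112, 111112, 222223, 333334);
  CppTime: array[0..3] of Double = (2.653, 38.798, 118.275, 244.510);
  PasTime: array[0..3] of Double = (2.31, 36.20, 107.42, 201.41);

var
  i: Integer;
begin
  WriteLn('steps':8, 'x steps':10, 'x C++':8, 'x FPC':8, 'C++/FPC':10);
  for i := 0 to High(Steps) do
    WriteLn(Steps[i]:8,
            Steps[i] / Steps[0]:10:2,      // growth in step count
            CppTime[i] / CppTime[0]:8:2,   // growth in C++ time
            PasTime[i] / PasTime[0]:8:2,   // growth in FPC time
            CppTime[i] / PasTime[i]:10:2); // C++ time relative to FPC
end.
```

With these numbers, ten times the steps costs roughly 15 times the time in both programs, and the C++/FPC ratio stays between about 1.07 and 1.21.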

Lenny33

« Reply #43 on: June 02, 2025, 05:07:26 pm »
Quote
If we help the compiler a little and unroll the loops in both programs, the results will be as follows (difference 34%).

In 2025, everyone is used to the idea that unrolling and optimizing loops, rearranging and optimizing memory accesses, inlining subroutines, and whatever else it takes to get optimal machine code should be done by an intelligent compiler, not by the programmer.
Forget the shamanism of manually unrolling loops. It's not your business. In 2025 an intelligent compiler should do it (of course, if we are talking about high-performance computing).
The task of an application programmer is to write the most understandable, beautiful and self-documenting code, code that can be understood both by the author a couple of months later and by another specialist.

Writing parts of an application program (as opposed to a system or controller program) in assembly language is an extremely poor approach from the point of view of portability.
Moreover, it is almost impossible to know all the peculiarities of modern processors. For example, some assembly instructions that everyone used in the 1990s and 2000s will only slow down program execution now.

When the next commercial Unreal Engine or MATLAB is rewritten in Delphi or FPC to squeeze maximum performance out of the hardware, then you can say that Delphi / FPC are definitely meant for high-performance computing.
Until then, we will mostly use Lazarus/FPC as a great tool for creating cross-platform desktop interfaces and for writing interfaces to databases.
« Last Edit: June 02, 2025, 05:09:41 pm by Lenny33 »

LV

« Reply #44 on: June 02, 2025, 05:37:07 pm »
Thank you for the interesting lecture. I should head out now and go for a bike ride. :)

 
