Author Topic: How optimized is the FPC compiler  (Read 34004 times)

440bx

  • Hero Member
  • *****
  • Posts: 2763
Re: How optimized is the FPC compiler
« Reply #135 on: January 02, 2022, 08:16:28 am »
<snip> no function without the `inline` modifier will ever be inlined, no matter what).
I just wanted to say that is a good thing. It makes the resulting code completely predictable which is often useful when debugging.

IOW, a function should never be inlined unless the "inline" modifier is present (and, if present, the compiler has to find it "acceptable")
FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

Akira1364

  • Hero Member
  • *****
  • Posts: 559
Re: How optimized is the FPC compiler
« Reply #136 on: January 02, 2022, 10:28:12 am »
<snip> no function without the `inline` modifier will ever be inlined, no matter what).
I just wanted to say that is a good thing. It makes the resulting code completely predictable which is often useful when debugging.

IOW, a function should never be inlined unless the "inline" modifier is present (and, if present, the compiler has to find it "acceptable")


The current situation has far more downsides than upsides. A proper automatic implementation, as found in compilers for nearly every language I can think of, would be vastly better as the endgame solution overall. It could still be turned off per-function where necessary with things like the fairly new `noinline` modifier (which does the opposite of `inline`), and it would probably just vary in aggressiveness between -O1, -O2, -O3 and so on. Then you wouldn't have to rely on other people, who may or may not understand why it matters, to set the `inline` flags appropriately in code you don't necessarily have direct control over.
« Last Edit: January 02, 2022, 10:38:53 am by Akira1364 »

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 7883
  • Debugger - SynEdit - and more
    • wiki
Re: How optimized is the FPC compiler
« Reply #137 on: January 02, 2022, 11:33:10 am »
Obviously it'd be nice if FPC had a more advanced form of the "AutoInline" switch turned on at all times so that you could just rely on it to do the right thing, but as it stands in reality it does the opposite (which is to say, no function without the `inline` modifier will ever be inlined, no matter what).

Have you tried?
Code: Text
  -OoAUTOINLINE
or
Code: Text
  {$OPTIMIZATION AUTOINLINE}
https://www.freepascal.org/docs-html/prog/progsu58.html

It's not on by default, but you can add it to your fpc.cfg.
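
As a minimal sketch of the source-level switch (the program and function names are made up for illustration): the function below has no `inline` modifier, so with AutoInline enabled it is up to the compiler's own heuristics whether the call gets inlined. Keeping the assembler output (for example with -al) is one way to check what actually happened.

```pascal
program AutoInlineDemo; // hypothetical program name

{$mode objfpc}
{$OPTIMIZATION AUTOINLINE} // source-level counterpart of -OoAUTOINLINE

// No `inline` modifier here: whether this call is inlined is left
// to the compiler's heuristics once AutoInline is enabled.
function AddOne(x: Integer): Integer;
begin
  Result := x + 1;
end;

var
  i, total: Integer;
begin
  total := 0;
  for i := 1 to 10 do
    total := AddOne(total);
  WriteLn(total);
end.
```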

Akira1364

  • Hero Member
  • *****
  • Posts: 559
Re: How optimized is the FPC compiler
« Reply #138 on: January 02, 2022, 11:51:54 am »
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 7883
  • Debugger - SynEdit - and more
    • wiki
Re: How optimized is the FPC compiler
« Reply #139 on: January 02, 2022, 01:12:24 pm »
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.
Well yes.

(Afaik) inlining does not just copy the already generated assembly code into the new location (replacing the call).

Inlining (partly) re-compiles the code of the called proc in the new location. (Afaik it copies the "nodes", which represent the parsed and maybe partly processed/compiled code, into the new place. The final assembler is then generated from those.)

But that is only possible if those nodes exist. And they only exist if the proc was compiled with the intent to be inlined.


If for example some commercially sold code is released as ppu/o only, then they may not want to distribute such extra info (which could be used to gain insight into their unpublished code).

PascalDragon

  • Hero Member
  • *****
  • Posts: 4014
  • Compiler Developer
Re: How optimized is the FPC compiler
« Reply #140 on: January 03, 2022, 02:08:28 pm »
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.

Of course it does not, because it needs to have the node tree available in the PPU which is only the case if the routine has the inline directive or was compiled while the AutoInline optimization was enabled.
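
To illustrate with a sketch (unit and function names are hypothetical): going by the description above, only `Twice` would have its node tree stored in the PPU, so only `Twice` could be inlined into code that uses the pre-compiled unit, unless the unit itself was built with AutoInline enabled.

```pascal
unit MathHelpers; // hypothetical unit name

{$mode objfpc}

interface

// Marked inline: per the explanation above, the node tree is kept
// in the PPU so callers in other units can inline the body.
function Twice(x: Integer): Integer; inline;

// Not marked inline: assumed to stay an ordinary call for users of
// the pre-compiled unit, unless it was compiled with AutoInline on.
function Thrice(x: Integer): Integer;

implementation

function Twice(x: Integer): Integer;
begin
  Result := x * 2;
end;

function Thrice(x: Integer): Integer;
begin
  Result := x * 3;
end;

end.
```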

Akira1364

  • Hero Member
  • *****
  • Posts: 559
Re: How optimized is the FPC compiler
« Reply #141 on: January 09, 2022, 02:17:47 am »
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.

Of course it does not, because it needs to have the node tree available in the PPU which is only the case if the routine has the inline directive or was compiled while the AutoInline optimization was enabled.

Right, yeah, I'm aware of why it doesn't work that way currently. Again, my original comment was envisioning a world where FPC just auto-inlined appropriately all the time when compiling code (as most compilers do, at least nowadays) regardless of the presence of any particular modeswitch.

If for example some commercially sold code is released as ppu/o only, then they may not want to distribute such extra info (which could be used to gain insight into their unpublished code).

I think if that sort of thing was actually a realistic concern in conjunction with inlining specifically, it would be a known issue to at least some extent in other languages. It's not though, at least as far as I've ever heard.
« Last Edit: January 09, 2022, 02:20:46 am by Akira1364 »

abouchez

  • New Member
  • *
  • Posts: 44
Re: How optimized is the FPC compiler
« Reply #142 on: February 04, 2022, 02:20:43 pm »
After decades of working with Delphi, and now with FPC as my main compiler, I don't think FPC is much slower. It is actually faster in some areas.
Especially since FPC 3.2, whose code generation is much better: it emits much less bloated asm than previous versions did.

I agree that the RTL is sometimes non-optimal (missing "const" or "inline").
But its purpose was to be cross-platform and maintainable.
This is why, in mORMot, I rewrote and bypassed most of the RTL, with some optimized asm for the most critical parts.
https://github.com/synopse/mORMot2

Honestly, the server side is where performance matters - especially multi-threaded performance.
For LCL or client applications, compiler quality is less essential. You can have responsive apps written in Python. :)

When I run the mORMot 2 regression tests, I have a huge set of realistic benchmarks, which cover a lot of aspects, like data structures, JSON, cryptography, http/websockets client and server, database, multi-threading....
Then we can compare between compilers and systems.
The fastest combination is FPC on Linux x86_64 with our own x86_64 memory manager written in asm. https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.fpcx64mm.pas is worth a try: it gives a noticeable boost in server processing speed and memory usage, especially in multi-threaded processes.

In those regression tests, Delphi is behind, especially due to the Windows overhead, and also because the pascal code is more tuned for FPC.
Delphi outside Win32/Win64 is simply not optimized: even though it uses LLVM as its codegen backend, LLVM is not leveraged well and the resulting asm is not very good.
So I stick with FPC as my main target for optimization. And even if the AARCH64 codegen is not optimal, it is pretty stable, and we can get good numbers with a few bits of asm and statically linked libraries - see https://blog.synopse.info/?post/2021/08/17/mORMot-2-on-Ampere-AARM64-CPU

Of course, this is only an opinion.
And I like FPC so much that I am officially biased.  O:-) But I have no wish to go back to Delphi. Lazarus is so much lighter and more stable for daily use, and with fpdebug it is starting to be good enough for debugging. And as a compiler, FPC is amazing and does a very good job.
Open Source rocks!

 
