
Author Topic: How optimized is the FPC compiler  (Read 40625 times)

Shpend

  • Full Member
  • ***
  • Posts: 167
Re: How optimized is the FPC compiler
« Reply #30 on: December 21, 2020, 11:34:19 am »
Quote
* unions
* general Pointer-arithmetic
* embedded Assembler code
* Variant record fields
* {$pointermath on}
* inline assembler is fully supported on many targets.
Also note Pascal is older than C (1970 vs 1972)
Btw, you only wrote what I already mentioned, @thaddy :D My argument was that I love that FPC already has those things, and that it would benefit greatly from the other constructs C++ offers, honestly.

Shpend

  • Full Member
  • ***
  • Posts: 167
Re: How optimized is the FPC compiler
« Reply #31 on: December 21, 2020, 11:38:35 am »
I write speed-dependent (vision) applications, but don't use (or have a use case for) the examples that Warfley gives at all

This is not an argument, mate. If that were the case, then a C# 2D engine whose source code I recently saw (written entirely in .NET 4.0, even though .NET Core 3.1 LTS was available) would mean that .NET doesn't have to do any optimizations or offer any language features, because the existence of that 2D engine apparently shows they are not of much importance, since it didn't need any higher-level language features. I'm not a big fan of that type of argument, tbh.

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: How optimized is the FPC compiler
« Reply #32 on: December 21, 2020, 12:23:48 pm »
Am I wrong in thinking that Object Pascal was made to compete with C++ and plain Pascal was meant to compete with C?

Yes, that's wrong. Pascal was created as a (kind of) substitute for Algol, when Wirth got ticked off because the Algol committee didn't accept his Algol-W extensions/simplifications for the second Algol standard (later Algol 68), and because he wanted a better structured language for teaching. It has nothing to do with C at all, other than that they are "cousin" descendants of Algol and both were created at around the same time (early 70s, though IIRC Pascal was created/published before C).

Later, in the mid-80s, Apple needed object extensions to Pascal for their then-new Macintosh computer (and the earlier Lisa), so they added them with inspiration from various OO languages (like Smalltalk). OOP was all the rage then, so in the late 80s Borland (and others) took features from both (Mac) Object Pascal and Stroustrup's early attempts at C++, though they took abstract concepts rather than concrete paradigms/syntax from the latter.

You can find a brief (and not very exact) history of Object Pascal and a slightly better one of Pascal on Wikipedia. ;)
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5486
  • Compiler Developer
Re: How optimized is the FPC compiler
« Reply #33 on: December 21, 2020, 01:27:44 pm »
I'm actually a bit surprised that these "move" semantics are not part of Pascal, since the idea of Pascal was to replace the C/C++ family (even if it did not succeed) by providing exactly those kinds of native/low-level intrinsics/APIs but exposing them in a cleaner and more maintainable way.

The management operators were only released with 3.2.0. They are a relatively new feature. And Pascal as such does not know the concept of move semantics, thus there was no need to include them in the concept.

So it would actually be really nice to have FPC also support the move keyword ("std::move")

If we can determine clear rules for the compiler when move semantics should be used then one can talk about it. Though even then FPC is a project developed by volunteers in their free time. If none of the devs should be interested in that... tough luck... *shrugs*

and the possibility to overload "+=" and "-="

These operators are simply syntactic sugar and nothing more.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11455
  • FPC developer.
Re: How optimized is the FPC compiler
« Reply #34 on: December 21, 2020, 01:45:35 pm »
Do you actually think the things I mentioned, like "std::move" and "+=/-=", would make it into FPC?

I'm no compiler implementer. I just agree, based on my own practice, that the Delphi/FPC dialect has some weaknesses in efficient value type handling in some constructs. Also, it is not just about final code efficiency, but also about syntax-related oddities, like most container types having an iterator type that is defined by value, making it impossible to mutate additional fields in a for..in loop, since the loop/iterator var is a copy.

I don't know if that warrants extensions, and if so, with which priority. I am also deliberately vague on the form of the extension, which is a reasonable caution if you look at e.g. the std::move definition with a "class" as argument, which is a reference type in FPC.

So a lot more research and use cases would need to be presented than just mumbling "std::move" or "copy from C++", or droning on about performance (which matters only for a fairly small class of performance-critical applications).

I just don't discount the notion that sooner or later something needs to be done there for (more) efficient value type processing in some constructs (the operator overloading as commented on by Warfley, and my own preferences, which are more STL/generics oriented).

But that discussion should be fact-based and to the point, and even then the question remains who would implement it.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11455
  • FPC developer.
Re: How optimized is the FPC compiler
« Reply #35 on: December 21, 2020, 01:53:36 pm »
This is not an argument, mate. If that were the case, then a C# 2D engine whose source code I recently saw (written entirely in .NET 4.0, even though .NET Core 3.1 LTS was available) would mean that .NET doesn't have to do any optimizations or offer any language features, because the existence of that 2D engine apparently shows they are not of much importance, since it didn't need any higher-level language features. I'm not a big fan of that type of argument, tbh.

Then why isn't this the case for C++ too? This kind of stuff is likely to be the rate-determining step only for a very small group (and even they could simply write it out, or generate it).

I'm somewhat similar to the C# example, in that the first primitives to operate on an image are by far the most rate-determining step, as long as the other parts are done reasonably decently (but not in a spectacularly complicated or fancy way).

So in most of my code I can use HLL code just fine, just not on every pixel.

Some exceptions to that are blob (which probably would be much faster with a better compiler as it is a very complex loop) and the mixed radix FFT routine, which I use for filtering all incoming data in some routines. I have some SSE code for that but it is not live yet.


Shpend

  • Full Member
  • ***
  • Posts: 167
Re: How optimized is the FPC compiler
« Reply #36 on: December 21, 2020, 02:04:14 pm »
I understand, @PascalDragon. I do respect and fully acknowledge, of course, that FPC does not have a big company behind it, and it has nevertheless become a great compiler. I'm only saying that FPC could, IMHO, easily compete with C++ in a language-to-language "battle" given those relatively minor additions, since FPC is already very similar to C++ in terms of capability, I guess at least.

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: How optimized is the FPC compiler
« Reply #37 on: December 21, 2020, 04:22:29 pm »

Am I wrong in thinking that Object Pascal was made to compete with C++ and plain Pascal was meant to compete with C?
I don't think Object Pascal was made to compete with C++, at least not what was implemented in Delphi over time.
Classes in particular are weird if you look at Pascal's history. They are always placed on the heap and always referenced by pointers, yet they completely hide this from the user while still requiring manual memory management, even though this is completely different from the regular memory management (new, dispose) used by old-school objects.
These classes, and the subsequent implementation of the RTL and VCL using classes, result in Delphi becoming much more similar to Java than to C++.

From a performance programming perspective this makes absolutely no sense. For example, using a TStringList to split a string requires the additional allocation of the class on the heap. With an old-style object, the variables that the string list uses internally would be placed on the stack, and no more overhead would be incurred than if no OOP were used.
Classes always bring additional overhead, partly due to manual memory management. Also, the standard libraries make heavy use of abstract base classes and inheritance with virtual methods and such. In fact, some methods, like the destructor, must always be virtual. If you look at C++, this is not the case. Sure, the C++ standard library implementations also make heavy use of inheritance to reduce code complexity, but in most of the classes, like vector, set, etc., you won't find any virtual methods. More often than not, virtual methods are avoided using the CRTP idiom, which allows for static or bounded polymorphism using templates. This limits or completely avoids virtual call chains, makes a lot of the code inlinable and gets rid of virtual table jumps.
Another thing I found about C++ is the heavy usage of templates (generics in Pascal) for class configuration. For example, to implement a custom sorting algorithm for a TStringList, you set a function pointer in the TStringList instance. In C++ you configure such things via template parameters. This has the consequence that in Pascal this code cannot be optimized, as the compiler does not know at compile time which function will be used, while in C++ this is entirely a compile-time decision.

With all of that, I don't think that Delphi is really a competitor to C++ (anymore); it is going much more in the Java direction. There is, btw, IMHO nothing wrong with this. Java is a successful language because it makes a lot of things very easy, and so does Delphi. CRTP is much more complicated than classical inheritance. C++ templates are Turing complete, which makes them great for moving computations to compile time and giving the optimizer more to work with, but at the same time template programming is really complicated.
Whether this is a good or a bad thing is something everyone can evaluate for themselves. If you don't need maximum efficiency, this kind of stuff would just require much more effort from people to learn and use Pascal. And still, it is not hard to use a C++ library in Pascal, so one can simply use the language best suited to the task.
« Last Edit: December 21, 2020, 04:24:35 pm by Warfley »

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: How optimized is the FPC compiler
« Reply #38 on: December 21, 2020, 04:35:41 pm »
I wonder how many of these are already solvable using management operators though.
Since the release of FPC 3.2 I have actually been working on several projects where I try to use them to create more efficient alternatives to things commonly done using classes, and personally I think that they are a great feature that allows for a lot of new high-level expressions that were impossible beforehand.
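For readers who have not used management operators yet, here is a minimal sketch of the feature being discussed, assuming FPC 3.2.0 or later. The record name TScoped, the counter LiveCount and the procedure UseTwo are made up for illustration; only the Initialize and Finalize operators themselves are part of the language feature.

```pascal
program ManagedRecordSketch;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}

type
  // Hypothetical record; Initialize/Finalize are management operators.
  TScoped = record
    Id: Integer;
    class operator Initialize(var R: TScoped);
    class operator Finalize(var R: TScoped);
  end;

var
  LiveCount: Integer = 0; // number of TScoped values currently alive

class operator TScoped.Initialize(var R: TScoped);
begin
  // Called automatically when a TScoped variable comes into scope.
  Inc(LiveCount);
  R.Id := LiveCount;
end;

class operator TScoped.Finalize(var R: TScoped);
begin
  // Called automatically when a TScoped variable goes out of scope.
  Dec(LiveCount);
end;

procedure UseTwo;
var
  A, B: TScoped; // no explicit constructor or Free call anywhere
begin
  WriteLn('ids: ', A.Id, ' ', B.Id);
  WriteLn('alive inside UseTwo: ', LiveCount);
end;

begin
  UseTwo;
  WriteLn('alive after UseTwo: ', LiveCount);
end.
```

The sketch deliberately avoids returning a managed record from a function, since that is the situation affected by the bug mentioned in this post.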

But sadly, as I already mentioned before, due to this bug they are simply not usable currently, at least not in situations where you rely on finalization, because when a function returns a managed record it gets freed while still being active. This can result in a double free or use after free, etc., and basically means that for anything that needs to be managed, the management operators are not usable. So all these projects of mine are currently on hold.

But, using managed records instead of classes with generics for simple inheritance, I already built some types that allow for pretty nice highly efficient usage. So yeah, theoretically there is a lot of stuff one can do with management operators.

One example is enumerators: why are they so often implemented as classes? If you enumerate only a few elements, creating and freeing the class instance makes a massive performance difference. Using managed records (or, in many cases, normal records are enough) can massively improve performance if you have few elements and a small loop body (i.e. the memory management dominates the runtime).
« Last Edit: December 21, 2020, 04:43:17 pm by Warfley »

nanobit

  • Full Member
  • ***
  • Posts: 160
Re: How optimized is the FPC compiler
« Reply #39 on: December 21, 2020, 05:29:13 pm »
I would really love to see what the current state of FPC is, what could be done better and whether people are interested in doing so

My observation over the last few years (earlier I don't know) is:
A lot of time is spent on level 3+ optimization. But the real question is how many actually use it.
Personally, in order to minimize the number of bugs (which is my top priority),
I don't dare to use more than level 2 and I'm happy with that approach.

In addition, FPC allows to write SSE algorithms or to use external libraries.
For SSE I would take a look at https://ispc.github.io/index.html and build a dll

Shpend

  • Full Member
  • ***
  • Posts: 167
Re: How optimized is the FPC compiler
« Reply #40 on: December 21, 2020, 05:40:12 pm »
Ok thx for the clarification :)

Yeah, I see your point, @Warfley: Delphi, and subsequently also FPC, is oriented more towards C#/Java-style languages with a tendency towards C++, like a sort of hybrid. But I really still think that it doesn't hurt the language to adopt some effective C++ possibilities (like, for instance, this entire managed records stuff and move semantics) to allow for efficient code as well if need be, since I think that FPC is used, as I saw a while back, in emulators and that kind of low-level code, which could benefit from those optimizations.

@nanobit
yea will do that:

BTW: is it actually possible to just build a wrapper in C++ that wraps the entire "std::" (or whatever is needed from it) and then use it in Pascal?

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: How optimized is the FPC compiler
« Reply #41 on: December 21, 2020, 06:12:08 pm »
These operators are simply syntactic sugar and nothing more.
Sure, this *is* the case, but the question is *should* that be the case?

There are clear advantages to having separate +=, -=, etc. operators, as they do not need a temporary object and, more often than not, can be done in place.
Example string: str += str2 can either be implemented as:
Code: Pascal
  // Variant 1: build the result in a temporary, then assign it back.
  setlength(tmp, length(str) + length(str2));
  move(str[1], tmp[1], length(str));
  move(str2[1], tmp[length(str) + 1], length(str2));
  str := tmp;
or implemented as:
Code: Pascal
  // Variant 2: grow str in place and copy only str2 behind the old contents.
  oldlen := length(str);
  setlength(str, oldlen + length(str2));
  move(str2[1], str[oldlen + 1], length(str2));
In the worst case, the latter code is equivalent to the former (if the memory manager can't simply append enough space, or the refcount is > 1) But in a lot of cases the latter one is massively more efficient, as it avoids a complete string copy.
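The second variant can be packaged as a small runnable sketch. AppendInPlace is a hypothetical helper name, not an existing RTL routine, and this only illustrates what a dedicated "+=" for strings could do internally, not how the compiler implements anything today.

```pascal
program AppendInPlaceSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: appends Src to Dest without building a temporary string.
// Src is passed by value on purpose, so AppendInPlace(S, S) stays safe:
// the extra reference keeps the old contents alive while Dest is resized.
procedure AppendInPlace(var Dest: AnsiString; Src: AnsiString);
var
  OldLen: SizeInt;
begin
  if Src = '' then
    Exit; // nothing to append, and Src[1] would be invalid
  OldLen := Length(Dest);
  // May extend the existing block if the refcount is 1 and space is available.
  SetLength(Dest, OldLen + Length(Src));
  // Copy only the appended part; the old contents of Dest are not copied here.
  Move(Src[1], Dest[OldLen + 1], Length(Src));
end;

var
  S: AnsiString;
begin
  S := 'Free';
  AppendInPlace(S, 'Pascal');
  WriteLn(S);
end.
```

Whether SetLength can really avoid a full copy depends on the memory manager and on the reference count, exactly as described above.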

Therefore, even on standard types, having these as separate operators can have massive benefits. I honestly don't see why this shouldn't be implemented.

but I really still think that it doesn't hurt the language to adopt some effective C++ possibilities (like, for instance, this entire managed records stuff and move semantics) to allow for efficient code

I agree; that post was simply a statement on what Delphi is and why it is so different in so many regards from, for example, C++. These were deliberate design decisions. Delphi simply was not made to be like C++. This of course does not imply in any way that it cannot use ideas from that language in the future. Management operators, especially the way they are implemented in Delphi (using constructors, destructors and the assign operator), are an example of this.

Awkward

  • Full Member
  • ***
  • Posts: 135
Re: How optimized is the FPC compiler
« Reply #42 on: December 21, 2020, 06:46:42 pm »
Warfley, didn't you notice that your string code examples differ only at your "level", but are still almost the same at the low level? When you change the size of a string, the memory manager will do a new allocation and a copy internally anyway.

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: How optimized is the FPC compiler
« Reply #43 on: December 21, 2020, 07:03:28 pm »
Warfley, didn't you notice that your string code examples differ only at your "level", but are still almost the same at the low level? When you change the size of a string, the memory manager will do a new allocation and a copy internally anyway.
It only does so if it cannot extend the string, as I said:
In the worst case, the latter code is equivalent to the former (if the memory manager can't simply append enough space, or the refcount is > 1) But in a lot of cases the latter one is massively more efficient, as it avoids a complete string copy.
If the refcount of the string is 1 and there is free memory behind that block, the MM will simply extend the block. Also, I don't know if the FPC MM does this, but many memory managers, to avoid fragmentation, overallocate memory so that it fits a certain size (e.g. a multiple of 16 bytes). If that is the case, adding fewer bytes than were overallocated is actually completely free.

My example is only in the worst case as bad as the solution using temporary storage. But in a lot of cases it can perform massively better.

BeniBela

  • Hero Member
  • *****
  • Posts: 908
    • homepage
Re: How optimized is the FPC compiler
« Reply #44 on: December 21, 2020, 11:36:56 pm »
So it would actually be really nice to have FPC also support the move keyword ("std::move")

Oh no, that is incredibly confusing in C++.

There has to be a way to do it with a less confusing syntax.

In practice, you can get move semantics in Pascal by assigning Default() to the target, calling Move from source to target, and calling FillChar on the source.
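A minimal sketch of those three steps, assuming a record with reference-counted fields. TBuffer and MoveBuffer are hypothetical names chosen for this illustration; Default, Move and FillChar are the existing intrinsics/RTL routines.

```pascal
program MoveSemanticsSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical record with managed (reference-counted) fields.
  TBuffer = record
    Name: AnsiString;
    Data: array of Integer;
  end;

// Transfers ownership from Source to Target without touching reference counts.
// Assumes Source and Target are distinct variables.
procedure MoveBuffer(var Source, Target: TBuffer);
begin
  Target := Default(TBuffer);            // release whatever Target held before
  Move(Source, Target, SizeOf(TBuffer)); // raw byte copy, no refcount update
  FillChar(Source, SizeOf(TBuffer), 0);  // Source forgets its references
end;

var
  A, B: TBuffer;
begin
  A.Name := StringOfChar('x', 3);
  SetLength(A.Data, 4);
  B.Name := 'old';
  MoveBuffer(A, B);
  WriteLn('B.Name = ', B.Name, ', Length(B.Data) = ', Length(B.Data));
  WriteLn('A is empty: ', (A.Name = '') and (Length(A.Data) = 0));
end.
```

Because the byte copy bypasses the compiler's reference counting, skipping either the Default() step or the FillChar step would leak or double-release the managed fields, which is why all three steps belong together.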



I just agree, based on my own practice, that the Delphi/FPC dialect has some weaknesses in efficient value type handling in some constructs. Also, it is not just about final code efficiency, but also about syntax-related oddities, like most container types having an iterator type that is defined by value, making it impossible to mutate additional fields in a for..in loop, since the loop/iterator var is a copy

The worst is when the copy updates a reference count


In many of my collections the enumerator returns a pointer to the data
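A minimal sketch of such a pointer-returning enumerator, here also written as a plain record so that the loop needs no heap allocation. All type names (TItem, TItemList, TItemPtrEnumerator) are invented for this example and are not taken from any existing library.

```pascal
program PointerEnumeratorSketch;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}
{$pointermath on}

type
  TItem = record
    Value: Integer;
  end;
  PItem = ^TItem;

  // The enumerator is a plain record: it lives on the stack, so the
  // for..in loop needs no heap allocation and no Free call.
  TItemPtrEnumerator = record
  private
    FFirst: PItem;           // address of element 0, nil for an empty list
    FIndex, FCount: SizeInt;
    function GetCurrent: PItem;
  public
    function MoveNext: Boolean;
    property Current: PItem read GetCurrent;
  end;

  TItemList = record
    Items: array of TItem;
    function GetEnumerator: TItemPtrEnumerator;
  end;

function TItemPtrEnumerator.GetCurrent: PItem;
begin
  Result := FFirst + FIndex; // pointer to the stored element, not a copy
end;

function TItemPtrEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex < FCount;
end;

function TItemList.GetEnumerator: TItemPtrEnumerator;
begin
  Result.FIndex := -1;
  Result.FCount := Length(Items);
  if Result.FCount > 0 then
    Result.FFirst := @Items[0]
  else
    Result.FFirst := nil;
end;

var
  List: TItemList;
  P: PItem;
  I: Integer;
begin
  SetLength(List.Items, 3);
  for P in List do
    P^.Value := P^.Value + 10; // writes through to List.Items
  for I := 0 to High(List.Items) do
    WriteLn(List.Items[I].Value);
end.
```

The pointers are only valid as long as the array is not resized during the loop, which is the price for avoiding the per-element copy.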

These classes, and the subsequent implementation of the RTL and VCL using classes, result in Delphi becoming much more similar to Java than to C++.

From a performance programming perspective this makes absolutely no sense. For example, using a TStringList to split a string requires the additional allocation of the class on the heap. With an old-style object, the variables that the string list uses internally would be placed on the stack, and no more overhead would be incurred than if no OOP were used.
Classes always bring additional overhead, partly due to manual memory management.

Worst thing Borland ever did

Another thing I found about C++ is the heavy usage of templates (generics in Pascal) for class configuration. For example, to implement a custom sorting algorithm for a TStringList, you set a function pointer in the TStringList instance. In C++ you configure such things via template parameters. This has the consequence that in Pascal this code cannot be optimized, as the compiler does not know at compile time which function will be used, while in C++ this is entirely a compile-time decision.

Even if it were configured via template/generic parameters, FreePascal could not optimize it, since it is not that good at optimizing anything.


Since the release of FPC 3.2 I have actually been working on several projects where I try to use them to create more efficient alternatives to things commonly done using classes, and personally I think that they are a great feature that allows for a lot of new high-level expressions that were impossible beforehand.

Me too!

But sadly, as I already mentioned before, due to this bug they are simply not usable currently, at least not in situations where you rely on finalization, because when a function returns a managed record it gets freed while still being active. This can result in a double free or use after free, etc., and basically means that for anything that needs to be managed, the management operators are not usable. So all these projects of mine are currently on hold.

6 months old? They really should have fixed it by now

A lot of time is spent on level 3+ optimization. But the real question is how many actually use it.
Personally, in order to minimize the number of bugs (which is my top priority),
I don't dare to use more than level 2 and I'm happy with that approach.

I mostly use level 1 or 2, since level 3 has crashed far too often, especially on ARM.

 
