Lazarus

Free Pascal => FPC development => Topic started by: Shpend on December 19, 2020, 03:54:26 pm

Title: How optimized is the FPC compiler
Post by: Shpend on December 19, 2020, 03:54:26 pm
Basically what the title says.

But in addition, I'm curious how much high-performance (internal) vectorization, matrix-matrix multiplication, MMX and so on is implemented by FPC, so that apps developed with FPC are actually considered really fast, or medium fast, or whatever :D

And in what regard is it, for instance, slower than the Delphi 10.4 compiler, or even a current C++ Clang compiler? What could be done better? (I'm far away from compiler knowledge actually, let alone compiler optimizations xD, but I would really love to see what the current state of FPC is, what could be done better, and whether people are interested in doing so.)

** note ** this is not bashing FPC, regardless of what the outcome of the optimization question is, since I'm fully aware that there's not a 1000-man company behind FPC   ;)


Would love to know about these.
Title: Re: How optimized is the FPC compiler
Post by: lainz on December 19, 2020, 04:16:25 pm
I use Pascal at work. FPC and Lazarus.
And it runs fast on really old hardware, on PCs with like 2 GB of RAM.

Usually the bottleneck when using databases like SQLite is the hard disk. It's better to run on SSD drives.

It's faster than JavaScript, which I also use at work, and it uses less memory for the same object structures. Say, tables with 1000 or more complex elements.

Usually it gets slow because of badly written code, not because of the compiler itself. So in the end it depends more on what you write.

For example, using fpjson is slower than using the third-party JsonTools. And JsonTools is slower than using the SQLite JSON extension.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 19, 2020, 04:22:54 pm
Yeah, that's nice to hear. That's mostly why I asked: to know how it would actually compete in terms of optimizations with the C++ Clang compiler, and how big the optimization difference is there.
Title: Re: How optimized is the FPC compiler
Post by: lainz on December 19, 2020, 04:29:40 pm
I can't say, since the tools I mentioned, and also Kotlin, are what I use at work.
I've never used C++ to do a GUI or console application. But AFAIK it's harder: no visual designer, and no cross-platform library except Qt, which isn't free.
Title: Re: How optimized is the FPC compiler
Post by: lainz on December 19, 2020, 04:30:52 pm
To say it in other words: I see no other option than FPC for a desktop GUI on Linux, Mac and Windows that can run on old hardware. Tell me if you know one.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 19, 2020, 04:41:50 pm
No, I don't know of one, haha. I'm just trying to open up a discussion about this topic, not really to make any suggestions to replace it, lol.
Title: Re: How optimized is the FPC compiler
Post by: lainz on December 19, 2020, 04:45:29 pm
No problem, because I faced the same question before. But in the end you can see that even if FPC is a bit slower than C++, that doesn't mean it's bad.

It depends on what you will use it for. If it's for a cross-platform desktop GUI that is also free, there is no better option.

But if you do command-line tools, go and use C++; for sure, if you know how to use it, you can produce a more optimized tool.
Title: Re: How optimized is the FPC compiler
Post by: lainz on December 19, 2020, 04:52:49 pm
To be fair, if you feel that your target is GUI, here's a list of GUI toolkits to compare:
https://en.wikipedia.org/wiki/List_of_platform-independent_GUI_libraries

And if you want to compare speed in console applications, feel free to =)

But please provide some compilable code in a real use case, otherwise we're just talking hot air.

Like real world applications done in C++ and Pascal at the same time...

Quote
No i dont know haha, im just trying to open up a discussion about this topic not to do really any suggestions to replace it lol

Sorry if I was a bit off-topic, but I think the only way to compare compilers is by their output programs; that's why I said we need to compare compilable code.

I'm out since I already said what I wanted to contribute.

In my opinion it's good to optimize FPC, but I have no idea how to do that, so I can't contribute any more. I also don't have any experience with C++, so I can't compare FPC and that language anyway.

Keep it running =)
Title: Re: How optimized is the FPC compiler
Post by: lainz on December 19, 2020, 06:15:32 pm
I mean you wanted to compare the speed of the final application, and not the speed of the compiler compiling that application. For that we need to compare source code and final binaries, like measuring.

Say we can compare equivalent pieces of source code and see how they are finally assembled, if they run fast or slow.

I think we can continue the discussion from that. Forget what I said before; that's merely stuff for another discussion.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 19, 2020, 06:25:19 pm
There have been quite a few discussions on this topic here (and some got a bit out of hand).

I am not familiar enough with the compiler internals to answer this question in great detail (i.e. pick and compare individual optimization and indicate which compiler does or does not do them.)

I am also unclear, if you want to explicitly (and only) know about
Quote
  (internal)Vectorization, MxM mulitplications, MMX
or other opts too (like register alloc, eval during compile / replace by constant, dead code detection, ....)


There is some work being done on optimizations. But I do not know if it touches the features you mentioned. The people involved in that are generally found on the fpc mail list.
Title: Re: How optimized is the FPC compiler
Post by: Thaddy on December 19, 2020, 06:40:34 pm
The llvm back-end -  which fpc supports - is a better way to compare and it does not disappoint.
But note that the default compiler CAN be much faster in specific cases, when using the supported optimizations.
IOW, it is not the language itself.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 20, 2020, 03:36:24 pm
Sorry for my late answer; I wanted to think clearly about this before hurrying to write something on the topic.

So the thing is, there is this website which measures the performance of a variety of different algorithms (including heavy ones, not just trivial ones!), and sadly (but understandably, considering that C/C++ has big companies behind it) FPC is often significantly slower in execution. So the internal optimizations can maybe be done a bit better in the near future, to make FPC a really neat compiler to consider even more :)

But surprisingly, the first three algorithms seem to be really close to each other.


Here the link!
https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/fpascal-gpp.html
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 20, 2020, 03:51:13 pm
But the really worrying part for me is that there is literally only one FPC program for this algorithm here:
https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/revcomp.html

whereas on average, for every other language/compiler, massive improvements were made on the performance side. Take C++ in that list: the algorithm was reworked 3-4 times by people, and every time it got a good boost. Or Java: reworked 6 times. And so on. FPC has only one entry (the first released algorithm), and I strongly think this could be reworked too and maybe get some boost; I'm really sure!
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 20, 2020, 03:52:03 pm
this is the algorithm in code:

Code: Pascal
{  The Computer Language Benchmarks Game
   https://salsa.debian.org/benchmarksgame-team/benchmarksgame/

   contributed by Marco van de Voort
}

program reverse_complement;

var lookupComplement : array[#0..#255] of char;

const FASTAXLAT : array[0..11] of array[0..1] of char = (
                  ( 'A', 'T' ), ( 'C', 'G' ),
                  ( 'B', 'V' ), ( 'D', 'H' ),
                  ( 'K', 'M' ), ( 'R', 'Y' ),
                  ( 'a', 't' ), ( 'c', 'g' ),
                  ( 'b', 'v' ), ( 'd', 'h' ),
                  ( 'k', 'm' ), ( 'r', 'y' ));

      BufferIncrement = 1024;

procedure flushbuffer(buffer:pchar;inbuf:longint);
var p,p2 : pchar;
    c  : char;
begin
  if inbuf>0 then
   begin
     p:=buffer;
     p2:=@buffer[inbuf-1];
     while p<p2 do
      begin
       c:=lookupcomplement[p^];
       p^:=lookupcomplement[p2^];
       p2^:=c;
       inc(p);
       dec(p2);
     end;
    if p2=p then
      p^:=lookupcomplement[p^];

    p:=buffer;
    p[inbuf]:=#0;

    while (inbuf > 60) do
      begin
         c := p[60];
         p[60]:=#0;
         writeln(p);
         p[60]:=c;
         inc(p,60);
         dec(inbuf,60);
      end;
      p[inbuf]:=#0;
      writeln(p);
  end;
end;

const initialincrement=1024;

procedure run;

var s  : string;
    c  : char;
    buffersize,
    bufferptr,
    len         : longint;
    p  : pchar;
    line : integer;
    bufin,bufout : array[0..8191] of char;

begin
  settextbuf(input,bufin);
  settextbuf(output,bufout);
  for c:=#0 to #255 do
    lookupcomplement[c]:=c;
  for len:=0 to high(FASTAXLAT) do
    begin
      lookupcomplement[FASTAXLAT[len][0]]:=upcase(FASTAXLAT[len][1]);
      lookupcomplement[FASTAXLAT[len][1]]:=upcase(FASTAXLAT[len][0]);
    end;
  buffersize:=initialincrement;
  bufferptr :=0;
  getmem(p,buffersize);
  line:=0;
  while not eof do
    begin
      readln(s);
      inc(line);
      len:=length(s);
      if (len>0) and (s[1]='>') then
          begin
            flushbuffer(p,bufferptr);
            writeln(s);
            bufferptr:=0;
          end
       else
         begin
           if (bufferptr+len+1)>buffersize then
             begin
                inc(buffersize,buffersize);
//              inc(buffersize,initialincrement);
                reallocmem(p,buffersize);
             end;
           move (s[1],p[bufferptr],len);
           inc(bufferptr,len);
         end;
    end;
    flushbuffer(p,bufferptr);
end;

begin
  run;
end.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 20, 2020, 05:03:51 pm
Quote
like take C++ in that list, the algorithm was reworked like 3-4 times by ppl and everytime got a good piece of boost to it

Let me see if I understand that correctly: the C (and also Java and other) source code was tweaked over and over until eventually the compiler got a better result?
If that is true, then this is not a comparison of compilers.

It simply means that there are more people willing to spend time on tweaking code. (And tweaking code can be based on pure luck too, since sometimes the most unexpected change may give you a better result.)


The problem for comparing compilers (cross language) is that
- You need code, written to the same standard and quality. ZERO tweaks for any language.
   However that leaves you at odds, if one language does not have a feature that is useful for the test. (Punish all languages that have the feature? I think not)
- You need code that the compiler does not have special tweaks for.
   (That is, a basic "binary search" could be recognized by a compiler, and the compiler could contain hand-written optimized translations for it, like the diesel emission scandal.)



There have been a few "competitions" on the forum for code tweaking.... (Sorry no links)
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 20, 2020, 05:17:42 pm
Quote
IOW, it is not the language itself.
I strongly disagree; the language can emphasize more or less efficient programming. For example, in C++ you can distinguish between move and copy semantics on assignment. Say you have a struct containing dynamically allocated memory: when you copy the struct, to get a unique copy you need to deep-copy, i.e. copy the dynamic memory as well. When you move the struct, you know the struct you get your data from will not be touched afterwards, so you can just grab the pointer and don't have to copy the data.

FPC only allows copy assignments. While you can work around that, this basically eliminates a lot of convenience when programming. Example:
Code: Pascal
program Project1;

{$mode objfpc}{$H+}
{$ModeSwitch advancedrecords}

type

  { TMStream }

  TMStream = record
    len: SizeInt;
    Data: PByte;

    constructor FromString(const str: String);
    class operator Copy(constref aSrc: TMStream; var aDst: TMStream); inline;
    class operator Initialize(var a: TMStream); inline;
    class operator Finalize(var a: TMStream); inline;
    class operator :=(const str: String): TMStream; inline;
  end;

{ TMStream }

constructor TMStream.FromString(const str: String);
begin
  len := Length(str);
  Data := ReAllocMem(Data, len);
  Move(str[1], Data^, len);
end;

class operator TMStream.Copy(constref aSrc: TMStream; var aDst: TMStream);
begin
  aDst.len := aSrc.len;
  aDst.Data := ReAllocMem(aDst.Data, aDst.len);
  Move(aSrc.Data^, aDst.Data^, aDst.len);
  WriteLn('Move');
end;

class operator TMStream.Initialize(var a: TMStream);
begin
  a.len:=0;
  a.Data:=nil;
end;

class operator TMStream.Finalize(var a: TMStream);
begin
  if Assigned(a.Data) then Freemem(a.Data);
end;

class operator TMStream.:=(const str: String): TMStream;
begin
  Result := TMStream.FromString(str);
end;

var
  t: TMStream;
begin
  t := 'foo';
  ReadLn;
end.
The whole content of the string is copied twice just for the initialization. Even without the implicit operator (i.e. t := TMStream.FromString('foo');) there is still one complete copy of the whole string.

In C++ this would look like this:
Code: C++
#include <iostream>
#include <cstdlib>
#include <cstring>

struct MStream {
    void *data = nullptr;
    int len = 0;

    MStream(char const *str) {
        len = std::strlen(str);
        data = std::malloc(len);
        std::memcpy(data, &str[0], len);
    }
    MStream(MStream const &copy): len(copy.len) {
        data = std::realloc(data, len);
        std::memcpy(data, copy.data, len);
        std::cout << "Copy\n";
    }
    MStream(MStream &&move): len(move.len) {
        data = move.data;
        move.len = 0;
        move.data = nullptr;
        std::cout << "Move\n";
    }
    ~MStream() {
        if (data) {
            std::free(data);
        }
    }
    MStream &&operator =(char const *value) {
        return std::move(MStream(value));
    }
};

int main() {
    MStream m = "foo";
    return 0;
}
It does not. (In fact, even at -O0, GCC optimizes the operator completely away and the constructor is done in place, which means not a single move or copy happens; but if the code were complex enough not to be optimized away, it would result in a move operation, not a copy operation.)

Sure, you can also write equivalent code in Pascal by not using assignment operators and constructors for records, and using in-place functions instead. But this is a lot more effort. In general, if you want to write efficient code, it is much more effort in Pascal than it is in C++, because the language designs emphasize this differently. Therefore, if you write good C++ code, it will also be efficient. If you write good Pascal code, you need to take extra steps to make it efficient.

Honestly, whenever I need to write very performance-relevant code I don't even bother to use FPC, because C++ code is easier to write efficiently, while optimizing Pascal code for things like move semantics often makes the code less readable and therefore worse. FPC and Lazarus are great for some things, but I think I would go insane if I tried to get the same level of (manual) optimization I get in C++ just by using that language as intended, by changing my Pascal code.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 20, 2020, 06:20:53 pm
@Warfley: Are you sure your example is correct?

The Pascal code copies the content of the string only where you use "move". And in those locations the C code also uses "memcpy", which I am pretty sure copies the content.

In all other cases it is pointer operations.

In Pascal, passing a string (long string, $H+) is a pointer operation. Always!
Even assignment is. (Strings support copy-on-write, so they are not copied until you modify the content. At that point there is no way of avoiding the copy.)

You do not even need "const s: string" for the parameter. A non-const string parameter is still passed by pointer (a copy of the pointer, not a pointer to a pointer).



A lot of other data also works via pointer in Pascal:
- Objects (instances of classes, not old style object).
- Dyn Array

On the other hand records are passed by value. But you can specify "var" or "constref" depending on what you need.



Also, in your C++ code the "string" is actually a PChar (if you want to match the data type exactly in Pascal).

The big difference is that in Pascal you have some hidden pointers. In C, strings and (dynamic) arrays are usually done as pointers. In Pascal there is a data type that abstracts that pointer away from the user's responsibility. Yet in Pascal you can do explicit pointers too.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 20, 2020, 07:15:03 pm
@Warfley: Are you sure your example is correct?
Yes, it's about the copy operator that is overloaded. This is used when assigning variables of the same record type:
Code: Pascal
var
  t1, t2: TMStream;
begin
  t2 := t1; // internally this will be compiled to TMStream.Copy(t1, t2);
end;
Quote
A lot of other data also works via pointer in Pascal:
- Objects (instances of classes, not old style object).
- Dyn Array

On the other hand records are passed by value. But you can specify "var" or "constref" depending on what you need.
But this is the thing: with the management operators you can define your own assignment semantics for records. This is useful for having managed datatypes inside your records. There are multiple reasons why these are very useful; most importantly, it lets you write local datatypes that do not have the overhead of classes and that support things like operator overloading.

The problem here is that as soon as you use operator overloading, the amount of copies is really annoying. For example I wrote a gmp wrapper recently, here are some parts of the code:
Code: Pascal
class operator TAPInteger.Finalize(var a: TAPInteger);
begin
  mpz_clear(a.FData);
end;

class operator TAPInteger.Copy(constref aSrc: TAPInteger;
  var aDst: TAPInteger);
begin
  mpz_set(aDst.FData, PAPInteger(@aSrc)^.FData); // deep-copies the whole data
end;

class operator TAPInteger.+(constref lhs: TAPInteger; constref
  rhs: TAPInteger): TAPInteger;
begin
  mpz_add(Result.FData, PAPInteger(@lhs)^.FData, PAPInteger(@rhs)^.FData);
end;
The expression c := a + b; would create a temporary object that is used as the result of the + operation, which is then copied into c using the copy operator.
I wanted to implement some crypto algorithms just out of interest, i.e. not write production-ready code, so the copies are not a problem for me. But if you wanted to deploy this code on a server that has to be quick when establishing handshakes and such (as each copy would need to copy around 500 bytes), this would be a problem.
The C++ OOP wrapper for GMP uses move semantics for this: a temporary object gets created, but the assignment to c only copies the pointer, not the whole data.
To do this in Pascal you would need to go completely without operator overloading, and personally I think this makes the code much worse. Just look at the RSA key generation:
Code: Pascal
p := TAPInteger.RandomPrime;
q := TAPInteger.RandomPrime;
m := p * q;
phi := (p-1) * (q-1);
pub := 65537;
priv := pub.inverse(phi);
this is much better than writing the following:
Code: Pascal
mpz_init(p);
mpz_init(q);
mpz_init(m);
mpz_init(phi);
mpz_init_set_ui(pub, 65537);
mpz_init(priv);
generateRandomPrime(p);
generateRandomPrime(q);
mpz_mul(m, p, q);
mpz_init(phi_p);
mpz_init(phi_q);
mpz_sub_ui(phi_p, p, 1);
mpz_sub_ui(phi_q, q, 1);
mpz_mul(phi, phi_p, phi_q);
mpz_clear(phi_p);
mpz_clear(phi_q);
mpz_inverse(priv, pub, phi);
In C++ you can write code like the former with literally no drawbacks; in Pascal, if you need performance, you have to write the latter if you don't want to lose a lot of performance due to copying.

PS: I should note that due to this bug (#37164) (https://bugs.freepascal.org/view.php?id=37164) (double call to the finalize operator in functions that return managed records), management operators are currently completely unusable (as they simply cannot be used for return values, and therefore constructors and operators are unusable), and my project actually didn't go anywhere because of this. But this thread is about the conceptual design of the language, not bugs in the compiler. And still, even when this bug gets fixed and I continue my projects, the code I write will not be production-ready due to this massive overhead, and if I ever need to use GMP in production I will probably resort to C++ instead.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 20, 2020, 07:42:43 pm
@Warfley

I think Delphi 10.3 supports those, as in C++:
https://blogs.embarcadero.com/custom-managed-records-coming-in-delphi-10-3/

Not sure for now if FPC 3.2 can achieve this; I need to look.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 20, 2020, 07:56:20 pm
Quote
@Warfley

I think Delphi 10.3 supports those as in C++
https://blogs.embarcadero.com/custom-managed-records-coming-in-delphi-10-3/

Not sure for now if FPC 3.2 could achrive this need to look..

Theoretically, FPC already supports them (but, as mentioned above, they are currently unusable due to a bug). But my point is that even considering this, move semantics are missing. That means you often copy data from temporary objects instead of just grabbing their pointers. As these objects are temporary, they don't need that pointer afterwards, so you can save a lot of performance by doing so.

Example from C++:
Code: C++
std::vector<int> v1{1,2,3,4}, v2;
v2 = v1;            // copies all data from v1
v2 = std::move(v1); // treats v1 as a temporary object and moves its data; v1 is now empty and v2 contains all the data from v1
For example, the result of a function is always a temporary object, meaning there is no point in copying data if you can move it instead.

It should be noted that, besides this, C++ compilers generally use return value optimization: instead of creating a temporary object that is returned by a function and then moved or copied, the compiler will simply write into the target object if possible, omitting any move or copy. This is an optimization FPC could also greatly benefit from. But even without it, move semantics make handling complex datatypes via copy assignments much easier.
Title: Re: How optimized is the FPC compiler
Post by: Thaddy on December 20, 2020, 08:04:57 pm
I strongly disagree, the language can emphasize more or less efficient programming. For example in C++ you can distinguish between move and copy semantic on assignments. For example if
Nonsense. FreePascal allows the same constructs, but with a slightly more complex syntax.
And C++ is a bad habit language anyway, so no wonder the Pascal solution is more complex.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 20, 2020, 08:10:43 pm
@Thaddy

can you give an example of how it would be done in Pascal? I'm really interested :P
Title: Re: How optimized is the FPC compiler
Post by: marcov on December 20, 2020, 08:14:45 pm
Quote
IOW, it is not the language itself.
I strongly disagree, the language can emphasize more or less efficient programming. For example in C++ you can distinguish between move and copy semantic on assignments.

Sure, but IMHO that is stuff at the fringes and does not justify the over-broad statement that you make. You also don't really specify any numbers or scenarios where this matters.

I do, however, acknowledge that some things are (still) less polished, e.g. something like generics' TDictionary suffers from being awkward for value types, making them hard to mutate during iteration.

I would chalk that up as a "more flexible STL implementation" win for C++, though, not performance.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 20, 2020, 08:52:57 pm
Quote
Sure, but IMHO that is in stuff in the fringes and does not justify the over-broad statement that you make.  You also don't really specify any numbers or scenarios where this matters.
See a few posts above, the GMP example. In general, it matters when overloading operators that return complex types. Another example: I implemented a set type where union, complement, etc. can be expressed via arithmetic operators. There, the results of the operator functions always need to be copied.
I also often use implicit casts, as seen in my APInteger example above. This often leads to not just one but two unnecessary copies.

Note that in C++ you can also overload the +=, -=, etc. operators to do the operation in place without creating a temporary object at all, which also benefits things like a set implementation, as s += s2 would be the same as s.addall(s2).

Quote
Nonsense. FreePascal allows the same constructs, but with a slightly more complex syntax.
But this is all I said. In C++, simple code is often very efficient; see the GMP example above: using the GMP classes with arithmetic operators is no less efficient than using the GMP low-level API. But in Pascal, to get good performance you need to use the low-level API, because move semantics are simply not part of the Pascal language design.
I did not say that C++ is more efficient; I said it puts more emphasis on writing efficient code. And you seem to agree: writing efficient code in Pascal is more complex than in C++. So if you try to keep your code simple, in Pascal there is often a tradeoff between simplicity and performance that simply doesn't exist in C++.

And personally, I think that as long as performance is not an issue you don't need to optimize your code; that's why I still often use Pascal, because most of the time performance is not an issue. But this thread is about performance, and if you want to write highly performant programs, with C++ you can get much cleaner code that is highly efficient, as opposed to Pascal, where such optimizations come at the cost of code complexity. And keeping code simple and readable is imho pretty much the single most important thing to consider when writing a program. So I won't use a language that requires me to write more complex code than necessary for a given problem.

But I would not use the qualifier "slightly". To take my example from above:
Code: Pascal
p := TAPInteger.RandomPrime;
q := TAPInteger.RandomPrime;
m := p * q;
phi := (p-1) * (q-1);
pub := 65537;
priv := pub.inverse(phi);
This is not just slightly less complicated than:
Code: Pascal
mpz_init(p);
mpz_init(q);
mpz_init(m);
mpz_init(phi);
mpz_init_set_ui(pub, 65537);
mpz_init(priv);
generateRandomPrime(p);
generateRandomPrime(q);
mpz_mul(m, p, q);
mpz_init(phi_p);
mpz_init(phi_q);
mpz_sub_ui(phi_p, p, 1);
mpz_sub_ui(phi_q, q, 1);
mpz_mul(phi, phi_p, phi_q);
mpz_clear(phi_p);
mpz_clear(phi_q);
mpz_inverse(priv, pub, phi);
The first one is clearly readable and easy to understand and write, while the second one is just worse in every regard. Even if you take away the initialization and clearing, since that can still be done using the management operators, you are left with something like:
Code: Pascal
generateRandomPrime(p);
generateRandomPrime(q);
mpz_mul(m, p, q);
mpz_sub_ui(phi_p, p, 1);
mpz_sub_ui(phi_q, q, 1);
mpz_mul(phi, phi_p, phi_q);
mpz_set_ui(pub, 65537);
mpz_inverse(priv, pub, phi);
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 20, 2020, 09:52:35 pm
@Warfley: Are you sure your example is correct?
Yes, it's about the copy operator that is overloaded. This is used when assigning variables of the same record type:

Ah, sorry. The following from your original post
Quote
The whole contents of the string is copied twice
threw me off.

I applied that to the content while it was in the string, but not once it was in the self-allocated memory.

Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 21, 2020, 06:07:13 am
I'm actually a bit surprised that these "move" semantics are not part of Pascal, since the idea of Pascal was to replace (even if it didn't succeed) the C/C++ family by providing exactly those kinds of native/low-level intrinsics/APIs, but exposing them in a cleaner and more maintainable way.

So it would actually be really nice to have FPC also support a move construct ("std::move") and the possibility to overload "+=" and "-=".

Even Delphi 10.3 allows copy constructors, where I think a lot of other languages deny such constructs. That's why Delphi/FPC should compete eye-to-eye with C/C++ in such regards, as they do with other constructs, like:

* unions
* general pointer arithmetic
* embedded assembler code

etc.

It has those constructs for a reason, so also adding the copy constructor, move semantics, and +=/-= overloads would mean a lot for more efficient coding. I agree with @Warfley on this.
Title: Re: How optimized is the FPC compiler
Post by: Thaddy on December 21, 2020, 10:11:58 am
Quote
* unions
* general Pointer-arithmetic
* embeded Assembler code
* Variant record fields
* {$pointermath on}
* inline assembler is fully supported on many targets.
Also note Pascal is older than C (1970  vs 1972  )
Title: Re: How optimized is the FPC compiler
Post by: Awkward on December 21, 2020, 10:12:48 am
Shpend, if you want so many C/C++ things, maybe it's better to use a C/C++ compiler and not try to transform Pascal into C++?
Title: Re: How optimized is the FPC compiler
Post by: marcov on December 21, 2020, 11:05:21 am
Quote
Shpend, if you want so much C/C++ things, maybe better to use C/C++ compiler and do not try transform Pascal to C++ ?

Well, before generics, operator overloading was used for a few special cases like the ubiquitous TComplex record, but with generics the number of applications becomes larger. C++ is generally more apt at using value types, and Delphi generics have large gaps there.

FPC 3.2.0 got the record management stuff, so such changes are not completely out of the question.

I write speed-dependent (vision) applications, but don't use (or have a use case for) the examples that Warfley gives at all, so his general tenor (and subject) "not fit for performance applications", based on a few details, ticked me off heavily. But I'm glad that the discussion is constructive again.

I wonder how many of these are already solvable using management operators though.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 21, 2020, 11:29:32 am
@marcov

Would you actually think that the things I mentioned, like "std::move" and "+=/-=", would make it into FPC? I think with those changes FPC could really compete with C++. And guys, am I wrong in thinking that Object Pascal was made to compete with C++ and plain Pascal was meant to compete with C? I think yes. So adding more record control would be highly appreciated, also for embedded systems or low-level engine programming; IMHO these are very desirable features. I mean, we are talking about a language which is in itself beautifully designed and has nearly everything C++ offers in terms of language constructs, so why not add those features to complete it :P
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 21, 2020, 11:34:19 am
Quote
* unions
* general Pointer-arithmetic
* embeded Assembler code
* Variant record fields
* {$pointermath on}
* inline assembler is fully supported on many targets.
Also note that Pascal is older than C (1970 vs. 1972).
Btw, you only wrote what I already mentioned, @thaddy :D My argument was that I love that FPC already has those things, and that it would highly benefit from the other constructs C++ offers, honestly.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 21, 2020, 11:38:35 am
I write speed dependent (Vision) applications, but don't use (or have an use case) for the examples that Warfley gives at all

That's not an argument, mate. By that logic, a C# 2D engine whose source code I recently saw (written entirely against .NET 4.0, despite having access to Core 3.1 LTS) would mean that .NET doesn't have to do any optimizations or offer any language features, because they are apparently not of much importance, since that 2D engine didn't need them. I'm not a big fan of that type of argument, tbh.
Title: Re: How optimized is the FPC compiler
Post by: lucamar on December 21, 2020, 12:23:48 pm
am I wrong in thinking that Object Pascal was made to compete with C++ and plain Pascal was meant to compete with C?

Yes, that's wrong. Pascal was created as a (kind of) substitute for Algol, when Wirth got tickled because the Algol committee didn't accept his Algol-W extensions/simplifications for the second Algol standard (later Algol 68), and because he wanted a better-structured language for teaching. Nothing to do with C at all, other than that they are "cousin" descendants of Algol, and both were created at around the same time (early 70s, though IIRC Pascal was created/published before C).

Later, in the mid-80s, Apple needed object extensions to Pascal for their then-new Macintosh computer (and the earlier Lisa), so they added them with inspiration from various OO languages (like Smalltalk). OOP was all the rage then, so in the late 80s Borland (and others) took features from both (Mac) Object Pascal and Stroustrup's early attempts at C++, though they took abstract concepts rather than concrete paradigms/syntax from the latter.

You can find a brief (and not very exact) history of Object Pascal and a slightly better one of Pascal in the Wikipedia. ;)
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 21, 2020, 01:27:44 pm
I'm actually a bit surprised that these "move" semantics are not part of Pascal, since the idea of Pascal was to replace the C/C++ family (even if it did not succeed) by providing exactly those kinds of native/low-level intrinsics/APIs, but exposing them in a cleaner and more maintainable way.

The management operators were only released with 3.2.0; they are a relatively new feature. And Pascal as such does not know the concept of move semantics, so there was no need to include it in the design.

So it would actually be really nice to have FPC also support a move keyword ("std::move")

If we can determine clear rules for when the compiler should use move semantics, then one can talk about it. Though even then, FPC is a project developed by volunteers in their free time. If none of the devs is interested in that... tough luck... *shrugs*

and the possibility to overload "+=" and "-="

These operators are simply syntactic sugar and nothing more.
Title: Re: How optimized is the FPC compiler
Post by: marcov on December 21, 2020, 01:45:35 pm
Would you actually think that the things I mentioned, like "std::move" and "+=/-=", would make it into FPC?

I'm no compiler implementer. I just agree, based on my own practice, that the Delphi/FPC dialect has some weaknesses in efficient value-type handling in some constructs. Also, it is not just about final code efficiency, but also about syntax-related oddities, like most container types having an iterator type that is defined by value, making it impossible to mutate the element's fields in a for..in loop, since the loop/iterator variable is a copy.

I don't know if that warrants extensions, and if so with which priority. I am also deliberately vague about the form of the extension, which is reasonable caution if you look at e.g. the std::move (https://en.cppreference.com/w/cpp/utility/move) definition with a "class" as argument, where "class" is a reference type in FPC.

So a lot more research and use cases would need to be presented, rather than just mumbling "std::move" or "copy from C++" or droning on about performance (which matters only for a fairly small class of performance-hungry applications).

I just don't discount the notion that sooner or later something needs to be done for (more) efficient value-type processing in some constructs (the operator overloading as commented on by Warfley, and my own preferences, which are more STL/generics oriented).

But that discussion should be fact-based and to the point, and even then the question remains who would implement it.
Title: Re: How optimized is the FPC compiler
Post by: marcov on December 21, 2020, 01:53:36 pm
That's not an argument, mate. By that logic, a C# 2D engine whose source code I recently saw (written entirely against .NET 4.0, despite having access to Core 3.1 LTS) would mean that .NET doesn't have to do any optimizations or offer any language features, because they are apparently not of much importance, since that 2D engine didn't need them. I'm not a big fan of that type of argument, tbh.

Then why isn't this the case for C++ too? This kind of stuff is likely to be the rate-determining step only for a very small group (and even they could simply write it out, or generate it).

I'm somewhat similar to the C# example in that the first primitives operating on an image are by far the most rate-determining step, as long as the other factors are handled somewhat decently (nothing spectacularly complicated or fancy).

So in most of my code I can use HLL code just fine, just not on every pixel.

Some exceptions to that are blob (which would probably be much faster with a better compiler, as it is a very complex loop) and the mixed-radix FFT routine, which I use for filtering all incoming data in some routines. I have some SSE code for that, but it is not live yet.

Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 21, 2020, 02:04:14 pm
I understand, @PascalDragon. I do respect and fully acknowledge, of course, that FPC does not have a big company behind it, and it has become a great compiler regardless. I'm only saying that, IMHO, FPC could easily compete in a language-to-language "battle" with C++ with those relatively minor additions, since FPC is already quite similar to C++ in terms of capability, I guess at least.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 21, 2020, 04:22:29 pm

am I wrong in thinking that Object Pascal was made to compete with C++ and plain Pascal was meant to compete with C?
I don't think Object Pascal was made to compete with C++, at least not what was implemented in Delphi over time.
Classes in particular are weird if you look at Pascal's history. They are always placed on the heap, always referenced by pointers, and completely hide this from the user, yet still require manual memory management, even though this is completely different from the regular memory management (new, dispose) used by old-school objects.
These classes, and the subsequent implementation of the RTL and VCL using them, result in Delphi becoming much more similar to Java than to C++.

From a performance-programming perspective this makes absolutely no sense. For example, using a TStringList to split a string requires the additional allocation of the class on the heap. With an old-style object, the variables that the string list uses internally would be placed on the stack, and there would be no more overhead than if no OOP was used at all.
Classes always bring additional overhead, partly due to manual memory management. Also, the standard libraries make heavy use of abstract base classes and inheritance with virtual methods; in fact some methods, like the destructor, must always be virtual. If you look at C++, this is not the case. Sure, C++ standard library implementations also make heavy use of inheritance to reduce code complexity, but in most of the classes like vector, set, etc. you won't find any virtual methods. More often than not, virtual methods are avoided using the CRTP (https://www.wikiwand.com/en/Curiously_recurring_template_pattern) idiom, which allows for static or bounded polymorphism (https://www.wikiwand.com/en/Template_metaprogramming#/Static_polymorphism) using templates. This limits or completely avoids virtual call chains, makes a lot of the code inlinable, and gets rid of virtual table jumps.
Another thing I noticed about C++ is the heavy use of templates (generics in Pascal) for class configuration. For example, to implement a custom sorting order for a TStringList, you set a function pointer in the TStringList instance. In C++ you configure such things via template parameters. The consequence is that in Pascal this code cannot be optimized, as the compiler does not know at compile time what function will be used, while in C++ this is entirely a compile-time decision.

With all of that, I don't think that Delphi is (anymore) really a competitor to C++; it is much more going in the Java direction. This is, btw, IMHO not a bad thing. Java is a successful language because it makes a lot of things very easy, and so does Delphi. CRTP is much more complicated than classical inheritance; C++ templates are Turing-complete, which makes them great for moving computations to compile time and giving the optimizer more to work with, but simultaneously template programming is really complicated.
Whether this is a good or a bad thing is something everyone can evaluate for themselves. If you don't need maximum efficiency, this kind of stuff would just require much more effort from people to learn and use Pascal. And still, it is not hard to use a C++ library from Pascal, so one can simply use the language best fitted for the task.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 21, 2020, 04:35:41 pm
I wonder how many of these are already solvable using management operators though.
Since the release of FPC 3.2 I have actually been working on several projects where I try to use them to create more efficient alternatives to things commonly done using classes, and personally I think they are a great feature that allows for a lot of new high-level expressions that were impossible before.

But sadly, as I already mentioned before, due to this bug (https://bugs.freepascal.org/view.php?id=37164) they are simply not usable currently, at least not in situations where you rely on finalization: when a function returns a managed record, it gets freed while still in use. This can result in double-free or use-after-free errors, etc., and basically means that for anything that needs to be managed, the management operators are not usable. So all these projects of mine are currently on hold.

But, using managed records instead of classes, with generics for simple inheritance, I have already built some types that allow for pretty nice, highly efficient usage. So yes, theoretically there is a lot one can do with management operators.

One example is enumerators: why are enumerators so often implemented as classes? If you enumerate only a few elements, creating and freeing the class makes a massive performance difference. Using managed records (in many cases normal records are enough) can massively improve performance when you have few elements and a small loop body (i.e. when memory management dominates the runtime).
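As a sketch of that idea, here is a for..in enumerator implemented as a plain (non-managed) record, so the whole loop involves no heap allocation and no Create/Free pair (all names are invented for illustration):

Code: Pascal
program RecordEnumerator;
{$mode objfpc}
{$modeswitch advancedrecords}

type
  // Stack-allocated enumerator: no heap allocation, nothing to free
  TRangeEnumerator = record
    FCurrent, FLast: Integer;
    function MoveNext: Boolean;
    property Current: Integer read FCurrent;
  end;

  TRange = record
    First, Last: Integer;
    function GetEnumerator: TRangeEnumerator;
  end;

function TRangeEnumerator.MoveNext: Boolean;
begin
  Inc(FCurrent);
  Result := FCurrent <= FLast;
end;

function TRange.GetEnumerator: TRangeEnumerator;
begin
  Result.FCurrent := First - 1;
  Result.FLast := Last;
end;

function Range(AFirst, ALast: Integer): TRange;
begin
  Result.First := AFirst;
  Result.Last := ALast;
end;

var
  i: Integer;
begin
  for i in Range(1, 3) do
    Write(i, ' '); // 1 2 3
  WriteLn;
end.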
Title: Re: How optimized is the FPC compiler
Post by: nanobit on December 21, 2020, 05:29:13 pm
I would really love to see how is the current state of the FPC, what could be done better and are ppl interessted in  doing so)

My observation over the last few years (earlier I don't know) is:
A lot of time is spent on level 3+ optimization. But the real question is how many actually use it.
Personally, in order to minimize the number of bugs (which is my top priority),
I don't dare to use more than level 2 and I'm happy with that approach.

In addition, FPC allows you to write SSE algorithms or use external libraries.
For SSE I would take a look at https://ispc.github.io/index.html and build a DLL.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 21, 2020, 05:40:12 pm
Ok thx for the clarification :)

Yeah, I see your point, @Warfley: Delphi, and subsequently FPC, orients more towards C#/Java with a tendency towards C++, sort of a hybrid. But I still think it wouldn't hurt the language to adopt some effective C++ possibilities (like this entire managed-records stuff and move semantics) to allow for efficient code when needed, since FPC is used, as I saw a while back, in emulators and other low-level code that could benefit from those optimizations.

@nanobit
yeah, will do that.

BTW: is it actually possible to just build a C++ wrapper around the entire std:: (or whatever is needed from it) and then use it from Pascal?
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 21, 2020, 06:12:08 pm
These operators are simply syntactic sugar and nothing more.
Sure, this *is* the case, but the question is *should* that be the case?

There are clear advantages to having separate +=, -=, etc. operators, as they do not need a temporary object and can more often than not work in place.
Example with strings: str += str2 can be implemented either as:
Code: Pascal
setlength(tmp, length(str) + length(str2));
move(str[1], tmp[1], length(str));
move(str2[1], tmp[length(str) + 1], length(str2));
str := tmp;
or implemented as:
Code: Pascal
oldlen := length(str);
setlength(str, oldlen + length(str2));
move(str2[1], str[oldlen + 1], length(str2));
In the worst case, the latter code is equivalent to the former (if the memory manager can't simply append enough space, or the refcount is > 1). But in a lot of cases the latter one is massively more efficient, as it avoids a complete string copy.

Therefore, even on standard types, having these as separate operators can have massive benefits. I honestly don't see why this shouldn't be implemented.

but I still think it wouldn't hurt the language to adopt some effective C++ possibilities (like this entire managed-records stuff and move semantics) to allow for efficient code

I agree; this post was simply a statement on what Delphi is and why it is so different in so many regards from, for example, C++. These were deliberate design decisions. Delphi simply was not made to be like C++. This of course does not imply in any way that it cannot adopt ideas from that language in the future. Management operators, especially the way they are implemented in Delphi (using constructors, destructors and the assign operator), are an example of this.
Title: Re: How optimized is the FPC compiler
Post by: Awkward on December 21, 2020, 06:46:42 pm
Warfley, didn't you notice that your string code examples differ only at your "level", but are still almost the same at the low level? When you change the size of a string, the memory manager will combine a new string allocation and a copy anyway.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 21, 2020, 07:03:28 pm
Warfley, didn't you notice that your string code examples differ only at your "level", but are still almost the same at the low level? When you change the size of a string, the memory manager will combine a new string allocation and a copy anyway.
It only does so if it cannot extend the string in place, as I said:
In the worst case, the latter code is equivalent to the former (if the memory manager can't simply append enough space, or the refcount is > 1). But in a lot of cases the latter one is massively more efficient, as it avoids a complete string copy.
If the refcount of the string is 1 and there is free memory behind that block, the MM will simply extend the block. Also, I don't know if the FPC MM does this, but many memory managers, to avoid fragmentation, over-allocate memory so it fits a certain size (e.g. a multiple of 16 bytes). If that is the case, appending fewer bytes than the over-allocation is actually completely free.

My example is, in the worst case, only as bad as the solution using temporary storage. But in a lot of cases it can perform massively better.
Title: Re: How optimized is the FPC compiler
Post by: BeniBela on December 21, 2020, 11:36:56 pm
So actually this would be really nice to have FPC support also the Move-keyword ("std::Move")

Oh no, that is incredibly confusing in C++.

There has to be a way to do it with a less  confusing syntax.

In practice, you can get move semantics in Pascal by assigning Default() to the target, Move-ing the source over the target, and FillChar-ing the source with zeroes.
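A minimal sketch of that recipe for a single string variable (MoveString is a made-up helper, not an RTL routine; it assumes no other reference needs the source to keep its value):

Code: Pascal
program MoveDemo;
{$mode objfpc}{$H+}

// Hand-rolled "move": transfers ownership of a managed value
// without touching its reference count.
procedure MoveString(var Src, Dst: String);
begin
  Dst := Default(String);            // finalize the old target value
  Move(Src, Dst, SizeOf(String));    // raw copy of the string pointer, no refcount change
  FillChar(Src, SizeOf(String), 0);  // zero the source so finalization won't double-free
end;

var
  a, b: String;
begin
  a := 'hello';
  MoveString(a, b);
  WriteLn(b);      // hello
  WriteLn(a = ''); // TRUE: the source was emptied
end.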



I just agree, based on my own practice, that the Delphi/FPC dialect has some weaknesses in efficient value-type handling in some constructs. Also, it is not just about final code efficiency, but also about syntax-related oddities, like most container types having an iterator type that is defined by value, making it impossible to mutate the element's fields in a for..in loop, since the loop/iterator variable is a copy.

The worst is when the copy updates a reference count


In many of my collections the enumerator returns a pointer to the data

These classes, and the subsequent implementation of the RTL and VCL using them, result in Delphi becoming much more similar to Java than to C++.

From a performance-programming perspective this makes absolutely no sense. For example, using a TStringList to split a string requires the additional allocation of the class on the heap. With an old-style object, the variables that the string list uses internally would be placed on the stack, and there would be no more overhead than if no OOP was used at all.
Classes always bring additional overhead, partly due to manual memory management.

Worst thing Borland ever did

Another thing I noticed about C++ is the heavy use of templates (generics in Pascal) for class configuration. For example, to implement a custom sorting order for a TStringList, you set a function pointer in the TStringList instance. In C++ you configure such things via template parameters. The consequence is that in Pascal this code cannot be optimized, as the compiler does not know at compile time what function will be used, while in C++ this is entirely a compile-time decision.

Even if it was configured via template/generic parameters, Free Pascal could not optimize it, since it is not that good at optimizing anything.


Since the release of FPC 3.2 I have actually been working on several projects where I try to use them to create more efficient alternatives to things commonly done using classes, and personally I think they are a great feature that allows for a lot of new high-level expressions that were impossible before.

Me too!

But sadly, as I already mentioned before, due to this bug (https://bugs.freepascal.org/view.php?id=37164) they are simply not usable currently, at least not in situations where you rely on finalization: when a function returns a managed record, it gets freed while still in use. This can result in double-free or use-after-free errors, etc., and basically means that for anything that needs to be managed, the management operators are not usable. So all these projects of mine are currently on hold.

6 months old? They really should have fixed it by now

A lot of time is spent on level 3+ optimization. But the real question is how many actually use it.
Personally, in order to minimize the number of bugs (which is my top priority),
I don't dare to use more than level 2 and I'm happy with that approach.

I mostly use level 1 or 2, since level 3 has crashed far too often, especially on ARM.
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 22, 2020, 02:02:37 pm
From a performance-programming perspective this makes absolutely no sense. For example, using a TStringList to split a string requires the additional allocation of the class on the heap. With an old-style object, the variables that the string list uses internally would be placed on the stack, and there would be no more overhead than if no OOP was used at all.
Classes always bring additional overhead, partly due to manual memory management. Also, the standard libraries make heavy use of abstract base classes and inheritance with virtual methods; in fact some methods, like the destructor, must always be virtual. If you look at C++, this is not the case. Sure, C++ standard library implementations also make heavy use of inheritance to reduce code complexity, but in most of the classes like vector, set, etc. you won't find any virtual methods. More often than not, virtual methods are avoided using the CRTP (https://www.wikiwand.com/en/Curiously_recurring_template_pattern) idiom, which allows for static or bounded polymorphism (https://www.wikiwand.com/en/Template_metaprogramming#/Static_polymorphism) using templates. This limits or completely avoids virtual call chains, makes a lot of the code inlinable, and gets rid of virtual table jumps.

The virtual method system is the strength of Delphi-style classes. Quite often I long for the capabilities of FPC's classes when working in C++ at work. And for this strength to play out, you can't place the classes on the stack, which was probably the main reason Borland decided that classes live solely on the heap.

These operators are simply syntactic sugar and nothing more.
Sure, this *is* the case, but the question is *should* that be the case?

Yes, because nowadays we wouldn't even add them to the language anymore. They are simply there because someone added them in the past, when we didn't have a clear(er) picture in mind for (Object) Pascal (as well as << and >> ::) ).

But sadly, as I already mentioned before, due to this bug (https://bugs.freepascal.org/view.php?id=37164) they are simply not usable currently, at least not in situations where you rely on finalization: when a function returns a managed record, it gets freed while still in use. This can result in double-free or use-after-free errors, etc., and basically means that for anything that needs to be managed, the management operators are not usable. So all these projects of mine are currently on hold.

6 months old? They really should have fixed it by now

As written often enough already: FPC is developed by volunteers in their free time. When I don't have the time or interest to investigate a bug, well, tough luck.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 08:36:21 am
But what I'm really curious about is: what are "object" types then? Aren't they supposed to be C++-style classes, which can live on the stack and still have polymorphism, inheritance, etc. available?

Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 03:22:29 pm
Well, they are dead. Except for inheritance, advanced records can do the same but better. There also seems to be no interest in further development of them, neither here in the FPC community nor in Delphi.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 03:33:04 pm
But what I'm really curious about is: what are "object" types then? Aren't they supposed to be C++-style classes, which can live on the stack and still have polymorphism, inheritance, etc. available?

They are a "limited attempt"...

Just as records, they do not allocate heap memory, but rather live on the stack. And while they have some support for inheritance, polymorphic assignment requires that the subtypes have no additional fields. (You can, AFAIK, do virtual/overridden methods.)

Imagine
Code: Pascal
type
  TBase = object
    a: Integer;
    // add your constructor and methods
  end;

  TAdvanced = object(TBase)
    extra: Double;
  end;

function Bar: TAdvanced;
begin
  Result.Create;
end;

procedure Foo;
var b: TBase;
begin
  b := Bar();
end;

b only has space for the fields of TBase.
So in the above example the remainder gets cut off. (Even if you cast back to TAdvanced, it will not come back; it is lost forever. It may even crash if you try to access extra.)
b.method() would still call TAdvanced.method, if that was virtual/overridden.

That is why the newer classes are on the heap => the caller does not need to know how much extra memory may be required.





Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 03:52:52 pm
This is not true: with regard to "basic" inheritance (i.e. no interfaces), objects have the same capabilities as classes. Classes just add a coat of syntactic sugar on top:

Code: Pascal
program Project1;

{$mode objfpc}{$H+}
{$modeswitch typehelpers} // needed for Integer.ToString in mode objfpc

uses
  SysUtils;

type

  { TBase }

  PBase = ^TBase;
  TBase = object
  private
    constructor Init(const AValue: Integer);
    destructor Destroy; virtual;
  public
    A: Integer;
    function Print: String; virtual;
    class function Create(AValue: Integer): PBase; static;
    procedure Free;
  end;

  { TChild }

  PChild = ^TChild;
  TChild = object(TBase)
  private
    constructor Init(const AValue, BValue: Integer);
  public
    B: Integer;
    function Print: String; virtual;
    class function Create(AValue, BValue: Integer): PChild; static;
  end;

constructor TBase.Init(const AValue: Integer);
begin
  Self.A := AValue;
end;

destructor TBase.Destroy;
begin
  // Nothing to do here
end;

function TBase.Print: String;
begin
  Result := 'A: ' + A.ToString;
end;

class function TBase.Create(AValue: Integer): PBase;
begin
  Result := GetMem(SizeOf(TBase));
  Result^.Init(AValue);
end;

procedure TBase.Free;
begin
  Destroy;
  FreeMem(@Self);
end;

{ TChild }

constructor TChild.Init(const AValue, BValue: Integer);
begin
  inherited Init(AValue);
  Self.B := BValue;
end;

function TChild.Print: String;
begin
  Result := inherited Print + ' B: ' + B.ToString;
end;

class function TChild.Create(AValue, BValue: Integer): PChild;
begin
  Result := GetMem(SizeOf(TChild));
  Result^.Init(AValue, BValue);
end;

var
  t1, t2: PBase;
begin
  t1 := TBase.Create(42);
  t2 := TChild.Create(42, 32);
  WriteLn(t1^.Print);
  WriteLn(t2^.Print);
  t1^.Free;
  t2^.Free;
  ReadLn;
end.

By adding {$ModeSwitch autoderef}, the usage of the objects simplifies even further:
Code: Pascal
var
  t1, t2: PBase;
begin
  t1 := TBase.Create(42);
  t2 := TChild.Create(42, 32);
  WriteLn(t1.Print);
  WriteLn(t2.Print);
  t1.Free;
  t2.Free;
  ReadLn;
end.

This is exactly why I like objects so much. Sure, they are a little more cumbersome, but with a little syntactic sugar they would be pretty much strictly better than classes, because they simply can do more.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 04:07:38 pm
Guys, I gotta admit I'm confused. If what @Warfley says is true, I don't see any reason to use advanced records IF you need polymorphism and inheritance and, of course, stack allocation.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 04:13:28 pm
And @Warfley, can you do this, taken from your example:

Code: Pascal
var
  t1, t2: PBase;
begin
  t1 := TBase.Create(42);
  t2 := TChild.Create(42, 32);

  t1 := t2 as TBase;
  t1.A := 100;
  t1.B := 101; {should give Error!}

  WriteLn(t1^.Print);
  WriteLn(t2^.Print);
  t1^.Free;
  t2^.Free;
  ReadLn;
end.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 04:22:25 pm
Because objects have not been updated in a long time; for example, you cannot use class operators with them, which means that if you want operators, you'd better not use generics.
Records are better than objects in everything but inheritance. Personally I came up with some alternatives to implement polymorphism. Generics can help (i.e. a base that gets the child as a generic argument and calls its methods instead, with some naughty casts between unrelated pointer types), or manually building a vtable:
Code: Pascal
  PStreamVMT = ^TStreamVMT;
  TStreamVMT = record
    Read: Pointer;
    Write: Pointer;
    Position: Pointer;
    Size: Pointer;
    Seek: Pointer;
    EOF: Pointer;
  end;

  { TRStream }

  TRStream = record
  private
    FVMT: PStreamVMT;
    FStream: Pointer;
  public
    function Read(out ADst; Size: SizeInt): SizeInt; {$IFDEF INLINING}inline;{$ENDIF}
    function Position: SizeInt; {$IFDEF INLINING}inline;{$ENDIF}
    function Size: SizeInt; {$IFDEF INLINING}inline;{$ENDIF}
    procedure Seek(APosition: SizeInt; Mode: TSeekMode = smAbsoluteFront); {$IFDEF INLINING}inline;{$ENDIF}
    function EOF: Boolean; {$IFDEF INLINING}inline;{$ENDIF}

    property RawPointer: Pointer read FStream;

    constructor Create(const AVMT: PStreamVMT; const AStream: Pointer);

...

function TRStream.Read(out ADst; Size: SizeInt): SizeInt;
var
  m: TMethod;
begin
  m.Code := FVMT.Read;
  m.Data := FStream;
  Result := TStreamReadMethod(m)(ADst, Size);
end;

function TRStream.Position: SizeInt;
var
  m: TMethod;
begin
  m.Code := FVMT.Position;
  m.Data := FStream;
  Result := TStreamPositionMethod(m)();
end;

...
With a stream implementation then having the following implementation:
Code: Pascal
function TCURRENTCLASSNAME.GetInterface: TRStream;
const
  vmt: TStreamVMT = (Read: @TCURRENTCLASSNAME.Read;
                     Write: nil;
                     Position: @TCURRENTCLASSNAME.Position;
                     Size: @TCURRENTCLASSNAME.Size;
                     Seek: @TCURRENTCLASSNAME.Seek;
                     EOF: @TCURRENTCLASSNAME.EOF);
begin
  Result := TRStream.Create(@vmt, @self);
end;
Which is in an include file that is included like this:
Code: Pascal
{$Macro On}
{$Define TCURRENTCLASSNAME:=TFileReader}
{$Include ../internal/RStreamB.inc}
{$UnDef TCURRENTCLASSNAME}
{$Macro Off}

Pretty cumbersome, but it works.
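The other alternative mentioned above, the base receiving the child as a generic argument, can be sketched roughly like this. This is my own illustration, not Warfley's actual code; the names TPrinter and TConcrete are made up, and the "naughty casts" are avoided here by dispatching through the generic parameter at specialization time:

```pascal
program GenericDispatch;

{$mode objfpc}{$H+}
{$ModeSwitch advancedrecords}

type
  TConcrete = record
    A: Integer;
    function GetValue: Integer;
  end;

  // the "base" gets the child as a generic argument and calls its
  // methods directly; dispatch is resolved when the generic is
  // specialized, so no vtable and no heap allocation are involved
  generic TPrinter<T> = record
    class procedure PrintTwice(var Child: T); static;
  end;

function TConcrete.GetValue: Integer;
begin
  Result := A;
end;

class procedure TPrinter.PrintTwice(var Child: T);
begin
  WriteLn(Child.GetValue);
  WriteLn(Child.GetValue);
end;

type
  TConcretePrinter = specialize TPrinter<TConcrete>;

var
  c: TConcrete;
begin
  c.A := 7;            // stack instance, no constructor, no heap
  TConcretePrinter.PrintTwice(c);
end.
```

Any record with a matching GetValue can be plugged in as T; the trade-off against the vtable approach is that the dispatch is static, so one specialization handles exactly one concrete type.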
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 04:30:15 pm
And @Warfley can u do this taken from ur example:
The as operator does not work for objects, as it is syntactic sugar added to classes. I don't even know whether RTTI works for objects, but if it does, it should be possible to implement this manually with a function, or possibly even by overloading the as operator (if that is possible; I don't know right now).

But you can do something like this:
Code: Pascal
var
  b: PBase;
  c: PChild;
begin
  c := TChild.Create(42, 32);
  b := c; // no operator needed, the compiler recognizes the relationship between those pointers
  b^.A := 52;
  b^.B := 22; // of course throws an error because TBase has no B
  b^.Free;
end.
You only need a cast (which for classes could be done with as) when you cast upwards
Code: Pascal
var
  b: PBase;
  c: PChild;
begin
  b := TChild.Create(42, 32);
  c := PChild(b);
  c^.A := 52;
  c^.B := 22;
  b^.Free;
end.
As I said, all the simple inheritance stuff is possible; you are just missing a lot of the syntactic sugar that makes classes so neat to use.
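For reference, old-style objects also support virtual methods and dynamic dispatch through a base pointer; a minimal sketch (assuming {$mode objfpc}; the shape names are illustrative):

```pascal
program ObjectVirtual;

{$mode objfpc}{$H+}

type
  PShape = ^TShape;
  TShape = object
    // a constructor must run so the instance gets its VMT pointer set
    constructor Init;
    function Name: String; virtual;
  end;

  TCircle = object(TShape)
    // old-style objects override by re-declaring the method as virtual
    function Name: String; virtual;
  end;

constructor TShape.Init;
begin
end;

function TShape.Name: String;
begin
  Result := 'shape';
end;

function TCircle.Name: String;
begin
  Result := 'circle';
end;

var
  c: TCircle;
  p: PShape;
begin
  c.Init;            // stack instance, no heap allocation
  p := @c;           // polymorphic access through a base pointer
  WriteLn(p^.Name);  // dynamic dispatch picks TCircle.Name
end.
```

The constructor call is the one piece of sugar classes add for you; without it the VMT pointer is not initialized and virtual calls fail at run time.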
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 04:35:52 pm
Hmm, why don't they just completely overhaul objects and allow FULLY everything a class can do, with the only difference being that they live on the stack rather than the heap? And even better: if you, for some reason, don't want advanced records to function as stack-classes, then just don't include a custom macro (the name is mine, but you get the idea...)

So: if u want stack-classes (everything a class can do, but fully on the STACK), use:

Code: Pascal
{$StackClasses On}
{$AdvancedRecords}
// if u don't want the object behaviour, remove the {$StackClasses On} macro

And then, for instance, if FPC would also support move semantics (of course in a more stable version, with management records fully available)...
Title: Re: How optimized is the FPC compiler
Post by: Handoko on December 23, 2020, 04:53:34 pm
why don't they just completely overhaul objects and allow FULLY everything a class can do, with the only difference being that they live on the stack rather than the heap?

I used objects a long time ago. I even wrote my own version of TV using objects. I was so glad to know object; it's really powerful. But then I saw everyone abandon object and start using class. I didn't know why; information was hard to get and my English was not good enough to understand their explanations.

You can understand better object vs class here:
https://forum.lazarus.freepascal.org/index.php/topic,43622.0.html (https://forum.lazarus.freepascal.org/index.php/topic,43622.0.html)
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 23, 2020, 05:21:11 pm
Hmm, why don't they just completely overhaul objects and allow FULLY everything a class can do, with the only difference being that they live on the stack rather than the heap? And even better: if you, for some reason, don't want advanced records to function as stack-classes, then just don't include a custom macro (the name is mine, but you get the idea...)

Simple: the reason is backwards and Delphi-compatibility. This will trump any idea to rework the structured types (aka class, object, record).
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 05:31:26 pm
Ahh yeah, I totally forgot that.

Ok, just for me, since I'm a bit confused now about the differences between Objects and Classes:

I want to summarize; if something is incorrect, please correct the relevant numbers!

1) Objects fully support inheritance
2) Objects are placed fully on the stack and thus don't need any alloc/dealloc
3) Objects CANNOT overload operators
4) Objects CANNOT use generics
5) Objects CANNOT use management operators
6) Objects CANNOT use polymorphism

is this correct?
Title: Re: How optimized is the FPC compiler
Post by: Thaddy on December 23, 2020, 06:27:04 pm
No. You are trolling. Except for 5,
1, 2, 3, 4 and 6 are supported.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 06:30:28 pm
This is not true, with regards to the "basic" inheritance (i.e. no interfaces), objects have the same capabilities as classes. Classes just add a coat of syntactic sugar to it:

Yes, but then (your code example) you allocate the memory on the HEAP.

Title: Re: How optimized is the FPC compiler
Post by: Thaddy on December 23, 2020, 06:33:21 pm
That is a common mistake (not yours): objects are stack-oriented and classes are always on the heap...
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 06:47:21 pm
No. You are trolling. Except for 5,
1, 2, 3, 4 and 6 are supported.
I don't know, guys... everyone is telling me something different.

Warfley said they cannot use polymorphism, and hence he wrote some polymorphism himself?
He also said operator overloading is not possible (is, as and all the other stuff), same for generics.

Quote
Because objects have not been updated in a long time; for example, you cannot use class operators, which means that if you want to have operators you had better not use generics. Management operators are also not implemented for objects.
Records are better than objects in everything but inheritance. Personally I came up with some alternatives to implement polymorphism
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 06:57:06 pm
"Polymorphism" is partly supported.

You can pass a TFoo where a TFooBase is expected; TFoo, the inheriting class, is accepted.

But it only works if either
- both have the same memory size, or
- you add your own memory management.

For all your other points, I would have to check (docs, or trial and error) what the latest 3.2 (or even trunk) supports.

Anyway, the limitations on most of your points are a question of whether they are implemented in FPC or not.
The above polymorphism limit lies in the very design of (old style) objects. It will always exist. It cannot be overcome within the definition of what those objects are.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 07:08:59 pm
"Polymorphism" is partly supported.

You can pass a TFoo where a TFooBase is expected; TFoo, the inheriting class, is accepted.

But it only works if either
- both have the same memory size, or
- you add your own memory management.

For all your other points, I would have to check (docs, or trial and error) what the latest 3.2 (or even trunk) supports.

Anyway, the limitations on most of your points are a question of whether they are implemented in FPC or not.
The above polymorphism limit lies in the very design of (old style) objects. It will always exist. It cannot be overcome within the definition of what those objects are.

And what is the case for operators?
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 07:16:08 pm
And what is the case for operators?
As I said, I would need to check myself.

Just test it yourself; it should only take a minute.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 23, 2020, 08:12:43 pm
I tested it with this code:

Code: Pascal
type
  TTestObj = object
    public A: integer;
  end;

operator = (z1, z2: TTestObj) b : boolean;
begin
  result := z1.A = z2.A;
end;

operator + (z1, z2: TTestObj) b : integer;
begin
  result := z1.A + z2.A;
end;

var
  x, y: TTestObj;

begin
  x.A := 100;
  y.A := 323;
  writeln(x + y);
  readln;
end.

Seems to work, I'm impressed :O
If the FPC team allowed the full capability of management operators for these old-style objects (and again, if you guys decide to add move semantics, though in a nicer way than in C++, then also add them to objects), you would have pretty much complete freedom and control over memory and efficiency IMHO; this would allow for really efficient code which can really compete with C++ at its finest.
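For comparison, management operators on advanced records already automate this kind of acquire/release. A minimal sketch, assuming FPC 3.2+ with {$ModeSwitch advancedrecords}; TManaged is a made-up example type:

```pascal
program ManagementOps;

{$mode objfpc}{$H+}
{$ModeSwitch advancedrecords}

type
  TManaged = record
    Data: PInteger;
    class operator Initialize(var r: TManaged);
    class operator Finalize(var r: TManaged);
  end;

class operator TManaged.Initialize(var r: TManaged);
begin
  New(r.Data);      // runs automatically when the record comes into scope
  r.Data^ := 0;
end;

class operator TManaged.Finalize(var r: TManaged);
begin
  Dispose(r.Data);  // runs automatically when the record goes out of scope
end;

procedure UseIt;
var
  m: TManaged;      // Initialize is called implicitly here
begin
  m.Data^ := 42;
  WriteLn(m.Data^);
end;                // Finalize is called implicitly here

begin
  UseIt;
end.
```

This is the behaviour the thread is asking to extend to old-style objects: deterministic setup and teardown tied to scope, with no explicit Free.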
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 09:15:28 pm
Yes, but then (your code example) you allocate the memory on the HEAP.
The type of allocation does not matter; what matters is that you access the instance via a pointer. Example (same types as used above):
Code: Pascal
procedure PrintObj(x: PBase);
begin
  WriteLn(x^.Print);
end;

var
  c: TChild;
begin
  c.init(42, 32);
  PrintObj(@c); // pass the address of the stack instance
  c.Destroy;
end.

C is clearly not allocated on the heap, but polymorphism still works completely. It is not about where an instance is allocated; it's about how it is accessed. Classes are always accessed through a pointer and provide a lot of syntactic sugar to hide this. Objects don't have the syntactic sugar, but when referenced by pointer they are just as capable as classes are.

Just for good measure, you can also put classes on the stack, even though it is quite hacky:
Code: Pascal
program Project1;

{$mode objfpc}{$H+}
uses
  Classes;

type
  TDestroyMethod = procedure() of object;

var
  Buffer: Array[0..1024] of Byte; // should be enough...
  sl: TStrings;
  destroyWithoutFree: TMethod;
begin
  sl := TStringList.InitInstance(@Buffer[0]) as TStrings; // using polymorphism of classes
  sl.Create;
  sl.Add('Foo');
  sl.Add('Bar');
  WriteLn(sl.Text);
  destroyWithoutFree.Code:=@TStringList.Destroy;
  destroyWithoutFree.Data:=sl;
  TDestroyMethod(destroyWithoutFree)();
  sl.CleanupInstance;
  ReadLn;
end.

If FPC supported alloca (https://linux.die.net/man/3/alloca) or VLAs, it would be much less hacky, but at least it does work.

So in theory, neither are classes restricted to the heap nor objects to the stack. Inheritance works in all cases as expected, but to make use of it you need to access the instance via pointers. Where it is stored does not matter.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 09:27:38 pm
1) Objects fully support inheritance
2) Objects are placed fully on the stack and thus don't need any alloc/dealloc
3) Objects CANNOT overload operators
4) Objects CANNOT use generics
5) Objects CANNOT use management operators
6) Objects CANNOT use polymorphism

1) No. Objects cannot make use of interfaces, and since they don't support multiple inheritance, there isn't really anything comparable. Simple inheritance is fully supported, but not multiple inheritance in any form.
2) You can easily place objects anywhere. You can also do the same with classes, but it gets really hacky.
If you place it on the stack, no alloc or dealloc is required, but you need to call the constructor and destructor manually. This is something records can do automatically with management operators, and classes do implicitly during allocation.
3) You can overload operators, but you can't overload "class operators". This is a heavy drawback when working with generics. Example:
Code: Pascal
program Project1;

{$mode objfpc}{$H+}
{$ModeSwitch advancedrecords}

type

  { TRec }

  generic TRec<T> = record
  public type
    TSpecializedRec = specialize TRec<T>;
  public
    Value: T;

    class operator +(constref A: TSpecializedRec; constref B: TSpecializedRec): TSpecializedRec;
  end;

{ TRec }

class operator TRec.+(constref A: TSpecializedRec; constref B: TSpecializedRec
  ): TSpecializedRec;
begin
  Result.Value := A.Value + B.Value;
end;

var
  a, b, c: specialize TRec<Integer>;
begin
  a.Value := 42;
  b.Value := 32;
  c := a + b;
  Writeln(c.Value);
  Readln
end.
This is simply not possible using objects, because you need to define your operators outside the type and therefore can't access the specialization. (Maybe possible with generic functions, but I use FPC 3.2, where that support is not there yet.)
4) You can use generics; it's just a problem with operators.
5) Yes, that's true.
6) You can use polymorphism, as you have seen in my example; you just need to use pointers explicitly. Classes do the exact same thing, they just hide it behind syntactic sugar.
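To illustrate point 4, a generic old-style object on its own does specialize fine; only the operator situation above is the problem. A minimal sketch (TBox is my own illustrative name, not from the thread):

```pascal
program GenericObject;

{$mode objfpc}{$H+}

type
  // generics work with old-style objects; the limitation discussed
  // above only concerns class operators, which objects cannot declare
  generic TBox<T> = object
    Value: T;
    procedure Put(AValue: T);
  end;

procedure TBox.Put(AValue: T);
begin
  Value := AValue;
end;

type
  TIntBox = specialize TBox<Integer>;

var
  b: TIntBox;
begin
  b.Put(42);         // stack instance; no constructor needed since
  WriteLn(b.Value);  // there are no virtual methods
end.
```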
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 09:32:30 pm
Yes, but then (your code example) you allocate the memory on the HEAP.

Then please get the following to work (without heap):
Code: Pascal
    object TBase  // SizeOf = 8  +vmt if present
      a, b: longint;
      // any methods you like
    end;

    object TAdvanced(TBase)  // sizeof = 24  +vmt if present
      e1,e2,e3,e4: Int64
    end;

    Function Bar: TAdvanced;
    begin
      // result should be a TAdvanced or pointer to it => it needs 24 bytes (+vmt)
    end;

    procedure Foo;
    var base: TBase; // or pointer
    begin
      base := Bar();
      // enter some deep recursion code, to use the stack
      // access base
    end;

Now, if you allocate space on the stack (by having a local var) in Foo, then you only know the size of TBase: not enough space to hold a TAdvanced, even if you pass a pointer.

But if you allocate space on the stack (by having a local var) in Bar, then you return a pointer into Bar's stack frame, and once you return from Bar that stack frame is fair game to be reused by the next subroutine call.

So how do you do it? (No heap / stack only / as many pointers as you wish)

-------
NOTE
TAdvanced and Bar could be defined in another unit, even in a 3rd party package. TAdvanced could even be declared in the implementation section (if it is returned as the base class, but with an instance of TAdvanced). You do not know their size.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 10:23:42 pm
Then please get the following to work (without heap):
I clearly stated:
Inheritance works in all cases as expected, but to make use of it you need to access the instance via pointers.
But in your example you do:
Code: Pascal
base := Bar();
Which, aside from the example not being valid Pascal at all, is copy by value, not access by pointer, i.e. the thing I said you cannot do.

The same semantics can be expressed using classes:
Code: Pascal
TBase = class
  A: Integer;
end;
TChild = class(TBase)
  B: Integer;
end;

var
  b: TBase;
  c: TChild;
begin
  b := TBase.Create;
  c := TChild.Create;
  Move(PByte(c)[0], PByte(b)[0], c.InstanceSize); // copy by value
end;
This fails in exactly the same way. It is just that a class variable is a (hidden) pointer, so copy-by-value mechanics are not easily possible (even though I just demonstrated that it is possible if you really want to). So yes, copy by value fails, but this is not unique to objects; it is just something that is much easier to do with objects.

This is simply a case of classes being able to do less than objects, so that objects can fail with simple mechanics where classes can't.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 10:34:41 pm
Despite the fact that "object" is by nature copy by value...

I said you can use pointers. So you can return a pointer, and access that pointer.

For all the rest, you missed the entire point of my question:
=> Return an old style object (by value or pointer) from a function.
=> Assign it to a variable in the calling proc (again, that can be by value or by pointer).
=> In such a way that the called function could return any subclass
     and the calling function only knows the base class
      (i.e. the subclass may be in a 3rd party unit, and may even be further subclassed inside the implementation part of that unit).
=> Also make it so that the subclass may have any amount of extra fields, having a greater "sizeof"
     (the extra data must not be lost after the object, or a pointer to it, was returned).

All data must be on the stack (no heap). Therefore all pointers point to a location on the stack.
That is important, since the main benefit was that "objects" do not need to be explicitly freed (as was mentioned).

And just in advance: this is about objects. It does not matter whether there are means to get around the need to explicitly free classes. You said you can do it with objects. Please let me know how.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 10:50:43 pm
Despite the fact that "object" is by nature copy by value...
What is that even supposed to mean? You can copy objects by reference as well as by value. It's up to the programmer to decide which of the two is required in which situation, and to use different syntactical constructs. If you copy where you should have referenced, you might lose data and waste performance; if you reference where you should have copied, you might get a dangling pointer.
Being used wrongly is not the fault of the datatype being "naturally" a certain way, but of the programmer.

Quote
All data must be on the stack (no heap). Therefore all pointers point to a location on the stack.
That is important, since the main benefit was that "objects" do not need to be explicitly freed (as was mentioned).

And just in advance: this is about objects. It does not matter whether there are means to get around the need to explicitly free classes. You said you can do it with objects. Please let me know how.
What is your point? What you are saying has nothing to do with inheritance at all. This is about the lifetime of different data storage. Sure, that is a nice discussion to have, but what does this have to do with inheritance?

I mean I can do the following:
Code: Pascal
type
  PBase = ^TBase;
  TBase = object
    A: Integer;
  end;

  PChild = ^TChild;
  TChild = object(TBase)
    B: Integer;
  end;

function Child: PChild;
const c: TChild = ();
begin
  Result := @c;
end;

var
  b: PBase;
begin
  b := Child;
end.
By using a global variable. If you want an object to be stored permanently, you need to use permanent storage. The stack is not permanent storage; objects on it get invalidated when the function returns. So if you want to return a pointer, it cannot point to the stack. What does this have to do with inheritance?

What's your point? You can use inheritance as long as you can use pointers, and you can use a pointer as long as the memory location it points to is valid.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 23, 2020, 11:14:03 pm
What is your point? What you are saying has nothing to do with inheritance at all. This is about the lifetime of different data storage. Sure, that is a nice discussion to have, but what does this have to do with inheritance?
See below.
It is all about the restriction to the use of polymorphic objects that I pointed out earlier

Quote
By using a global variable. If you want an object to be stored permanently, you need to use permanent storage. The stack is not permanent storage; objects on it get invalidated when the function returns. So if you want to return a pointer, it cannot point to the stack. What does this have to do with inheritance?
Meaning there can only be one instance at a time, even if the caller wants more than one. The caller could have several local vars (pointers or otherwise) to hold the results of more than one call.

Quote
What's your point? You can use inheritance as long as you can use pointers, and you can use a pointer as long as the memory location it points to is valid.

The question is not about returning a pointer. That was your suggestion for solving the limitation of old style objects.

The point is that I said: objects support polymorphism. But I also said this does not work in certain cases when passing to/from a function.
- You said that this could be solved with pointers.
- I asked about a specific scenario where I do not believe pointers solve the restriction.

Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 23, 2020, 11:45:32 pm
The point is that I said: objects support polymorphism. But I also said this does not work in certain cases when passing to/from a function.
- You said that this could be solved with pointers.
- I asked about a specific scenario where I do not believe pointers solve the restriction.
But you are mixing up two completely different concepts: memory lifetime and polymorphism. To do polymorphism you need to access the memory, and for that it must be live. This is not a restriction of objects. You can't use a class after freeing it. You can't use a pointer to an integer after freeing its memory.
The "problem" you pointed out was: how to reference an object located on a stack frame after the function that frame belongs to has returned. The answer to that is: you can't. This is like asking how to use a class after calling Free.

If you want to use an object, polymorphic or not, the memory it is stored in must be live. You pretend this is a limitation of old style objects, but the fact that the memory must exist if you want to access it is a restriction for *ANY* data.

Quote
Meaning there can only be one instance at a time, even if the caller wants more than one. The caller could have several local vars (pointers or otherwise) to hold the results of more than one call.
I could use an array. Sure, that will also run out at some point, but you can run out of memory on the heap too. The point I want to make is: you can store objects in any storage you want (as you can with classes, as shown above). The ability to do polymorphism is only restricted by your ability to access the data. As long as you store the data in a way that keeps it accessible, you can do polymorphism.
I could also use a stack allocator that allocates the memory on the previous stack frame and passes a pointer to it to the callee. Another option would be to map a file into memory and save the data in the filesystem. I could just overwrite previous stack frames and disregard anything in them. There are pretty much endless possibilities to make this work, and none of them has anything to do with polymorphism.


And what's your obsession with not using the heap? The good thing about objects is that you can use the heap, the stack or the data segment, without having to resort to hacks as with classes. If you say that old school objects have a limitation when you restrict them to only being used on the stack, that limitation is arbitrarily set by you and not by the datatype.
Please show me how you would solve your example with classes without using the heap. You will see that the exact same limitations apply.

Nobody argues that objects are great because they live on the stack. They are great because they can live on the heap and on the stack. If you limit yourself to only one memory location, that's your limitation and yours alone. This has nothing to do with objects.

Quote
The question is not about returning a pointer. That was your suggestion for solving the limitation of old style objects.
I just assumed you would understand that to access memory, that memory must exist. As long as the memory exists, the ability to do polymorphism is not restricted by where the data is located.
Title: Re: How optimized is the FPC compiler
Post by: marcov on December 23, 2020, 11:51:12 pm
The point is that I said: objects support polymorphism. But I also said this does not work in certain cases when passing to/from a function.
- You said that this could be solved with pointers.
- I asked about a specific scenario where I do not believe pointers solve the restriction.
But you are mixing up two completely different concepts: memory lifetime and polymorphism. To do polymorphism you need to access the memory, and for that it must be live. This is not a restriction of objects. You can't use a class after freeing it. You can't use a pointer to an integer after freeing its memory.

But you don't have to use a pointer to use an integer. Whereas for stack objects, to get at the polymorphism you have to statically instantiate the object (no metaclasses, nothing).

So while what you say is technically true, it requires object creation and processing, and it smells like a workaround.


Note again that this is all academic, since changing the fundamentals of the object model is something for a forked/derived language, not a 25-year-old production compiler.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 24, 2020, 12:03:41 am
But you don't have to use a pointer to use an integer.
It depends on what you want to do. For example, if I want to access the bytes of an integer individually, I need a pointer to do so:
Code: Pascal
b: Byte;
...
b := PByte(GetIntPtrFunction())[2];
and if you now take that pointer from a stack frame that is no longer live, it is not going to work.
This is the exact same problem, without any use of polymorphic inheritance.

The thing I want to point out is that data lifetime and polymorphism are completely independent. Sure, to do polymorphism the data must be live, but it is not a limitation of polymorphism if you choose a data storage whose lifetime is not long enough.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 24, 2020, 01:10:21 am
After writing this response, some of the important facts only came to light at the end.

I will put a copy of them up front. All my replies are to be read with the following in mind:

- Yes, old objects have polymorphism.
- But its usability is limited as to where/when it can be used, if only stack memory may be used.
- The stack limitation (and its consequences) is an arbitrary condition.
  (This last point was not previously spelled out; never mind, the limit was stated as a requirement.)

The limit was NOT created by me, but taken from previous posts.

All of the above was known.
With all of the above known:
- I stated: there are limits... (not to polymorphism, but to where/when it can be used)
- You stated: pointers can overcome that limit

Then I challenged you.


Furthermore, I like and use objects. They are useful, versatile and good. But all that is not the point, and neither is what else you can do with them. There was a specific statement about one thing that does not work with them (and your response to it). This discussion is about that very particular use case only.
This use case was born out of a particular set of needs indicated in posts by another person in the thread.


But you are mixing up two completely different concepts: memory lifetime and polymorphism. To do polymorphism you need to access the memory, and for that it must be live. This is not a restriction of objects. You can't use a class after freeing it. You can't use a pointer to an integer after freeing its memory.
The "problem" you pointed out was: how to reference an object located on a stack frame after the function that frame belongs to has returned. The answer to that is: you can't. This is like asking how to use a class after calling Free.
That is my opening statement, what I said to begin with (and repeated):
- Yes, it has polymorphism.
- But its usability is limited.

One of your posts (in reaction to this statement of mine) was that I was wrong, and that it would work with pointers.

I am aware of the lifetime of the memory pointed to. Using pointers was never my idea (you even "scolded" me when my example for the issue was NOT using pointers).
So I really do not get why you keep telling me that I am mixing things up, when you were the one who brought in pointers.
And I do not get why you doubt my understanding of pointers, when I said (in reaction to your claim) that pointers would not solve it entirely, and gave you a task for which I believed this to be the case.

As you now confirm:
Quote
The answer to that is: you can't.

I always thought so.

Again, yes it still has polymorphism. But again, there are limitations that restrict where/when/how you can use it.

Quote
If you want to use an object, polymorphic or not, the memory it is stored in must be live. You pretend this is a limitation of old style objects, but the fact that the memory must exist if you want to access it is a restriction for *ANY* data.
Yes, true.
But new style classes make it possible to keep the memory available in the given case.
Old style objects do not allow for that.

The limitation is NOT that the memory must be alive.
The limitation is that for old style objects (when using certain function calls) it is not possible to keep it alive.


Quote
The ability to do polymorphism is only restricted by your ability to access the data. As long as you store the data in a way that keeps it accessible, you can do polymorphism.
But this inability stems from the way objects are designed [1]. It is part of old style objects.
New style classes do not impose that limit on such abilities.

[1] Under the premise that no heap is to be used. This was a given in the original task, and known before you implied that pointers could overcome the limitation.

Quote
I could also use a stack allocator that allocates the memory on the previous stack frame and passes a pointer to it to the callee.
That one I do give to you. That is true, and it will work.
And it fulfils the condition that the memory is freed when the frame is exited (the proc returns).

Mind you though: "previous stack frame" means the called procedure must find the correct caller ("previous" could be several frames up), move the data of all stack frames in between, and adjust the base pointers for all of them.
It will be a noticeable overhead (but that is fine; it does not violate the conditions of the exercise).
 
I am not sure how easily that can be done in FPC without changes to the compiler. It might be possible, but it will probably rely on implementation details of the compiler, and therefore run the risk of breaking with updates.
It also requires that stack frames are generated for all relevant callers.

It may not work if the call chain has nested procedures. They get passed the parent base pointer and may keep a local copy of it. You will probably not be able to adjust that, and it would then be invalid.

Quote
Another option would be to map a file into memory and save the data in the filesystem.
That is really just another way of saying "heap" (swap file). Though yes, hairs can be split on this one.

Quote
And what's your obsession with not using the heap?
I have none. But the entire premise that led to this discussion was that they should be on the stack. So no memory allocation, and more importantly no de-allocation, was necessary.
Data on the stack is freed when the procedure exits. (Yes, interfaces can free heap memory. But that was not the original question either.)

At the time where you claimed that "pointers" would lift the restrictions on how the (fully existing) polymorphism could be used, all of those conditions were known.

Putting old style objects on the heap means a manual de-allocation call is needed.

Quote
If you say that old school objects have a limitation if you limit them only to be used on the stack, this limitation is arbitrarily set
Yes it is.
But it was set so, before you claimed that pointers could lift that arbitrarily set limitation.

Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 24, 2020, 03:29:49 pm
But new style classes make it possible to keep the memory available in the given case.
Old style objects do not allow for that.
But this is completely and utterly wrong. You can put old style objects on the heap, the same way new style classes are on the heap. The ONLY difference is that classes hide this from the user with syntactic sugar. Semantically there is literally no difference between putting an object on the heap and creating a class.

If, as you say, objects can't be allocated on the heap, better call the object police, because then I have broken that law many times.

Why do you limit objects to the stack, but happily ignore that limitation for classes, and then argue that classes are less limited than objects, even though the limitation was specifically brought forward by you to apply only to objects?

Let's make a similar challenge: a race with two cars. You get a Ferrari and I get an Opel Corsa; whoever finishes the race track first wins. The only caveat is that you, in your Ferrari, can't use gasoline. See how inferior the Ferrari is to the Opel Corsa, it can't even beat it in a simple race.

Quote
There was a specific statement about one thing that does not work with them. (and you response to it).
But your challenge has nothing to do with objects. Your challenge was to make a dynamic allocation of memory without using dynamic memory allocation (i.e. the heap).


If this is an inherent limitation of objects that does not apply to classes: I've shown previously how to allocate classes on the stack. Use this to solve your problem with classes instead of objects under the same conditions (i.e. not using the heap). If you can, I agree this is a limitation of objects. If not, this has nothing to do with the datatype used.
You claim this is an inherent design problem with objects, so it should be a cakewalk to solve this with classes then... right?


Quote
But it was set so, before you claimed that pointers could lift that arbitrarily set limitation.
No, I didn't. I said you can use polymorphism if you use pointers. I never claimed that you can access memory after it has been freed. Quote me where I claimed that you can lift that limitation by using pointers.

But I can state what I meant again: if you have your object or class stored anywhere and have a pointer to it, you can use polymorphism. The location it is stored in, and the mechanism by which that storage was acquired, are completely irrelevant to the ability to use polymorphism.

I can quote exactly what I claimed:
Quote
So in theory, neither classes are restricted to the heap nor are objects restricted to the stack. Inheritance works in all cases as expected, but to make use of it, you need to access it via pointers. Where it is stored does not matter
How does the statement "Where it is stored does not matter" imply that it works even if there is no storage at all? You claim that I claimed that polymorphism should work in your example when using pointers, i.e. in a situation where the object was already destroyed. How did you read that from this statement? IMHO "Where it is stored" implies that it is stored somewhere.
But your example creates a situation where the object's storage was destroyed before it is accessed polymorphically.
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 24, 2020, 08:41:01 pm
1) Objects fully support inheritance

TP-style objects support inheritance in that they can inherit from a parent object.

2) Objects are placed fully on the stack and thus don't need any alloc/dealloc

If you use them directly (e.g. a TMyObject) then it's located exactly where you declared it (e.g. a global variable or the stack). You can however instantiate them using New as well, in which case they reside on the heap.
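As a small sketch of both usages (TPoint2D is a made-up example type, not from the thread): the same TP-style object can live where it is declared, or be put on the heap with the extended New/Dispose syntax, which also calls the constructor and destructor.

```pascal
{$mode objfpc}
program ObjectPlacementSketch;

type
  PPoint2D = ^TPoint2D;
  TPoint2D = object
    X, Y: Integer;
    constructor Init(AX, AY: Integer);
    destructor Done;
  end;

constructor TPoint2D.Init(AX, AY: Integer);
begin
  X := AX;
  Y := AY;
end;

destructor TPoint2D.Done;
begin
  { nothing to release in this toy example }
end;

var
  Stacked: TPoint2D;  // lives where declared: global here, stack if local
  Heaped: PPoint2D;
begin
  Stacked.Init(1, 2);
  New(Heaped, Init(3, 4));  // extended New: allocates and calls the constructor
  WriteLn(Heaped^.X);       // prints 3
  Dispose(Heaped, Done);    // extended Dispose: calls the destructor, then frees
end.
```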

3) Objects CANNOT overload operators

You can use global operator overloads; however, you can't use member operator overloads, which are necessary to support custom operators inside generics.
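For illustration (TVec2 is a made-up type, not from the thread), a global, non-member operator overload working on a TP-style object:

```pascal
{$mode objfpc}
program ObjectOperatorSketch;

type
  TVec2 = object
    X, Y: Double;
  end;

// A global operator overload: allowed for objects,
// unlike member (class) operators, which are not.
operator + (const A, B: TVec2) R: TVec2;
begin
  R.X := A.X + B.X;
  R.Y := A.Y + B.Y;
end;

var
  P, Q, S: TVec2;
begin
  P.X := 1; P.Y := 2;
  Q.X := 3; Q.Y := 4;
  S := P + Q;
  WriteLn(S.X:0:1, ' ', S.Y:0:1);  // 4.0 6.0
end.
```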

4) Objects CANNOT use generics

Yes, they can.
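For example (a made-up generic container, not from the thread), an object can be generic and then specialized:

```pascal
{$mode objfpc}
program GenericObjectSketch;

type
  generic TBox<T> = object
    Value: T;
    procedure Put(const AValue: T);
  end;

  TIntBox = specialize TBox<Integer>;

procedure TBox.Put(const AValue: T);
begin
  Value := AValue;
end;

var
  B: TIntBox;
begin
  B.Put(42);
  WriteLn(B.Value);  // 42
end.
```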

5) Objects CANNOT use management operators

Correct.

6) Objects CANNOT use polymorphism

You can access an object instance using a variable of the parent type.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 24, 2020, 10:55:42 pm
But new style classes make it possible to keep the memory available in the given case.
Old style objects do not allow for that.
But this is completely and utterly wrong.

If you take my statements out of the context in which I gave them, then yes the remaining partial statement will be wrong.
Please re-read my post, and make sure to apply the entire context that I stated.

And by applying it, limit your response to that what is relevant to that context.

When there is a discussion about what can be done with objects if they should only be used on the stack => how does it matter that you can also use them on the heap?
It was clearly said, that this was not wanted (no matter why / despite the why was actually explained too)

Thanks
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 25, 2020, 12:13:34 am
If you take my statements out of the context in which I gave them, then yes the remaining partial statement will be wrong.
Please re-read my post, and make sure to apply the entire context that I stated.
And you seem not to have read my post, because my last 3 posts are solely about your claim that this is some limitation of inheritance in objects. I don't say that this limitation does not exist; my complete argument revolves around the fact that it has literally nothing to do with objects. Classes have the exact same problem if you allocate them on the stack.

You did now claim multiple times that this is inherently a limitation of objects. Let me go through all of your claims:
Quote
But new style classes make it possible to keep the memory available in the given case.
Old style objects do not allow for that.
This statement clearly implies that objects have a limitation that classes don't. But if you restrict classes to the stack, this is simply not true.

Quote
But this inability stems from the way objects are designed [1]. It is part of old style objects.
New style classes do not impose that limit on such abilities.

[1] Under the premise that no heap is to be used. Which was a given in the original task, and known before you implied that pointers could overcome that limitation.
This is a complete farce. You say that objects are limited by design while classes don't impose a limit, while acknowledging that the limit is arbitrarily set by you and applied only to objects. If we apply the same premise to classes, they are as limited as objects.

Quote
That was your suggestion to solve the limitation in old style objects.
This statement again implies that this is a limitation inherent to objects, which, as I stated multiple times, is simply not true. Classes, if you restrict them to not use the heap, face the exact same limitations.

So you claimed 3 times clearly that there is some inherent limitation of the design of objects, in two of which you clearly mention classes as an alternative that do not have this limitation. The only thing I am claiming, the whole time, is that this limitation has absolutely nothing to do with the design of objects, as it is also present in classes if you compare them under the same set of limitations.

This is literally all I am arguing against. You make repeatedly the claim that objects are inherently limited in a way classes are not. But this is just factually wrong. You are comparing two different things here, objects which are placed on the stack, vs classes placed on the heap. This has nothing to do with the advantages and disadvantages of classes vs objects, but all with the advantages and disadvantages of heap vs stack.

Again with my car example from above: the fact that an Opel Corsa with gasoline is faster than a Ferrari without gasoline does not imply in any way, shape or form that the Ferrari is by design slower than the Opel Corsa; it can be solely explained by the usage of gasoline to power the car.

Similarly, your argument is a comparison between classes on the heap and objects on the stack. The differences between them are not representative of the differences in design between objects and classes, as you have not compared them equally. The argument you are actually bringing forward is about the advantages and disadvantages of the different lifetimes and management of different memory allocation methods. And this is a perfectly fine argument to have, but you pretend that the design of objects has anything to do with it, while objects are literally designed to make use of both the heap and the stack. If you really think that this is about the way objects are designed, compare apples with apples: compare stack objects vs stack classes, or heap objects vs heap classes. I repeat my challenge:
Quote
Use this to solve your problem with classes instead of objects under the same conditions (i.e. not using the heap). If you can, I agree this is a limitation of objects. If not, this has nothing to do with the datatype used.
If you don't think that this is about the way objects are designed, but rather about the advantages and disadvantages of the different memory allocation methods (stack vs heap), we have absolutely nothing to disagree on. Because I fully agree: the automatic memory management of the stack is great for convenience, but it does limit the usability of the data.

PS:
About this:
Quote
When there is a discussion about what can be done with objects if they should only be used on the stack
and
Quote
I have none. But the entire premise that led to this discussion was that they should be on the stack. So no memory allocation, and more importantly no de-allocation, was necessary
I never was part of that discussion (that they should be on the stack), I just claimed that polymorphism can be fully used if the object lives on the stack. I did so with this example:
Code: Pascal
procedure PrintObj(x: PBase);
begin
  WriteLn(x.Print);
end;

var
  c: TChild;
begin
  c.init(42, 32);
  PrintObj(@c);  // pass the address: the parameter is a pointer to the base type
  c.Destroy;
end;
As you can see, polymorphism is used by calling a virtual, overridden method through a base object pointer pointing to the child object located on the stack.

Your ability to use the polymorphic properties of an object is only limited by the liveness of the object. It does not matter whether the object is allocated on the stack or on the heap.


I never made a claim about whether objects should only be used on the stack. Honestly, you should always use what fits your situation best; this is true for data types and for their allocation methods. I just said, and I am repeating myself pretty often now: as long as you have a pointer to an object (which of course must exist in memory), you can use all the features of polymorphism. I never claimed anything about whether you should use them, whether you should place them only on the stack, or whatever. I never took part in that discussion and I never argued for any position in it.

The thing is, the question of whether objects should be placed on the stack is a completely different discussion from what they are capable of when they are placed on the stack. There are no limitations objects have when placed on the stack that are inherent to the design of objects. On the contrary, all the features of an object can be used regardless of whether its address is near $000000 (heap) or near $FFFFF (stack).

How about an experiment where we control for all variables except the allocation method:
Code: Pascal
type
  PBase = ^TBase;
  TBase = object
    // CHANGEME
  end;
  PChild = ^TChild;
  TChild = object(TBase)
    // CHANGEME
  end;

procedure DoSomething(obj: PBase);
begin
  // CHANGEME
end;

var
  c: TChild;
  pc: PChild;
begin
  DoSomething(@c); // a
  new(pc);
  DoSomething(pc); // b
  dispose(pc);
end;
Fill out the CHANGEME parts in a way that a fails but b does not, simply because of the way the memory was allocated.

If there is a feature of objects that is only possible if the object was allocated on the heap, but not if the object was allocated on the stack, this controlled experiment should be able to show this.
Title: Re: How optimized is the FPC compiler
Post by: ASBzone on December 25, 2020, 01:17:04 am

My observation over the last few years (earlier I don't know) is:
A lot of time is spent on level 3+ optimization. But the real question is how many actually use it.
Personally, in order to minimize the number of bugs (which is my top priority),
I don't dare to use more than level 2 and I'm happy with that approach.


I used level 3 for a couple of years, but now I only use level 4, and it has not been a problem for me.


My applications are admittedly small, however.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 25, 2020, 02:15:52 am
On the "by design" => read bottom part first. Might save some arguments on the middle part.

If you take my statements out of the context in which I gave them, then yes the remaining partial statement will be wrong.
Please re-read my post, and make sure to apply the entire context that I stated.
And you seem not to have read my post, because my last 3 posts are solely about your claim that this is some limitation of inheritance in objects.
Really? Just my previous response:
- Yes old objects have polymorphism
- But its usability is limited as to where/when it can be used, if only stack memory should be used
- the stack limitation (and consequences) are an arbitrary condition.
  (This last point was not previously spelled out / never mind the limit was stated as requirement)

The limit was NOT created by me, but taken from previous posts.

It clearly states it is not "some limitation of inheritance in objects" (as you claim I said).

I said "if only stack memory should be used"
It is a limitation that stems from the combination of "object" + "stack only".

And yes, I read your point that then it would apply to classes too, if they actually could be limited to stack only. But that is not the point.

The stack only limit was brought up in some other post, because it meant that no (absolutely no) effort would have to be made to release the memory (not even an interface, not even management operators).
Stack memory is freed when the procedure is left.

Back to classes => the "no effort to free memory" effect cannot be reached with classes (interfaces and management operators are an effort). Hence the desired effect cannot be reached with classes.
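A sketch of why interfaces count as "an effort" here (all names are hypothetical): a class instance held through a COM-style interface reference is released automatically when the reference goes out of scope, but only because the class descends from TInterfacedObject and an interface type was declared, i.e. it takes deliberate setup.

```pascal
{$mode objfpc}
program InterfaceAutoFreeSketch;

type
  IGreeter = interface
    procedure Greet;
  end;

  TGreeter = class(TInterfacedObject, IGreeter)
    procedure Greet;
    destructor Destroy; override;
  end;

procedure TGreeter.Greet;
begin
  WriteLn('hello');
end;

destructor TGreeter.Destroy;
begin
  WriteLn('freed automatically');
  inherited Destroy;
end;

procedure UseGreeter;
var
  G: IGreeter;
begin
  G := TGreeter.Create;  // reference count becomes 1
  G.Greet;
end;  // G goes out of scope, the count drops to 0, and Destroy runs

begin
  UseGreeter;
end.
```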

Objects can (by choice) live on stack only.
But if that choice is made, then (and only then) the limits I stated do apply.

Since that choice was made, it does not matter in any way what you can do if you use the heap (be that objects + heap, or classes). It really does not matter.

All I ever said was, that if you make the "stack only" choice, then certain limits apply.

Quote
I don't say that this limitation does not exist, my complete argument revolves around that this has literally nothing to do with objects.
Classes have the exact same problem if you allocate them on the stack.
But classes do not go on the stack. At least not by design. Not in FPC.

If classes were designed to allow you to use them "stack only" then classes would have the same issue. But classes are not designed that way.

On the other hand: Objects are designed to give you the choice to use them on the stack.
Yes, of course their design allows other usages too. It allows a choice. And for the reasons given, the particular "stack only" usage was chosen.

Quote
You did now claim multiple times that this is inherently a limitation of objects. Let me go through all of your claims:
Quote
But new style classes make it possible to keep the memory available in the given case.
Old style objects do not allow for that.
This statement clearly implies that objects have a limitation that classes don't. But if you restrict classes to the stack, this is simply not true
Only if you missed that it was previously mentioned that it was about the choice to have objects on the stack.

But what I'm really curious about is: what are these "object" classes like? Aren't they supposed to be C++-style classes, which can live on the stack and still have polymorphism, inheritance etc. available?
There may have been more references, I did not search to exhaustion.

And just to be clear. I did not say that objects did not have polymorphism in that case. I said that in that case certain use cases would not work. (I.e. using as function result, where a child class is returned that needs extra memory). It is about the combination.

Quote
This is a complete farce. You say that objects are limited by design while classes don't impose a limit, while acknowledging that the limit is arbitrarily set by you and applied only to objects. If we apply the same premise to classes, they are as limited as objects
Because by design, it cannot be applied to classes.
If it could ... well yes... but it cannot.

... skipping some repetition of the same point.

Quote
So you claimed 3 times clearly that there is some inherent limitation of the design of objects, in two of which you clearly mention classes as an alternative that do not have this limitation. The only thing I am claiming, the whole time, is that this limitation has absolutely nothing to do with the design of objects, as it is also present in classes if you compare them under the same set of limitations.
Yes and No, depending on the exact interpretation of your above statement.

- Objects - by design - give you the choice between heap and stack (and mixing those)
- Classes - by design - do not give you that choice (if they did, it where different, but they do not).

If you take that choice (e.g. for the reasons given earlier), then objects will encounter problems, when and if (and only when and if) used as described.

Classes would, if that choice was available for them. But it is not. Therefore - by that design - you can not make classes suffer from the issue.

Quote
You are comparing two different things here, objects which are placed on the stack, vs classes placed on the heap. This has nothing to do with the advantages and disadvantages of classes vs objects, but all with the advantages and disadvantages of heap vs stack.
Yes, it is heap vs stack.

No, I am not making that comparison => well, I did not start it. I only talked about objects, because the entire idea was to find something that worked on the stack only. Classes do not do that, so I did not even bother to bring classes in.

After you forced classes into the discussion of a "stack only" issue (a discussion where classes have no place to be), I replied that classes do not have that problem.
That reply of mine can indeed be misleading (sorry about that).
- Classes do not have the problem, because classes can not be used in that case at all.

To use "cars".
Had we been talking about problems boats may have on the open ocean (e.g. needing to compensate for drift when navigating), I may have said (when prompted) that cars do not have those problems. I would have meant that cars cannot be used on the open ocean, and therefore are also not affected by drift occurring on the open ocean. (But sure, if you had a floating car.....)


Quote
The argument you are actually bringing forward is about the advantages and disadvantages of the different lifetimes and management of different memory allocation methods. And this is a perfectly fine argument to have, but you pretend that the design of objects has anything to do with it, while objects are literally designed to make use of both the heap and the stack.
It is not about how polymorphism is designed in either of them.

It is about that only one of them can by design be used for "stack only".

See section above this quote.

Quote

I never was part of that discussion (that they should be on the stack), I just claimed that polymorphism can be fully used if the object lives on the stack. I did so with this example:
Code: Pascal (code elided in quote)
As you can see, polymorphism is used by calling a virtual, overridden method through a base object pointer pointing to the child object located on the stack.
In the use case of your example, it indeed works.

Question here is "can be fully used", did you mean:
- Can sometimes be fully used
- Can always be fully used
?
I assumed you meant the latter. (Sorry if I misunderstood)

- If you meant the former, then the discussion is void, as the former would actually mean that "stack only" implies limitations.
- If you meant the latter, then well I asked you for an example of the usecase I provided. (And that example is still outstanding)

Quote
There are no limitations objects have when placed on the stack that are inherent to the design of objects.

Ok, going back I did use "the way objects are designed.  Under the premise, that no heap is to be used."
And I may have referred to the design of objects without explicitly stating the 2nd part, assuming it was known already.

I did so because "no heap" would most usually be done by declaring a local variable to hold the data.
In this case, under the given premise, memory allocation is not done by the user. It is part of the overall experience of using an object on the stack.
I therefore attributed the memory allocation (when not done explicitly by the user) to be part of the object design.

I.e. that is to read:
By design an object can automatically take space in the stack frame, if declared as a local (non-pointer) variable.
By design an object allows other types of mem allocation, if the user explicitly allocates the memory.

At first, one might think that this is not something bound to objects, as it applies to all datatypes. But that is not true: ansistrings and dynamic arrays do not offer storage on the stack.



Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 25, 2020, 02:33:11 am
Btw, on the whole "by design of whatever" => that really was not the point of what this started with.

The point (even if design was mentioned) was, that
- when using objects
- and when limiting to stack only
limitations do exist.

As to if they are caused by "some design" or "something else" => not the point.

As to if they only exists when doing so with objects, or if they would also exist when using classes (under the assumption that classes could actually be used stack only) => not the point.

They do exist for the case described. What happens outside that described case => not the point.


It would be nice to take all the "not the points" out of the discussion, and get back to the actual essence of my statement.

My original post on that issue (follow the link  for full version)

But what I'm really curious about is: what are these "object" classes like? Aren't they supposed to be C++-style classes, which can live on the stack and still have polymorphism, inheritance etc. available?

They are a "limited attempt"....

Just as records, they do not allocate heap memory, but rather live on the stack. And while they have some support for inheritance, it requires that the subclasses have no additional fields. (You can afaik do virtual / overridden stuff.)

Imagine
Code: Pascal (code not preserved in the archive)

b only has space for the fields of TBase.
So in the above example the remainder gets cut off. (Even if you cast back to TAdvanced, it will not come back; it is lost forever. It may even crash if you try to access the extra fields.)
B.method() would still call TAdvanced.method, if that was virtual/overridden.

That is why new classes are on the heap => The caller does not need to know how much extra mem may have been required.
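The code of the quoted post did not survive in this archive; the following is my own minimal reconstruction of the slicing effect described above (TBase/TAdvanced are the names from the post; the fields and bodies are assumptions):

```pascal
{$mode objfpc}
program SlicingSketch;

type
  TBase = object
    A: Integer;
    constructor Init;
    procedure Show; virtual;
  end;

  TAdvanced = object(TBase)
    Extra: Integer;  // does not fit into a TBase-sized variable
    constructor Init;
    procedure Show; virtual;
  end;

constructor TBase.Init;
begin
  A := 0;
end;

procedure TBase.Show;
begin
  WriteLn('base A=', A);
end;

constructor TAdvanced.Init;
begin
  inherited Init;
  Extra := 0;
end;

procedure TAdvanced.Show;
begin
  WriteLn('advanced A=', A);
end;

var
  adv: TAdvanced;
  b: TBase;
begin
  adv.Init;        // the constructor sets up the VMT
  adv.A := 1;
  adv.Extra := 2;
  b := adv;        // only the TBase-sized prefix is copied: Extra is cut off
  b.Show;          // as the post notes, the copied VMT may still dispatch to TAdvanced.Show
end.
```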

This is referring to the desired use case of having them on the stack. Allocating them on the heap is possible, but not part of this discussion.

As for "they do not allocate heap memory": they do not on their own. The user can do that for them. But that was not the point of that post...

Using pointers to stack location does as far as I can see not change the above.


Your direct reply was "That is not true ..." But you missed the "stack only" and gave a "with heap" example.
This is not true, with regards to the "basic" inheritance (i.e. no interfaces), objects have the same capabilities as classes. Classes just add a coat of syntactic sugar to it:

Some posts later:
Yes, but then (your code example) you allocate the memory on the HEAP.
The type of allocation does not matter, what matters is that you access via pointers.
To which you gave an example of a case in which this happens to work.

I then asked how to solve the case of returning an inherited (larger) object.
Using stack only.
Pointer (to stack) as you like.
(Note that when I posted it, I did set the quote markers wrong... / Below code is shortened, follow link for full)
Yes, but then (your code example) you allocate the memory on the HEAP.

Then please get the following to work (without heap)
Code: Pascal (code shortened; follow the link for the full version)

Now, if you allocate space on the stack (by having a local var) in Foo => then you only know the size of TBase (not enough space to hold TAdvanced, even if you pass a pointer)

But if you allocate space on the stack (by having a local var) in Bar => then you return a pointer into Bar's stackframe => and once you return from Bar that stackframe is fair game for being used by the next subroutine call.

So how to do it? (No heap / stack only / as many pointers as you wish)

-------
NOTE
TAdvanced and Bar could be defined in another unit. Even 3rd party package. It could be in the implementation (if it is returned as the baseclass, but with an instance of TAdvanced). You do not know their size.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 25, 2020, 04:32:14 am
They do exist for the case described. What happens outside that described case => not the point.
Then tell me, how should I understand this:
Quote
But this inability stems from the way objects are designed [1]. It is part of old style objects.
New style classed to not impose that limit on such abilities.

[1] Under the premise, that no heap is to be used. Which was a given on the original task, and known before you implied that pointers could overcome that limitations.
Because just reading it, it looks to me like you are directly comparing the design of old style objects with new style classes. But if you are talking about the specific case described, this comparison makes absolutely no sense, because new style classes are not comparable to that situation at all.

Are you comparing the two types, or are you comparing the different circumstances (heap vs stack)? Claims like the one quoted clearly say the former, but all your explanations say the latter.

It would be nice to take all the "not the points" out of the discussion, and get back to the actual essence of my statement.

My original post on that issue (follow the link  for full version)

[...]

Your direct reply was "That is not true ..." But you missed the "stack only" and gave a "with heap" example.
This is not true, with regards to the "basic" inheritance (i.e. no interfaces), objects have the same capabilities as classes. Classes just add a coat of syntactic sugar to it:
Yes, I misread your post. That's why, when you pointed that out, I gave you an example of how polymorphism still works if the data is allocated on the stack.

I then asked how to solve the case of returning an inherited (larger) object.
Using stack only.
Pointer (to stack) as you like.
(Note that when I posted it, I did set the quote markers wrong... / Below code is shortened, follow link for full)
Yes, and my whole argument here is that this has nothing to do with polymorphism. Maybe I am wrong here, but to my understanding, the goal of polymorphism through inheritance, as employed in Pascal, is to allow different datatypes to be accessed through a common interface. The key word here is *accessing*. Your example is about the copying of data, i.e. the storing of data. Polymorphism makes no attempt to unify the in-memory storage of different objects. And of course it doesn't, because there are other concepts for doing exactly that:

Code: Pascal
program Project1;

{$mode objfpc}{$H+}

uses
  sysutils;

type

  { TBase }

  TBase = object
  public
    A: Integer;
    constructor init(AValue: Integer);
    destructor destroy; virtual;
    function ToString: String; virtual;
  end;

  { TChild }

  TChild = object(TBase)
  public
    B: Integer;
    constructor init(AValue: Integer; BValue: Integer);
    function ToString: String; virtual;
  end;

  TUnion = record
  case Boolean of
    True: (base: TBase);
    False: (Child: TChild);
  end;

{ TBase }

constructor TBase.init(AValue: Integer);
begin
  A := AValue;
end;

destructor TBase.destroy;
begin
end;

function TBase.ToString: String;
begin
  Result := 'A: ' + A.ToString;
end;

{ TChild }

constructor TChild.init(AValue: Integer; BValue: Integer);
begin
  inherited init(AValue);
  B := BValue;
end;

function TChild.ToString: String;
begin
  Result := inherited ToString + ' B: ' + B.ToString;
end;

function Child: TUnion;
begin
  Result.Child.init(42, 32);
end;

function Base: TUnion;
begin
  Result.base.init(42);
end;

var
  u: TUnion;
begin
  u := Child;
  WriteLn(u.base.ToString);
  u := Base;
  WriteLn(u.base.ToString);
  ReadLn;
end.
Using variant records one can unionize the storage of different objects, and due to the common prefix of inherited objects this can be used to make use of polymorphism while simultaneously having unified storage.

This btw also solves your "challenge", but that's irrelevant, because this challenge is not about polymorphism; it is about the underlying memory model. That's why the solution to this problem has nothing to do with polymorphism or some object specific mechanism, but with variant records, a construct to manually map memory.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 12:28:52 pm

A small disclaimer: I'm only speaking here about what would be nice; I'm not judging why this and not that, only pointing out what would make objects great again :D And of course you decide in the end if, and when, it makes it into FPC. But I like a lot that in this forum at least people don't avoid discussions and improvements as if one had insulted their mother. Good job on this, I like it a lot :D

Title: Re: How optimized is the FPC compiler
Post by: Awkward on December 25, 2020, 12:54:10 pm
they cannot inherit from interfaces while records can
Ough! Am I missing something? Records inherit from interfaces? Give me an example, please!

How can they inherit, if a record must be just an ordered data set?
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 01:05:12 pm
@Awkward
upsi, you are right, it doesn't work; I was still in C# mode while writing my post :P

I have also read this article about why (for now at least) it doesn't work:

Quote
Relevant to this question, there are two kinds of inheritance: interface inheritance and implementation inheritance.

Interface inheritance generally implies polymorphism. It means that if B is derived from A, then values of type B can be stored in locations of type A. This is problematic for value types (like records) as opposed to reference types, because of slicing. If B is bigger than A, then storing it in a location of type A will truncate the value - any fields that B added in its definition over and above those of A will be lost.

Implementation inheritance is less problematic from this perspective. If Delphi had record inheritance but only of the implementation, and not of the interface, things wouldn't be too bad. The only problem is that simply making a value of type A a field of type B does most of what you'd want out of implementation inheritance.

The other issue is virtual methods. Virtual method dispatch requires some kind of per-value tag to indicate the runtime type of the value, so that the correct overridden method can be discovered. But records don't have any place to store this type: the record's fields is all the fields it has. Objects (the old Turbo Pascal kind) can have virtual methods because they have a VMT: the first object in the hierarchy to define a virtual method implicitly adds a VMT to the end of the object definition, growing it. But Turbo Pascal objects have the same slicing issue described above, which makes them problematic. Virtual methods on value types effectively requires interface inheritance, which implies the slicing problem.

So in order to properly support record interface inheritance properly, we'd need some kind of solution to the slicing problem. Boxing would be one kind of solution, but it generally requires garbage collection to be usable, and it would introduce ambiguity into the language, where it may not be clear whether you're working with a value or a reference - a bit like Integer vs int in Java with autoboxing. At least in Java there are separate names for the boxed vs unboxed "kinds" of value types. Another way to do the boxing is like Google Go with its interfaces, which is a kind of interface inheritance without implementation inheritance, but requires the interfaces to be defined separately, and all interface locations are references. Value types (e.g. records) are boxed when referred to by an interface reference. And of course, Go also has garbage collection.

Could something like this be done (the boxing idea, even without GC)? And IF yes, is there a "big" performance penalty to it, which would then maybe break the entire performance discussion here? They would only be boxed when need be, and maybe FPC could optimize this by storing a permanent pointer from the interface to the heap location, so that when using interface methods on records, it would just use this
Code: Pascal  [Select][+][-]
  1. record.InterfaceProc1
internally as
Code: Pascal  [Select][+][-]
  1. hiddenIntefaceInstance.InterfaceProc1
or something along those lines...

But in God's name, how does C++ do it then? I know that structs are the same as classes in C++, only the default visibility differs. So, speaking analogously to C++, it would mean in this context that a struct inherits from abstract classes, and both live on the stack if instantiated there, so there should be the same slicing issue, or not? How do they deal with that??
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 25, 2020, 01:08:28 pm
  • they cannot inherit from interfaces while records can

No, they don't.

  • they can ONLY inherit from other objects (kind of a big limitation; it would be nice to at least allow inheritance from other records as well. I'm not a compiler dude, but I think they could theoretically also inherit from classes; if not classes, then records, but best would be inheritance from all)

No. They follow different lowlevel principles, they can't be mixed.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 01:11:52 pm
@PascalDragon
Pls read my post again (I have edited it a couple of times): how does C++ do all that within the stack area and get away with it??
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 25, 2020, 01:15:19 pm
I don't care what C++ does. We're talking about Object Pascal here. And in Object Pascal the internal, low level principles of objects and classes are simply different, thus they cannot be mixed.

Also if you put your objects on the stack in C++ you can't use polymorphism (e.g. declare a type as SomeType, but instantiate a SomeSubType that contains additional fields).
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 01:18:36 pm
Ahh okay okay, don't feel bothered pls, I only want to get the maximum understanding of how FPC does it and what MAYBE can be improved. If not, that's fine, that's what discussions are about, right?

So: just for my understanding, why can't object types inherit from (advanced) record types? They should share more or less the same memory layout on the stack, shouldn't they?
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 25, 2020, 01:27:24 pm
They do exist for the case described. What happens outside that described case => not the point.
Then tell me, how should I understand this:
Quote
But this inability stems from the way objects are designed [1]. It is part of old style objects.
New style classes do not impose that limit on such abilities.

[1] Under the premise that no heap is to be used. Which was a given in the original task, and known before you implied that pointers could overcome that limitation.
Because just by reading it, it looks to me like you are directly comparing the design of old style objects with new style classes. But if you are talking about the specific case described, this comparison makes absolutely no sense, because new style classes are not comparable to that situation at all.

"New style classes do not impose that limit on such abilities." => To be read as: new style classes force usage of the heap (this is part of the design of new style classes). Therefore they cannot run into that limit.

Contrary, old style objects do not force usage of the heap. (Never mind that they could. That they could is not the point of the overall context. It was said that they should be on the stack.)

So I am "comparing"
- new style classes, particularly the fact that their design forces heap usage
versus
- old style objects, with regard to the fact that their design does not force them to be on the heap, and that in this case they are chosen to be on the stack

I then ask how to solve the case of returning an inherited (larger) object.
Using stack only.
Pointer (to stack) as you like.
(Note that when I posted it, I did set the quote markers wrong... / Below code is shortened, follow link for full)
Yes, and my whole argument here is that this has nothing to do with polymorphism.
I think you mean that slightly differently (but it's too complex for me to bother putting into words).

Because actually it is (afaik) due to polymorphism that the inherited class can have a bigger mem footprint. And that is part of what causes the issue.

But it is not just polymorphism alone.
Then again, I don't think I ever claimed that.

As a general note, "design of objects" is broader than "design of polymorphism in objects".


Maybe I am wrong here, but to my understanding the goal of polymorphism through inheritance, as employed in Pascal, is to allow access to different datatypes through a common interface. The key word is *accessing* here. Your example is the copying of data, i.e. the storing of data. Polymorphism makes no attempt to unify the in-memory storage of different objects. And of course it doesn't, because there are other concepts for doing exactly this:
In essence, yes, like that. But for polymorphism to work (without restrictions like the one that started this argument), the memory model chosen must have certain properties.

In the given case those properties are not present. Objects allow you to choose such an insufficient memory management (they even default to it). Permitting that choice is part of their design (not their polymorphism design, but their overall design).

As stated, "design of objects" is broader than "design of polymorphism in objects".
And the (list of) memory management models available is part of the overall design of objects.


Using variant records one can unionize the storage for different objects, and due to the common prefix of inherited objects this can be used to make use of polymorphism while simultaneously having unified storage.
But only if you know the size of the biggest possible derived object.
And that size cannot be known in all cases (examples were given early on in the discussion).
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 25, 2020, 03:06:58 pm
Ahh okay okay, dont feel botherd pls, I m only wanting to get maximum understanding how fpc does it and what MAYBE can be improved, if not its fine thats what discussions are about, right?

We - as in the FPC developers - don't see a need to improve anything here, cause it is fine as it is.

So: Just for my understanding, why cannot object-types inherit then (advanced)record-types? they should kind of share the same memory-layout in Stack, dont they?

Because records are not objects. The compiler handles them completely differently.

In addition to that, Pascal is about clarity. If I now need to find out first whether my parent type is a record or an object, what is clear about that?

No, thank you.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 25, 2020, 03:56:17 pm
"New style classes do not impose that limit on such abilities." => To be read as: new style classes force usage of the heap (this is part of the design of new style classes). Therefore they cannot run into that limit.

Contrary, old style objects do not force usage of the heap. (Never mind that they could. That they could is not the point of the overall context. It was said that they should be on the stack.)

So I am "comparing"
- new style classes, particularly the fact that their design forces heap usage
versus
- old style objects, with regard to the fact that their design does not force them to be on the heap, and that in this case they are chosen to be on the stack
Now I understand what you mean. You don't compare classes and objects, you compare heap allocation vs stack allocation, and just use new style classes as a stand-in for heap allocated instances and old style objects as a stand-in for stack allocated instances.

So basically all you are saying boils down to the following: to allocate memory on the stack, the exact type must be known at compile time (modulo inherited types of the same size), while on the heap the type can be decided at runtime. Therefore, if you want to allocate memory for an object without knowing its exact type while writing your function, stack allocation cannot be used and one must resort to using the heap.

And as I already said multiple times, I completely agree with the argument about heap vs stack. I just could not see why one would talk about classes and objects and use them as stand-ins for heap and stack.

I think you mean that slightly differently (but it's too complex for me to bother putting into words).

Because actually it is (afaik) due to polymorphism that the inherited class can have a bigger mem footprint. And that is part of what causes the issue.

But it is not just polymorphism alone.
Then again, I don't think I ever claimed that.

As a general note "design of objects" is more broad than "design of polymorphism in objects".
As I said, to me, polymorphism only describes the access, not the allocation, of objects. Therefore I have never seen this as a polymorphism issue. As an example, polymorphism was used prior to OOP by having structs/records with a common prefix. This is very often employed in the Linux kernel, where structs are accessed through a pointer whose type only covers the first few fields, while the rest is omitted and only used internally. These data structures of course always need to be passed by reference, as their size is unknown to the outside.
So this whole issue is not an issue of polymorphism, because polymorphism, at least to my understanding, describes access through an already existing reference. Therefore polymorphism "requires" copy by reference, so the fact that copy by value does not work is not a limitation of polymorphism; it is a limitation of the memory model, and one that polymorphism is not intended to solve.

To the contrary, even in other memory models, polymorphism does not concern itself with memory allocation. Even with classes, the allocation is not polymorphic. You can't allocate a TStringList by calling a constructor of TStrings. Polymorphism can only be used after the type specific constructor/allocator has been called.

So while you can say that it limits polymorphism in static memory, because you can't do polymorphism without having memory allocated, it is not a limit of polymorphism itself, because how you allocate memory is not a problem polymorphism is designed to solve.

So copy by value is something I simply do not expect to work for polymorphic types, because it is something polymorphism does not try to address. Personally I think that even this:
Code: Pascal  [Select][+][-]
  1. TBase = object end;
  2. TChild = object(TBase) end; // No extra data
  3.  
  4. function Child: TChild;
  5. [...]
  6.  
  7. var
  8.   b: TBase;
  9. begin
  10.   b := Child;
  11. end;
should not be allowed, because copy by value between two different types should not be allowed.

But only if you know the size of the biggest possible derived object.
And that size can not be known in all cases (examples given early on in the discussion)
Well, as Pascal is a statically typed language, one could write a preprocessor that automatically generates a type into which all derivatives of the superclass fit, as their sizes need to be known at compile time. But this would just be a hack. So yes, if you want to use static memory like the stack, the size of the object needs to be static and known at compile time; or, to be more precise, it needs to be known at the time the function is written.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 25, 2020, 04:34:19 pm
@PascalDragon
Pls read my post again (I have edited it a couple of times): how does C++ do all that within the stack area and get away with it??
Pretty simple: it's a separation of concerns. You can put classes wherever you want: heap, stack, data segment, etc. But you can't simply put everything on the stack and expect it to work. You must choose the correct allocation method for the data at hand.

In C++ you need to think about what you need. The data segment is static memory with global lifetime; placing an object here ensures that it is never removed from memory, but it must be known at compile time which objects are placed here (as each requires its own global variable).
The stack is a semi-static memory with limited lifetime. Objects placed here will be gone as soon as the function returns. Most objects here are also known at compile time, but using alloca and VLAs you can actually allocate memory dynamically in the current stack frame, something FPC simply does not support. Still, you are restricted to allocating memory in the current stack frame, i.e. a function can only dynamically allocate stack memory that lives during its own lifetime. The heap is dynamic and manually managed, meaning the memory gets allocated and freed when you request it (smart pointers/reference counting can automate this) and lives as long as you wish.

So for your classes it does not matter in C++ where they are placed, but you need to know how you want to use them and decide based on that where they should go.
For example, if the exact type and size of the data you allocate is not known beforehand, the data segment is off the table, because here you need to define at compile time which objects reside where. The stack, while inconvenient, can handle dynamic allocations using alloca, and the heap is free for all.
If your data should survive the return of a function, the stack is off the table; the data segment and the heap can easily survive as long as the program runs.
If you need to allocate a lot of data but don't want to waste memory all the time, i.e. you want to free it as soon as possible, the data segment is off the table, as it lives from start to finish, and the stack might also be off the table, because the current function might live longer than the individual objects allocated there.

The heap is always the *safe* solution, as it has pretty much no restrictions. The problem is that you need to manually allocate and free the data, and that heap allocation is pretty slow (compared to alloca, where dynamic allocation on the stack is pretty much just a single subtraction from the stack pointer). This is why many high level languages only use the heap. Sure, it adds overhead, but as long as you don't care about every bit of performance, it is the safe alternative which gives you the most freedom.
There are other options like a stack allocator placed in the heap, but thats a whole different topic.

It should be noted that while alloca allows for dynamic allocation on the stack, it is really messy and should be avoided when not necessary.

So to summarize: C++ just allows you to do more, but you need to know the limitations of each of your options. The stack is no panacea; you can't just place everything on the stack and think it will work. Each allocation method has its advantages and limitations, and a big part of learning C++, where beginners often struggle, is learning what to use when, and how.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 08:17:31 pm
@Warfley

And to what extent is FPC different from C++ in terms of memory model? (no sarcastic comment!)

Like, what are the further limitations which can't just be added? I have a hard time really understanding that objects are not capable of inheriting from records. Where the hell is the difference? They can both be placed on the stack, both can have functions/procs, both can contain pointers to other data, both cannot implement interfaces. The ONLY difference is that objects can inherit from other objects, that's it. For some reason, I'm sorry @PascalDragon, I cannot believe that they are soooooo different in terms of implementation, if this is the only thing they differ in.

But tbh, despite the fact that objects currently do not allow management operators (which doesn't make sense to me either; the same optimization rules work for them as for records), they are kind of like C++, actually a bit better, since you have the more convenient way of saying "I want my data on the heap with non-pointer semantics" (aka classes), and on the other hand you have the ability to have objects on the stack if that makes sense for your application (objects if polymorphism is needed, records otherwise).
Title: Re: How optimized is the FPC compiler
Post by: Awkward on December 25, 2020, 09:19:49 pm
Shpend, you forgot one detail when comparing records with objects: by nature, objects can have virtual methods, not only plain inheritance. And vice versa, records can have helpers (I still don't know why those don't exist for objects).
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 09:45:11 pm
Yeah, even that, @Awkward. IMHO, objects are slightly less powerful classes for the stack (disregarding the heated conversation between @Warfley and @Martin_fr; I am simply saying they are more at home on the stack than anything else :-D, to keep it simple). But to draw a conclusion: they now only need the ability to be extended through helpers (as you stated) and the whole management stuff (I would love move semantics to find their way to records and objects then), and you'd basically have, IMHO, a better memory model than C++. That's what I personally think, anyway.
Title: Re: How optimized is the FPC compiler
Post by: nanobit on December 25, 2020, 10:00:46 pm
I am simply saying they are more at home on the stack than anything else

objects are for stack and heap memory, just like records are:
a: array of TSomeObject
a: array of TSomeRecord
setLength(a, count);
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 25, 2020, 10:02:24 pm
@Warfley

And to what extent is FPC different from C++ in terms of memory model? (no sarcastic comment!)

Like, what are the further limitations which can't just be added? I have a hard time really understanding that objects are not capable of inheriting from records..
To that I want to add that, internally, objects would just get what a record already has: its fields and methods, that's it. They wouldn't need to bother with any VMT (from the record they inherit from, I mean) or anything else regarding auto-refcounting, since records don't allow interfaces either (which makes sense), and so it shouldn't be much trouble to allow objects to just extend the record(s) they inherit from.
Title: Re: How optimized is the FPC compiler
Post by: lucamar on December 25, 2020, 11:23:40 pm
To that I want to add that, internally, objects would just get what a record already has: its fields and methods, that's it. They wouldn't need to bother with any VMT (from the record they inherit from, I mean) or anything else regarding auto-refcounting, since records don't allow interfaces either (which makes sense), and so it shouldn't be much trouble to allow objects to just extend the record(s) they inherit from.

It's just the other way around: (advanced) records are now getting some features from objects (mainly methods), but because of the lack of inheritance (and other limitations, like the lack of virtual methods), they can have a much simpler implementation, without VMTs etc., than old-style objects.

Old-style objects were an evolution of records, but of the simple, standard records, and were born as a way to introduce OOP into Pascal. As such, they needed a completely new implementation, all the more as they themselves evolved, gaining public/private sections, virtual and static methods, etc., which made them incompatible with ... let's call them "old-style" records.

Besides that, they fill, along with classes, conceptually close but different programming "niches", and their respective internal implementations are very different, due in part to this and to their different histories.

Advanced records, in turn, were born of the realization that with a few relatively simple extensions one could get some of the vaunted advantages of OOP without, in fact, using OOP at all. So they have little to do with objects, much less classes, other than a similar syntax.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 26, 2020, 05:00:40 am
And to what extent is FPC different from C++ in terms of memory model? (no sarcastic comment!)

Like, what are the further limitations which can't just be added? I have a hard time really understanding that objects are not capable of inheriting from records. Where the hell is the difference? They can both be placed on the stack, both can have functions/procs, both can contain pointers to other data, both cannot implement interfaces. The ONLY difference is that objects can inherit from other objects, that's it. For some reason, I'm sorry @PascalDragon, I cannot believe that they are soooooo different in terms of implementation, if this is the only thing they differ in.

There is one limitation that Pascal, through the structure of the language, has and that C++ simply does not have with regard to the memory model, and that is scoping.
Take this for example:
Code: C  [Select][+][-]
  1. void Foo() {
  2.   // A lot of code
  3.   {
  4.     SomeObject obj;
  5.     // Some code
  6.   }
  7.   // A lot of code
  8. }
obj's lifetime is restricted to the block it is located in, i.e. from the point where it is defined to the } of the block it is defined in. In Pascal, every variable's lifetime is always the whole function it is defined in.
This has some non-trivial implications. C++ automatically calls the destructor, as well as, if no other constructor is explicitly used, the constructor that takes no arguments. Meaning the destructor and the no-argument constructor in C++ are comparable to the management operators of advanced records.
This allows for the following constructs:
Code: C  [Select][+][-]
  1. void foo() {
  2.   {
  3.     std::ofstream file_stream("foo.txt");
  4.     file_stream << "foo\n";
  5.   }
  6.   // a lot of code
  7. }
The constructor here opens the file foo.txt, and the destructor automatically closes it when the } is reached. This means the file is closed during the "a lot of code" section.
In Pascal this would not be possible this way, because the lifetime of a variable ends with the end of the function; if the same mechanism were used (i.e. management operators), the file would be kept open during the whole of the other code section.
That of course gives C++ a lot more control over the lifetime of objects in that regard. Another thing is initialization/constructors:
Code: C  [Select][+][-]
  1. if (condition) {
  2.   SomeObj obj;
  3.   //...
  4. }
In C++, obj would only be initialized if the condition is true. In Pascal, the object must be initialized when the function starts and finalized when the function returns.

This has some implications: 1. through the placing of blocks you can explicitly define when and where the object's initialization and finalization code is going to be called, and 2. you don't need try-finally anymore (in fact, try-finally does not exist in C++), as the destructor is called like a finally block.

If you want the same level of control in Pascal, you cannot use management operators, but must resort to manually calling the constructor and destructor, as is done with old style objects:
Code: Pascal  [Select][+][-]
  1. procedure Foo();
  2. var
  3.   file_stream: OStream; // just pretend there is an object or record like this
  4. begin
  5.   // A lot of code
  6.   file_stream.init('foo.txt');
  7.   try
  8.     file_stream.WriteLn('foo');
  9.   finally
  10.     file_stream.close;
  11.   end;
  12.   // A lot of code
  13. end;
And this is the strength of C++ classes and its memory model: you can achieve the same level of control over lifetimes with much less code. Because at the point where you have to call the constructor and destructor manually, the only advantage a stack object has over a heap based class is a tiny bit of performance.
Personally I think that in most situations clean code is more important than performance. And while C++ has a lot of things that make code really hard to read and understand, its scope-based object lifetime is a massive advantage for keeping your code clean. And this is something that, simply by language design, is not possible in Pascal.

That said, I think that objects living too long, or being initialized even when not necessary, is often only a minor performance drawback. And if the performance does not matter, management operators allow for much cleaner code than their C++ equivalent; you just lose some lifetime control and performance. But I would argue that most of the time this does not matter.
And in fact, I already built a file stream record type like the C++ example above using management operators. By putting all the file related stuff in its own function, you still guarantee the file is not open unnecessarily long, while simultaneously getting rid of the try-finally block and the manual destructor call, massively cleaning up the code:
Code: Pascal  [Select][+][-]
  1. procedure CopyFile(const source, target: String);
  2. var
  3.   fr: TFileReader;
  4.   fw: TFileWriter;
  5.   line: string;
  6. begin
  7.   fr.Open(source);
  8.   fw.Open(target);
  9.   for line in fr.lines do
  10.     fw.WriteLn(line);
  11. end;
And this is why I love management operators. (This is just an example; I actually implemented a more efficient CopyFile function that works not on a line basis but on a fixed-size buffer basis.)

About the internal implementation of records and objects: you have to see it in a different light. Advanced records are, with respect to the lifetime of the Pascal language, fairly new. First there were records, then objects, then classes, and then advanced records.
So while today the functionality of objects and advanced records might be quite similar, they developed completely differently.
I don't know how they are implemented in FPC, but I can fully imagine that, as there was at first no intention to add features to records, records and objects are implemented completely differently; and if you start from a different code base, it is going to develop differently in the future.

Also, what you should not forget: there are two distinguishing features. Records can be variant, i.e. have different fields sharing the same memory:
Code: Pascal  [Select][+][-]
  1.   TColorRec = record
  2.   case Boolean of
  3.   True: (color: TColor);
  4.   False: (R, G, B, A: Byte);
  5. end;
  6. ...
  7. colRec: TColorRec;
  8. ...
  9. colRec.R := 255;
  10. colRec.G := 255;
  11. colRec.B := 255;
  12. colRec.A := 0;
  13. Form1.Color := colRec.color
and a feature of objects that records don't have is the use of a virtual method table. So they are inherently different. So honestly, I have no doubt when PascalDragon says they are implemented differently. They were historically completely different, and it took a long time before they became as similar as they are today.
For example, if I remember correctly, even though advanced records were already a thing, it took a while before records got constructors and with them the new(obj, constructor) syntax we know from objects. Originally they were never intended to be so similar.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 26, 2020, 10:03:28 am
Thanks for the detailed explanation, @Warfley. The only thing is, objects could really complete the niche of old-style stack types if they had:

Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on December 26, 2020, 10:44:22 am
Like what are there more of limitations which cant be just added, like I have hard times really understanding that objects are not capable of ineriting records, where da hell is the difference, like they can both be placed on stack, both can have functions/procs, both can contain pointer to other data, both cannot implement interfaces the ONLY difference is that object can inherit other objects thats it, for some reason, im sry @PascalDragon i cannott believe that they are soooooo much different in terms of implementation.. if this is the only thing they are different of.

Believe what you want. I know the compiler's code, you don't.

  • type helper

Support for type helpers for object types is somewhere on my todo list for the sake of completeness.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 26, 2020, 11:06:45 am
Quote
Believe what you want. I know the compiler's code, you don't.
Well, this is true, but as I stated multiple times, there is nothing wrong with considering an addition to the compiler that would make sense. Obviously you have to check for yourself how hard it would be to add, but it is far from being an inappropriate or illogical addition.
That aside for a moment: the more important things are the management operators. For completeness they are more of a must-have than type helpers, since they offer far more opportunities.
Title: Re: How optimized is the FPC compiler
Post by: beepee on December 26, 2020, 04:40:06 pm
Hi,
I generally notice that executables compiled by Free Pascal are over twice as big and run at about half the speed compared with the same code compiled by Delphi 7.
The exception is my graphics programs, where the Free Pascal executable is about the same speed and has fewer bugs than Delphi 7
(my Delphi 7 is from ~2005, so it is quite old, and 64-bit things, e.g. seek(), are not well supported in this version).

Edit:
I apologize: using FPC 3.2.0 I see that Free Pascal is slightly faster, but the code size is still bigger. And to @Shpend below: yes, all optimizations I can find are used, no debug info.

To @Handoko below: I will take a look at Build Modes, but I am used to compiling from the command line with fpc.exe, which uses an fpc.cfg where debug info is off and optimize and strip are already set. And OK, somewhat bigger code is not a big problem nowadays.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on December 26, 2020, 05:35:38 pm
@beepee

Did you check whether you were compiling with debug info? And was the target platform the same?
Title: Re: How optimized is the FPC compiler
Post by: Handoko on December 26, 2020, 05:42:19 pm
Lazarus / Free Pascal generates bigger binaries because it compiles for multiple platforms, so some extra code needs to be added. It is also not fair to compare with Delphi 7, because FPC supports newer features, Unicode for instance (I'm not completely sure which).

The compiler supports several optimization techniques, and the size difference won't be noticeable on large projects.

But maybe you didn't know: the default configuration adds debugger info to the generated binary. If you use Lazarus you can disable it via:
Lazarus main menu > Project > Project Options > on the left side > Debugging > disable "Generate Debugging Info" and enable "Strip Symbols"

There are some other things you can do to get the smallest binary; you can search the forum to learn more.

To make it easier to enable/disable these configuration settings, you should enable and use Build Modes. Check the documentation if you're interested.
Title: Re: How optimized is the FPC compiler
Post by: Warfley on December 26, 2020, 06:04:07 pm
Quote
The only thing is, old-style objects could really fill their niche completely if they had:
I would not say this: C++ has many differences that are not necessarily in the memory model itself but that make C++ much more convenient for high-performance programming.

For example, C++ has references, which are semantically like a one-time pointer (i.e. a pointer that, once set, cannot be changed and is not allowed to be null/nil) but syntactically behave like direct access to the object:
Code: C++  [Select][+][-]
  1.   int i_local;
  2.     int &i_ref = i_local;
  3.     i_ref = 42;
  4.     std::cout << i_local << std::endl;
And there are a lot of shenanigans you can do with templates, for example variadic template arguments. This allows for some very neat things, like constructing objects in place inside other data structures.

Let me give you an example. Say you want to construct a large graph. You allocate a lot of memory in a short amount of time (while building the graph) and deallocate all of it at once (when tearing the whole graph down).
Using the heap directly might be too slow here, since heap-allocating and freeing every node takes time; you want to allocate the memory in bulk and free it all at once. Also, in the classical approach you need to traverse the tree to free the nodes, which is terrible for cache locality.
In this case you want a stack allocator: a data structure that reserves a large amount of memory and allocates objects on it like a stack. You can't free individual objects during its lifetime, but once the stack is torn down, all objects are destroyed simultaneously.
This is how you would build that in C++:
Code: C++  [Select][+][-]
  1. #include <cassert>
  2. #include <cstdlib>
  3. #include <iostream>
  4. #include <vector>
  5.  
  6. struct TreeNode {
  7.   virtual int child_count() = 0; // abstract method
  8.   virtual TreeNode &get_child(int idx) = 0; // abstract method
  9.   virtual int get_value() = 0; // abstract method
  10. };
  11. struct TreeBranch: public TreeNode {
  12.   TreeNode &left;
  13.   TreeNode &right;
  14.   virtual int child_count() override { return 2; }
  15.   virtual TreeNode &get_child(int idx) override { return idx ? left : right; }
  16.   virtual int get_value() override { return left.get_value() + right.get_value(); }
  17.   TreeBranch(TreeNode &_left, TreeNode &_right): left(_left), right(_right) { } // constructor
  18. };
  19. struct TreeLeaf: public TreeNode {
  20.   int value;
  21.   virtual int child_count() override { return 0; }
  22.   virtual TreeNode &get_child(int idx) override { assert(false); std::abort(); }
  23.   virtual int get_value() override { return value; }
  24.   TreeLeaf(int _value): value(_value) { }
  25. };
  26. ...
  27. // reserve the capacity up front so emplace_back never reallocates
  28. // and the references handed out below stay valid
  29. std::vector<TreeBranch> branch_memory;
  30. branch_memory.reserve(1024 * 1024);
  31. std::vector<TreeLeaf> leaf_memory;
  32. leaf_memory.reserve(1024 * 1024);
  33.  
  34. TreeNode &leaf1 = leaf_memory.emplace_back(1);
  35. TreeNode &leaf2 = leaf_memory.emplace_back(2);
  36. TreeNode &leaf3 = leaf_memory.emplace_back(3);
  37. TreeNode &branch1 = branch_memory.emplace_back(leaf1, leaf2);
  38. TreeNode &root = branch_memory.emplace_back(branch1, leaf3);
  39. std::cout << root.get_value();

In this code, not a single copy (or move) operation takes place; everything is passed by reference. std::vector<T>::emplace_back constructs the new element directly where it will be stored, by forwarding the arguments given to the function 1:1 to the type's constructor.

In Pascal the allocator would need to just allocate the memory and return a pointer, so the calling code can invoke the constructor manually:
Code: Pascal  [Select][+][-]
  1. // let's assume similar definitions of the types
  2. var
  3.   branch_memory: specialize TVector<TTreeBranch>; // let's assume that a type like this exists
  4.   leaf_memory: specialize TVector<TTreeLeaf>;
  5. var
  6.   leaf1, leaf2, leaf3, branch1, root: PTreeNode;
  7. begin
  8.   leaf1 := leaf_memory.emplace_back;
  9.   leaf1^.init(1);
  10.   leaf2 := leaf_memory.emplace_back;
  11.   leaf2^.init(2);
  12.   leaf3 := leaf_memory.emplace_back;
  13.   leaf3^.init(3);
  14.   branch1 := branch_memory.emplace_back;
  15.   branch1^.init(leaf1, leaf2);
  16.   root := branch_memory.emplace_back;
  17.   root^.init(branch1, leaf3);
  18.   WriteLn(root^.get_value);
  19. end;
You can achieve the same behaviour, but it is more code and less readable. So again I am back at what I said in my very first post in this thread: you can do the same in Pascal as in C++, but in C++ you simply write less (and cleaner) code to get the same efficiency. And this is purely down to the language design of C++.

That said, this is still a very niche thing. The example above is a reduced version of a problem I actually had to face, where the overhead of heap allocation was simply too slow for my purposes (the graph required multiple gigabytes of memory), so I needed to build a stack allocator.
C++ makes this very easy because, as you can see, std::vector already provides the required functionality. But many programs, especially those for which Pascal is predominantly used (like most GUI programs), rarely find themselves in such situations.

Pascal is not C++ and does not even try to be like C++. Different languages have different strengths. Honestly, C++ is much more complex than Pascal, and while it is great for things like the above, I would never use C++ just to build a small GUI application, precisely because of its complexity. C++ is also much harder to learn than Pascal for exactly this reason.
Different languages are like different tools. I don't need Pascal to be like C++, the same way I don't need to add a hammer head to a screwdriver. If you try to do everything, you end up being bad, or at best mediocre, at everything.

To summarize, Pascal and C++ have different goals by design, and I would argue that's a good thing. I like management operators not because they allow for efficient programming, but because they clean up the code. C++'s stated goal is to be as efficient as possible; all language features are designed with this in mind, even if that makes the language more complicated. If Pascal became a C++-lite, there would literally be no reason for me (and probably for most people) to use it instead of C++.
Even though some features would be nice, ultimately we are talking about a niche where Pascal is not the language of choice to begin with. Better to focus on the things Pascal is already good at than to try to improve something Pascal isn't that useful for in the first place.
Title: Re: How optimized is the FPC compiler
Post by: marcov on December 26, 2020, 09:33:54 pm
Quote
I generally notice the executables compiled by Free Pascal are over twice as big and run about half the speed compared with the same code compiled by Delphi7.

Just like more modern Delphi. D7 is minimalist and has no deep support for Unicode or anything else introduced after 2003 or so. If you need to compare with Delphi, use a recent one, not something ancient.

Also, binary size minimization is not a core target at the moment; nobody wants to do the complex work on it, such as improving smart linking (except PascalDragon, occasionally).
 
Title: Re: How optimized is the FPC compiler
Post by: Shpend on September 21, 2021, 11:34:54 pm
Hey, I found this; can someone explain more about it?

https://gitlab.com/freepascal.org/fpc/source/-/issues/35825
Title: Re: How optimized is the FPC compiler
Post by: damieiro on September 23, 2021, 06:42:50 pm
Well, I have read the whole thread and I have some contradictory feelings.

First of all, the basis of my argumentation:

- FPC and C, for me, have similar speeds. We can argue whether move/copy, using pointers, etc., can be more convenient. But the tools are there in both Pascal and C. If I really go for speed, I would look (as others said) at how many memory allocations/copies/moves/pointer passes and data or system calls I am doing. Real speed needs smart thinking, and I do not see many differences between Pascal and C in their *basic* mode.
As a very quick example: convert any N-tree algorithm into an array algorithm and you get the same speed.

With this point of view, things like a += b vs. a := a + b are, for me, syntactic sugar. They compile to the same intermediate code, and then to the same assembly if properly done. Likewise, a := a + 1 should give us an inc(a).

But I think we have several issues related to the class/object model.

1.- We don't have a void heap class. I haven't read about this in your posts, but Object is not TObject, even if both live on the heap. Object is a void one: no methods, no data, nothing. Object has no methods like AfterConstruction/BeforeDestruction, for example, no interface data and things like that. Object is a lightweight version of TObject even on the heap. If we need a really void heap class, there is nothing, and I have always said that one of this kind is really needed.
2.- I think we need to decide whether we want to be a 25-year compiler that mimics Delphi/Embarcadero with its good and bad ideas, or whether, besides that, we want a compiler that goes its own way while still supporting Embarcadero/Delphi things.
What is the problem? We cannot reuse Object, for backwards compatibility (and Embarcadero even deprecates it). Well: use (for example) FPCObject to do new and cooler things: the new and shiny Object implementation for FPC, for use on the stack with all the power of a Delphi class. Or explore that route. We cannot use anything that is not a TObject descendant. Well then: we could make a TLazarusClass* (with NOTHING, the minimal one) and derive TObject from it. Then we could derive from TLazarusClass* without all the bloated TObject and do newer, cooler and lighter things, with a nice syntax. Embarcadero won't do it for us. I think it's clear we have a mess here. Let's study it and make a good Object Pascal, learning from all our experiences: from the compiler view, on how to optimize; from the syntax view, to unify things; and so on. We are a bigger community than Embarcadero's, and we know the tool.

* (Or other cool name).  :D

3.- I'm an FPC user. I love it. C++ should learn how to do things from us, even with our issues. I read some posts whose tone suggests that we cannot compete. Well, I think we can compete and even do better than others; that is the reason we are using FPC and Lazarus and not the reverse. And perhaps it is time to think about what we want to be when we grow up, rather than being children of the oldies repeating the same old mistakes.

Note: this is not a blame on the FPC devs. It's on us as a community: perhaps we should show what the road ahead could be and whether, as a community, we feel comfortable with the roadmap. I think many of us dislike the OOP implementation (that is just my impression; an open poll would give us the community view, for example), but there is no study group of users to make a proposal, no poll, no easy way to make a proposal with a formal community review. That would enforce a well-weighted and balanced evolution with a clear objective, not just our personal tastes (yes, I have my own tastes too :P ). Perhaps the same goes for other issues: the standard, additional libs, etc.

edit: As an example, imagine we are working on a renewed object/class model.
We should:
1.- Form a working group together with the dev team, with some basic agreements:
  a) The devs will go for it. If not, there is no working group, only a theoretical study group with no real consequence.
  b) Little bites, not big ones. We all have a life.
  c) No *mumble mumble* "do your own fork". We are a community, but we are not numerous enough to split or fork, and it's a bad strategy. It's better to say: "there aren't enough people to do it, but if there were, we would go for it", or: "even if there were enough people, we think it's a bad idea for this, this and this reason", and keep a record/FAQ/forum thread to debate and document that.

...or something like the above, if all the people doing the work agree on the terms.
Title: Re: How optimized is the FPC compiler
Post by: Shpend on September 28, 2021, 10:16:19 pm
I like your view personally, mate!

I hope the devs really keep an eye on that :)
Title: Re: How optimized is the FPC compiler
Post by: Blade on September 29, 2021, 12:36:28 am
Quote
But I think we have several issues related to the class/object model.

Could you give your opinion on advanced records?

Clearly this is the direction that Delphi/Embarcadero went, so I'm trying to understand your direction a bit better.

Also, I'm curious whether you are coming from a language that was class-based-OOP centric and so feel compelled to continue similar usage in Object Pascal, versus the options it currently offers.
Title: Re: How optimized is the FPC compiler
Post by: damieiro on October 14, 2021, 03:39:26 pm
Quote
Could you give your opinion on advanced records?
Clearly this is the direction that Delphi/Embarcadero went, so I'm trying to understand your direction a bit better.

Also, I'm curious whether you are coming from a language that was class-based-OOP centric and so feel compelled to continue similar usage in Object Pascal, versus the options it currently offers.

Advanced records are, IMHO, a valid point. It's Rust's view as well. They do the job if you do not need inheritance in your paradigm. From an old-timer's view, they are a kind of sugar-coated old unit (think of a whole unit as an advanced record). They are handy, and their potential for helpers, generics, etc. is very valuable. If you need encapsulation but not inheritance, I think they fit perfectly.

I have nothing against advanced records. OOP can benefit a lot from them, and I think they make a fresh tool for doing things. They mean that not everything that needs encapsulation and reuse *must be* an object. It was always quirky that, for example, system facilities (like opening/closing files), system calls, or simple things like a random generator had to be either procedural calls with assignments for persistent data (AssignFile, seeds...), or a full object that rarely has descendants, or hybrids like procedures with hidden persistent state. An advanced record sounds far better for many of those jobs.
Advantages of an advanced record (IMHO):
- Encapsulation.
- No inheritance, which saves space as well as runtime and compiler overhead (no OOP tables, etc.), and is faster.
- Avoids quirks like procedures with hidden persistent state (like the randomize/random calls). You expect encapsulation of data in an advanced record, but not in a procedure.
- Avoids assignment quirks (like assigning a file variable) and bad style like two calls for one service.
- Allows modern syntax like generics, and generics are a very powerful tool.
- Can be used and reused outside the scope of classes, which is handy.
- Better maintenance.
- Readable, and enforces good practices.
- A different tool that we didn't have, and a different tool is always welcome :)

So, I like advanced records.

On the other hand, I dislike having two OOP implementations (one nearly deprecated, the other not fully empowered) for the same niche of solutions. I think there is room for improvement and rethinking here.

And for the sake of completeness: I'm not coming from an OOP-centric language. Like many users, I started with C (not C++), Pascal and assembly (and BASIC :D ), on platforms from CP/M and DOS to all Windows versions, Unixes, Solaris, etc. Some Prolog, some Fortran, but that is my main background. OOP was too modern for me; I acquired the OOP basics later, but I like it as a very powerful tool for large deployments.
I am firmly convinced of code readability, code safety (the language should prevent programmer mistakes) and code efficiency; that is what makes a language all-purpose. And of a low-level-grounded compiler.
Pascal does all that: good to read, strongly typed, efficient, all-purpose. But I think we forget about the low level when we assume that low-level means a C-like language.
Well, Pascal is as low-level as C. Safer. Strongly typed. Readable. But capable of the same speed and low resource consumption as C. If we voluntarily build our TObject implementation on higher-level ground for nothing, we are surrendering that ground to C++ for nothing. A better approach is a very basic TVeryPrimitiveObject (a pure void class) from which the more advanced TObject we use today is constructed; then we would have both advantages: the lower ground, which can give us a lot of happiness with smart people working on it, and the middle ground which is used now.
The old, nearly deprecated TP Object was on that lower ground. Not because of stack or heap, but as a base: the void class as the base of the TP object. We could build a TObject from a TPObject, but not the reverse. And that is the key here.

(PS: sorry for the late answer, I'm having some health issues :( )

And one personal opinion:

if you come from the one-file, one-object way of coding, the most beautiful and readable code you can achieve, IMHO, is FPC code. With really fast compilation and fast results.

Title: Re: How optimized is the FPC compiler
Post by: SymbolicFrank on October 28, 2021, 08:34:55 pm
If you spend half your time thinking about how to optimize your code, you could be twice as productive simply by not doing so. It is very rare that speed is an issue, and in almost all cases you can speed up the bottleneck by changing the underlying approach, which has nothing whatsoever to do with the compiler.

Simple example: if you have a huge dataset that you filter in many places to extract the relevant information, you could speed that up a lot by using multiple, different queries and/or datasets. And you could speed those up by using stored procedures instead. That has a far greater impact on the speed of your application than any compiler optimization.

In short: always ignore speed unless it becomes a problem, at which point you should search for the bottleneck and fix only that.


Yes, I know that is the opposite philosophy of C++, where speed is always the most important consideration.

For me, the most important thing is always the readability.
Title: Re: How optimized is the FPC compiler
Post by: Seenkao on October 29, 2021, 10:06:14 am
Quote
If you spend half your time thinking about how to optimize your code, you could be twice as productive simply by stopping to do so. It is very rare that speed is an issue, and in almost all cases you can speed up the bottleneck by changing the underlying way you do it. Which has nothing whatsoever to do with the compiler.

Simple example: if you have a huge dataset that you filter in many places to extract the relevant information, you could speed that up a lot by using multiple, different queries and/or datatsets. And you could speed those up by using stored procs instead. That has a far greater impact on the speed of your application than any compiler optimization.

In short: always ignore speed unless it becomes a problem, at which point you should search for the bottleneck and fix only that.


Yes, I know that is the opposite philosophy of C++, where speed is always the most important consideration.

For me, the most important thing is always the readability.
Translated from Russian:

Yes and no. It depends on the application being developed. If the end result is a program that will only be used, not developed further with, then you can simply write the application without hesitation and deliver the result, taking care of the critical spots.

But if the application/library will be used for further development, then everything should be as polished as possible, and every procedure that can be called should either be fully worked out or carry a note saying what is unfinished in it and why.

Otherwise, all the flaws the application/library brings with it can create problems for the programmer who uses it as a tool. He will have to look for another way or another library, or rewrite similar functions/procedures himself, which means you have simply forced him to do extra work when he wanted to be developing his main application.

As for readability, that can be understood in different ways. If it refers to formatting the text and documenting it, then yes. But if it means that the computer must understand whatever a person writes, then no, not at all!
If we write a program that is easy for the computer to process, even if it is not so easy for a programmer to read (and not readable at all for an ordinary person), that is closer to progress for both the person and the program.
If we write a program that an ordinary person understands easily but that "breaks a leg" of the machine we wrote it for, that is degradation of both the programmer and the machine. You will not get a proper result, and you will keep relying on the power of your computer while it keeps slowing down.

-----------------------------------------------------------------------------------
Translated from Russian:

I have tested programs (and will keep testing) and looked at certain places in the compiler. Optimization of the FPC compiler itself stalled about ten years ago; apparently nobody has been working on it and everyone relies on the computer's resources. Moreover, the FPC compiler is more heavily optimized for Windows than for Linux (and probably less optimized for other systems as well). At some points the compiler's behavior is unpredictable: it may perform an optimization at one moment and not at the next (for example, when working with static data in different procedures). I have not yet dug deep enough into the compiler to have fully examined what it does.

In some cases you can abandon Pascal's "String" and handle the text yourself. The compiler often adds code that incurs extra cost (which is the right default for most people; rarely, some want to control text processing themselves).
The StrToInt function is outdated (probably StrToFloat too; I didn't check). Using plain Pascal (not even assembler) it can be made at least twice as fast, if not more. IntToStr cannot be sped up with plain Pascal, since string overhead is added, and there the compiler handles the text quite well.

The FPC compiler has needed an optimization review for a long time now. Some parts it optimizes quite well, and some are simply ignored and treated as a plain sequence of code.
Title: Re: How optimized is the FPC compiler
Post by: SymbolicFrank on October 29, 2021, 08:28:14 pm
Ok, a long reply.


Stack and heap.

Ok, much of the things I know about how they are handled are old and probably out-of-date. Please correct me where needed.

Each process gets a stack allocated from the OS. You can specify the min and max sizes. The stack is used for parameters and local vars. Allocation is normally static and de-allocation is implicit: each time a function is called a block is reserved, and when it exits that block is discarded. Of course, the application can allocate a block for its own use on startup by moving the stack pointer downwards.

All global vars that are known and allocated at compile time are put in the data segment. You can allocate them dynamically on startup if needed.

For dynamic allocations and data management you use the heap. There might be a minimum and/or maximum size to the blocks you can request from the OS, but they can be allocated, discarded, grow and shrink during execution.

On the stack, everything is relative either to the top of the stack or to the stack pointer. You cannot de-allocate memory mid-stack, but you can allocate it. If you allocate and de-allocate such blocks dynamically, the stack keeps growing.

Now, that is not as much of a problem as it seems, because of virtual memory. Request a minimum stack size of 2 GB and you get an address range of that size, with one page of RAM mapped at the top. The rest of the range is not backed by actual RAM until you write something to it, at which point the OS maps a new page at that location. Then again, while you could mark pages disposable or remove them yourself, I don't think that is something you should do.

So: pre-allocated local vars: yes. dynamic memory management on the stack: no.

Memory management on the heap has always been inefficient for small blocks. So compilers tend to ship a runtime with its own memory manager built in: allocate large blocks from the OS and sub-allocate the small stuff inside those large blocks. Both levels are fully dynamic.

Each OS has its own implementation of the heap manager. Often more than one as well. Like, Windows has 3 or 4 different ones, depending on how you count them. And they all behave differently.

Why is it all so complex? Because of memory models and segmentation.

On a small CPU, with less than 64 kB of memory that runs a single process at a time, you load the static data at the bottom, the code goes on top of that and the stack starts at the end of memory and grows down. Everything in between is the heap, which you have to manage yourself. If the top of the heap meets the stack pointer, you're out of memory.

So, in this case, it is vastly better to allocate everything on the heap if you want to do any dynamic memory management. Ok, you need pointers to pointers to be able to move stuff around.

But, in general, allocations on the stack are either static or with a predefined size. Allocate three records, free the first and allocate a new one and you have the space of four in use.

When registers and RAM grow in size, we get bank switching and/or segments. You can put the whole address space of an application in a single bank or segment, or you can give it multiple. Popular are the division between code, data and stack.

If the max size of the data (heap) and stack banks or segments are the same, it becomes interesting to put data on the stack. You might even be able to use the bottom of the stack segment/bank as an extension of the heap. And probably the top of the code segment as well.

But if you have a 64-bit application, does it still matter? Everything is mapped linearly in a huge address space, where pages of 4kB (or sometimes 1 MB) of ram are inserted wherever they are used. You can give each function their own stack of multiple GB. It doesn't matter. And the memory management is basically the same for heap and stack, with the only difference that you cannot discard stack memory.

The application requests a large block of memory from the OS and starts handing out blocks on request. And while keeping track of static data and discarding stuff on the stack is easier, every time you access a location, a page of RAM is mapped into it. The OS might free pages below the stack pointer, but it might not. Because many applications use that space to store data. And the OS has no way to know if it is used or discarded.

The heap is something else. You can simply grow the heap almost indefinitely. Holes larger than a page are discarded by the OS and the RAM released. The only problem is fragmentation, if you have a large amount of pages that contain just a few small blocks of data but are mostly empty. The application can move them around, if you use pointers to pointers.

So, what is faster? What is more memory efficient? It depends on your CPU architecture and memory model and if it uses bank switching, segmenting or paging. There is no general statement you can make about it for FPC executables, which run on just about anything.


Strings, records, objects and classes.

Let's start with the basics: accessing blocks of memory in C(++) sucks. It's really unsafe and/or very slow. I mean, it is very fast if you never ask for the length and don't care if you accidentally read or write past the top of the buffer. The best example is, of course, strings. So you want to use containers that have boundaries built in, like records or objects. And if you use methods to access them, you can have them do the bounds checking for you.
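As a sketch of what "containers with boundaries built in" can look like in Pascal (every identifier here is made up for illustration, not from any library):

```pascal
program BoundedDemo;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}

uses SysUtils;

type
  // A fixed-size buffer that carries its own length, so access
  // can be bounds-checked instead of trusting the caller.
  TBoundedBuffer = record
    Len: Integer;
    Data: array[0..255] of Byte;
    function Get(Index: Integer): Byte;
  end;

function TBoundedBuffer.Get(Index: Integer): Byte;
begin
  if (Index < 0) or (Index >= Len) then
    raise ERangeError.CreateFmt('index %d out of range 0..%d', [Index, Len - 1]);
  Result := Data[Index];
end;

var
  Buf: TBoundedBuffer;
begin
  Buf.Len := 1;
  Buf.Data[0] := 42;
  WriteLn(Buf.Get(0));  // 42
  // Buf.Get(5) would raise ERangeError instead of reading garbage.
end.
```

The point is only that the accessor method, not the caller, owns the bounds check.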

Now, classes and generics in C++ were implemented as templates from the start. A template is a macro: it gets expanded and the compiler tries to compile the result. The good: all datatype-specific considerations are used. The bad: it's hard to say if the resulting code does what you intend it to do.

For example, does the destructor actually run? And when? Probably not when an exception occurs (there is no try .. finally). Constructors and destructors shouldn't have parameters for the best results. So it makes a lot of sense to put as much as possible on the stack, where objects are wiped out automatically when needed.
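For contrast, in Pascal the cleanup path is explicit rather than tied to stack unwinding; a minimal sketch:

```pascal
program FinallyDemo;
{$mode objfpc}{$H+}

uses Classes;

var
  L: TStringList;
begin
  L := TStringList.Create;
  try
    L.Add('hello');
    WriteLn(L.Count);   // 1
  finally
    L.Free;  // runs even if an exception occurs in the try block
  end;
end.
```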

Do we really want to copy that behavior?


Ok, my last point: you can implement almost anything you want in almost every programming language. But it might make things (a lot) easier or harder, depending.

When I have a project, first I think about how I want it to work. Like, what should the high-level stuff look like? What interfaces do I want for the medium-level stuff? And by an interface, I mean: everything you program around from two or more sides. So, a function with parameters and/or a result is an interface. You can specify it as part of your design.

And for the low-level stuff, I make a plan as well. Most of the time, I totally don't care where and how data is allocated. But sometimes I do. And in those cases, I make sure it is programmed like that. And I don't really care if that requires the use of malloc and pointers, or records, or objects (classes).

But I do care very much about the ease of use and readability. If you choose a complex way to do it, make sure it can be easily tested and is encapsulated in an easy-to-use package.
Title: Re: How optimized is the FPC compiler
Post by: Seenkao on October 29, 2021, 09:14:38 pm
SymbolicFrank, I read that mostly as food for thought.  :)
For the most part I have nothing to add; some of it you know better than I do! And I will probably reread it.

But in this case I am talking more about mundane optimization, as you mentioned. Small things:
- declarations of global variables: resulting in a faster application and smaller executable code.
- certain functions and procedures should have been reworked long ago, since they were written ages ago.
- working with static data: again faster and smaller code. FPC gives no guarantee at all that when it encounters a static block, it will transform and reduce it. It's like playing roulette: maybe yes, maybe no.

as an example:
Code: Pascal
const
  One = 1;
  Two = 2;
  Three = 3;
  Four = 4;
  OneAndThreeOrFour = One and Three or Four;
...
var
  z: longword;
begin
  z := One and Three or Four;  // not guaranteed to be folded at compile time
  // I have to do it manually ->
  z := OneAndThreeOrFour;
end;
Quite a few shortcomings like these have accumulated. I don't think I can uncover them all myself; this needs not one person but a group that will identify and fix them.

As for C/C++: I have no desire to chase C/C++. I'm not saying they are bad languages, but we must understand that Pascal and C are different languages, and there is no need to mould yet another C out of Pascal.
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on October 30, 2021, 03:36:53 pm
I tested the programs (and will continue to test), looked at certain places of the compiler. Optimization of the FPC compiler itself stalled about 10 years ago. Apparently, no one was engaged in it and everyone relies on computer resources.

I really wonder how you came to that conclusion, because if you look at the history of e.g. the x86-specific optimizations (https://gitlab.com/freepascal.org/fpc/source/-/commits/main/compiler/x86/aoptx86.pas) you can see that they are actively worked on, same for other platforms. Just because an optimization you consider good is not done does not mean that no optimizations are done; it all depends on what the devs prioritize.

In some cases, you can abandon Pascal's "String" and work with the text yourself. The compiler often (if you want to perform some actions with the text yourself) adds code that incurs additional costs (this is correct for most! Rarely, but some people want to control the process of working with the text themselves).
The StrToInt function is outdated (probably StrToFloat too, I didn't check it). By means of Pascal (not even assembler), it can be accelerated at least twice, if not more. You can't speed up the IntToStr function with pascal, text overhead is added, and in this case the compiler copes with the text quite well.

The point of Pascal is ease of use, not squeezing the last bit of performance out of the code. If you need that, you either drop to a lower level and use e.g. PChar instead of AnsiString, or simply use a different language that generates code as you like it (e.g. C++). This ease of use also includes maintainability of both the compiler and the RTL. Sure, you can write StrToInt in assembly for every platform, but that means a higher maintenance burden.
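As a sketch of what "dropping to a lower level" can look like without leaving the RTL: the Val procedure parses without the exception machinery of StrToInt (the wrapper name TryParseInt is made up for this example; Val itself is standard):

```pascal
program ValDemo;
{$mode objfpc}{$H+}

// TryParseInt: a thin wrapper around the RTL's Val, which parses
// without raising exceptions; Code receives the position of the
// first invalid character, or 0 on success.
function TryParseInt(const S: string; out Value: LongInt): Boolean;
var
  Code: Integer;
begin
  Val(S, Value, Code);
  Result := (Code = 0);
end;

var
  n: LongInt;
begin
  if TryParseInt('12345', n) then
    WriteLn('parsed: ', n)   // parsed: 12345
  else
    WriteLn('not a number');
end.
```

This is essentially what SysUtils.TryStrToInt does as well; the difference from StrToInt is only in how failure is reported.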
Title: Re: How optimized is the FPC compiler
Post by: SymbolicFrank on October 30, 2021, 09:28:38 pm
Optimizing things to the max requires selecting a single implementation first. Like in the example of PascalDragon, the CPU. Or, in my example, the memory model. The more you focus the usage, the better it can be optimized.

But, as PascalDragon said, the more platforms you have, the more you need to support. Rolling out a change isn't simply changing one block of source code anymore, but requires a revision of each and every different implementation.

A stupid example: how are you going to use a TStream? Big-endian, little-endian or UTF-8 network communication? Let alone the C# WCF way, where you push the object through in IL code and run that on the target platform. Who is going to make all those translations?

And then there are the endpoints, which I haven't even mentioned. Is it a file, a socket, a purely serial communication, USB master or slave? They're all quite different. Some always accept input, but most tell you to wait. Some can initiate communication whenever the need arises, but others have to wait and buffer until the master requests an update.

So, the trick is in making something that is understandable and "good enough" in general. And if it turns out to be the bottleneck for your application, it is up to you to make that highly specialized and optimized case. And it would be nice if you tell us why and how you did it, so we can learn from it.
Title: Re: How optimized is the FPC compiler
Post by: Seenkao on October 31, 2021, 12:55:42 am
I tested the programs (and will continue to test), looked at certain places of the compiler. Optimization of the FPC compiler itself stalled about 10 years ago. Apparently, no one was engaged in it and everyone relies on computer resources.

I really wonder how you come to that conclusion, cause if you look at the history of e.g. the x86 specific optimizations (https://gitlab.com/freepascal.org/fpc/source/-/commits/main/compiler/x86/aoptx86.pas) you can see that it's actively worked on, same for other platforms. Only because what you consider a good optimization is not done does not mean that no optimizations are done, because it all depends on what the devs priortize.
I didn't want to offend anyone with my words! I'm simply too blunt; that is a rather bad trait of mine.
Following the link, we see that certain specific optimizations are mostly done by one person (the others are not visible there, but that does not mean they are doing nothing). But one person is very few, considering how many platforms are supported.

Quote
In some cases, you can abandon Pascal's "String" and work with the text yourself. The compiler often (if you want to perform some actions with the text yourself) adds code that incurs additional costs (this is correct for most! Rarely, but some people want to control the process of working with the text themselves).
The StrToInt function is outdated (probably StrToFloat too, I didn't check it). By means of Pascal (not even assembler), it can be accelerated at least twice, if not more. You can't speed up the IntToStr function with pascal, text overhead is added, and in this case the compiler copes with the text quite well.

The point of Pascal is ease of use, not squeezing the last bit of performance out of the code. If you need that, you either drop to a lower level and use e.g. PChar instead of AnsiString, or simply use a different language that generates code as you like it (e.g. C++). This ease of use also includes maintainability of both the compiler and the RTL. Sure, you can write StrToInt in assembly for every platform, but that means a higher maintenance burden.
The main point of my words was that I wrote the word: Pascal.  :) And I said plainly that Pascal itself suits me! The optimization, as I see it, is not sufficient. And mostly it is mundane optimization: cases where it is plainly visible that nothing in the code can change while the program runs, yet the compiler does not transform them. Or it does, but not always, or only partially. Or only for a single platform: Windows.

It would be interesting to know what the difference is between Windows 64-bit and Linux 64-bit, which run on the same x86 platform. Why does the optimization happen for Windows but not for Linux? The examples with Single/Double used in procedure calls and in computing certain data are, I think, telling enough: on Windows the optimization works, on Linux it does not. (Sorry for bringing this topic up again, but it serves well as an example. (https://forum.lazarus.freepascal.org/index.php/topic,56719.0.html))

Now something that is clearly not optimized. StrToInt example:
Code: Pascal
const
  isByte      = 0;                 // len = 3                0..255
  isShortInt  = 4;                 // len = 4                -128..127
  isWord      = 1;                 // len = 5                0..65535
  isSmallInt  = 5;                 // len = 6                -32768..32767
  isLongWord  = 2;                 // len = 10               0..4294967295
  isInteger   = 6;                 // len = 11               -2147483648..2147483647
  {$If defined(CPUX86_64) or defined(aarch64)}
  isQWord     = 3;                 // len = 20               0..18446744073709551615
  isQInt      = 7;                 // len = 20               -9223372036854775808..9223372036854775807
  {$IfEnd}

type
  geUseParametr = record
    maxLen: LongWord;
    {$If defined(CPUX86_64) or defined(aarch64)}
    maxNumDiv10: QWord;
    maxNumeric: QWord;
    {$Else}
    maxNumDiv10: LongWord;
    maxNumeric: LongWord;
    {$IfEnd}
  end;

var
  resInt64: Int64;  // integer ???
  IntMinus: Boolean;

procedure SetNumberParametr;      // call once at startup; this speeds up the conversion itself
function geStrToInt(Str: String; Size: LongWord = isInteger): Boolean;

implementation

var
  allUseParametr: array[0..7] of geUseParametr;

procedure SetNumberParametr;
begin
  allUseParametr[isByte].maxLen := 3;
  allUseParametr[isByte].maxNumeric := 255;
  allUseParametr[isByte].maxNumDiv10 := 25;
  allUseParametr[isShortInt].maxLen := 4;
  allUseParametr[isShortInt].maxNumeric := 127;
  allUseParametr[isShortInt].maxNumDiv10 := 12;
  allUseParametr[isWord].maxLen := 5;
  allUseParametr[isWord].maxNumeric := 65535;
  allUseParametr[isWord].maxNumDiv10 := 6553;
  allUseParametr[isSmallInt].maxLen := 6;
  allUseParametr[isSmallInt].maxNumeric := 32767;
  allUseParametr[isSmallInt].maxNumDiv10 := 3276;
  allUseParametr[isLongWord].maxLen := 10;
  allUseParametr[isLongWord].maxNumeric := 4294967295;
  allUseParametr[isLongWord].maxNumDiv10 := 429496729;
  allUseParametr[isInteger].maxLen := 11;
  allUseParametr[isInteger].maxNumeric := 2147483647;
  allUseParametr[isInteger].maxNumDiv10 := 214748364;
  {$If defined(CPUX86_64) or defined(aarch64)}
  allUseParametr[isQWord].maxLen := 20;
  allUseParametr[isQWord].maxNumeric := 18446744073709551615;
  allUseParametr[isQWord].maxNumDiv10 := 1844674407370955161;
  allUseParametr[isQInt].maxLen := 20;
  allUseParametr[isQInt].maxNumeric := 9223372036854775807;
  allUseParametr[isQInt].maxNumDiv10 := 922337203685477580;
  {$IfEnd}
end;

// no validation of printable characters is performed
function geStrToInt(Str: String; Size: LongWord = isInteger): Boolean;
var
  lenStr, i: LongWord;
  m, n, z: QWord;
begin
  Result := False;
  if (Size < 4) or (Size > 7) then   // signed sizes only (isShortInt..isQInt)
    Exit;
  // zero the result; the flag initially marks it as not valid
  resInt64 := 0;
  IntMinus := False;
  lenStr := Length(Str);
  if lenStr = 0 then
    exit;
  i := 1;
  m := Byte(Str[1]);
  if m = 45 then                     // leading '-'
  begin
    if lenStr = 1 then
      exit;
    IntMinus := True;
    inc(i);
    m := Byte(Str[2]);
//      dec(lenStr);
  end;
  inc(i);                            // i now points at the next unread character
  m := m - 48;
  // check against the allowed length (the sign was handled above)
  if lenStr > allUseParametr[Size].maxLen then
    Exit;
  if i > lenStr then                 // a single digit: we are done already
  begin
    if IntMinus then
      resInt64 := -Int64(m)
    else
      resInt64 := m;
    Result := True;
    exit;
  end;
  while i < lenStr do                // all digits except the last one
  begin
    m := m * 10 + (Byte(Str[i]) - 48);
    inc(i);
  end;
  // if we have already exceeded the limit, bail out
  if m > allUseParametr[Size].maxNumDiv10 then
    exit;
  m := m * 10;
  z := Byte(Str[i]) - 48;            // the last digit

  // the signed and unsigned ranges have to be handled separately
  if IntMinus then
    n := allUseParametr[Size].maxNumeric + 1 - m
  else
    n := allUseParametr[Size].maxNumeric - m;
  if z > n then
    exit;

  if IntMinus then
    resInt64 := - m - z
  else
    resInt64 := m + z;
  Result := true;
end;

The speed-up is decent on the x86 platform and, in my view, surprisingly large on ARM (on Android I measured changes of 3 to 12 times!).
Note that no exceptions can occur in this code; I simply got rid of them, whereas FPC's StrToInt can throw. Here, if the number cannot be converted, the Boolean result tells us so; if it was computed, we can read the number from resInt64. It can be used on any platform.

Again, I'm not saying that the FPC development team isn't working! I'm saying that rather little is done about the small things. And I understand why: small things are thankless work that takes a lot of time, and the result is not always good.

P.S. Is it just me, or does StrToInt not want to work on Android? I have to use Val directly.
Title: Re: How optimized is the FPC compiler
Post by: SymbolicFrank on October 31, 2021, 10:44:17 am
With only 7 or 8 significant digits, Singles should be processed as Doubles, to prevent unneeded precision loss. Extended is a bit of a strange format: it exists because floating-point hardware that wants to comply with the IEEE 754 standard is recommended to use an internal 80-bit format to prevent that precision loss. It's not meant to be used directly in code.

Seen from that perspective, it makes a lot of sense to only use Doubles for all floating-point arithmetic, with the exception of things like currency, which are actually fixed-point integers. And in all cases, floating point hardware probably expands it to those 80 bits before processing.

If anything, only SIMD/vector units tend to use other formats: they have large execution units, e.g. 256 bits wide, which can also be split into multiple smaller words. That should only be used for speed, not for precision, and only if you don't mind the lowest bits containing nonsense, depending on the (number of) operations performed.
Title: Re: How optimized is the FPC compiler
Post by: marcov on October 31, 2021, 11:49:14 am
function geStrToInt(Str: String; Size: LongWord = isInteger): Boolean;

Const the string?
 
Title: Re: How optimized is the FPC compiler
Post by: Seenkao on October 31, 2021, 12:02:14 pm
I didn't quite understand the question, but I gather it is about "Str". It was probably just named carelessly; it could be changed to, say, "numStr".
If the question is why the function does not declare "Str" as a constant: I can't say precisely. But modifying the string at all is undesirable; it causes overhead added by the compiler itself (unless that is disabled by default).
Title: Re: How optimized is the FPC compiler
Post by: ASerge on October 31, 2021, 12:22:19 pm
I didn't quite understand the question, but as I understood about "Str".
It is more efficient to preface parameters of managed types with a const modifier.
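A minimal sketch of the difference (the function names here are made up for the example): passing an AnsiString by value makes the compiler insert reference-count bookkeeping on entry, while const lets it use the caller's string directly:

```pascal
program ConstParamDemo;
{$mode objfpc}{$H+}

// By value: the compiler adds an implicit try..finally and
// reference-count handling for the AnsiString parameter.
function LenByValue(S: AnsiString): SizeInt;
begin
  Result := Length(S);
end;

// const: no ref-count traffic, the string is used in place.
function LenByConst(const S: AnsiString): SizeInt;
begin
  Result := Length(S);
end;

var
  Txt: AnsiString = 'hello';
begin
  WriteLn(LenByValue(Txt), ' ', LenByConst(Txt));  // 5 5
end.
```

Both return the same result; the difference shows up only in the generated prologue/epilogue code.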
Title: Re: How optimized is the FPC compiler
Post by: SymbolicFrank on October 31, 2021, 12:36:21 pm
I just checked: on an Armv8, if you have a float unit, it is IEEE 754 compliant as well (which implies 80 bit calculations) and the vector unit seems to be 128 bits wide, but only supports up to 64 bit operations. Interestingly enough, both support half-floats (16 bits, 3-4 significant figures). Perhaps for crude graphics or neural networks? Memory is at a premium on a microcontroller.

Quote from: Armv8-A and Armv8-R Architectures
The Armv8 architecture supports single-precision (32-bit) and double-precision (64-bit) floating-point data types and arithmetic as defined by the IEEE 754 floating-point standard. It also supports the half-precision (16-bit) floating-point data type for data storage, by supporting conversions between single-precision and half-precision data types and double-precision and half-precision data types. When Armv8.2-FP16 is implemented, it also supports the half-precision floating-point data type for data-processing operations.

That leaves the question: do you want maximum speed on hardware without floating point support, even if that means that the results of your application will differ depending on the platform used? And if so, how easy should it be to upload libraries to the FPC repository that implement one such subset of calculations in that specific assembly language?
Title: Re: How optimized is the FPC compiler
Post by: Seenkao on November 01, 2021, 02:29:01 am
SymbolicFrank, I don't think converting a string constant into a floating-point number is hard, even taking all the nuances into account. But here we probably really do have to take into account how wide the numbers will be (80, 64, 32 or 16 bits).
I haven't worked on this yet; whether I will, I don't know. I've already spent quite a lot of time on assorted small things. And I still need to brush up on ARM/ARM64 assembly, to better understand what can be improved and what can't.
Title: Re: How optimized is the FPC compiler
Post by: munair on December 31, 2021, 12:39:34 pm
I generally notice the executables compiled by Free Pascal are over twice as big and run about half the speed compared with the same code compiled by Delphi7.

Just like more modern Delphi.  D7 is minimalist and has no deep support for unicode, or anything else after 2003 or so.  If you need to compare to a Delphi, use a recent one, not something ancient.

Also, binary size minimization is not a core target at the moment; nobody wants to do the complex work on it, like improving smartlinking (except PascalDragon, occasionally).
Not to mention that D7 targeted Windows only. Big difference.
Title: Re: How optimized is the FPC compiler
Post by: munair on December 31, 2021, 12:45:49 pm
Did anyone notice the post dates jumped back after reply #110?
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on December 31, 2021, 12:51:25 pm
Did anyone notice the post dates jumped back after reply #110?
Does it?
110: 2020-Dec-26
111: 2021-Sep-21
112: 2021-Sep-23

Title: Re: How optimized is the FPC compiler
Post by: munair on December 31, 2021, 12:55:20 pm
In any case, I don't really understand the comparison between languages optimization-wise. It's like asking who has the fastest car. Maybe in the old days when resources were limited, optimization could make a big difference. But today, the more relevant question is what language does a programmer prefer for specific targets, and there are a lot more considerations than "which language is fastest". As far as C-like languages are concerned, despite the never ending comparison discussions, there is a lot that these languages have against them. I generally find these discussions completely pointless.
Title: Re: How optimized is the FPC compiler
Post by: munair on December 31, 2021, 12:56:07 pm
Did anyone notice the post dates jumped back after reply #110?
Does it?
110: 2020-Dec-26
111: 2021-Sep-21
112: 2021-Sep-23
LOL, you're right.
Title: Re: How optimized is the FPC compiler
Post by: Akira1364 on January 02, 2022, 07:15:19 am
This is an old thread, however: while the compiler's codegen is pretty good overall, the RTL / packages are (sadly) chock full of code written seemingly without any kind of understanding whatsoever of how certain keywords actually interact with optimization. Like, if you pass something such as a large-ish record or reference-counted string by value in FPC, the compiler will generate horrendous code 100% of the time, period, end of story. You have to use `const` or `constref` or `var` depending on the context. It's not optional, particularly if you're writing something intended for use by a large number of people.

Furthermore, I don't even want to discuss the amount of time I've spent going into my local copy of the RTL / package sources and adding the `inline` modifier to one-liners that very clearly should have been written with it to begin with; it's just frustrating. Obviously it'd be nice if FPC had a more advanced form of the "AutoInline" switch turned on at all times, so that you could just rely on it to do the right thing, but as it stands it does the opposite (which is to say, no function without the `inline` modifier will ever be inlined, no matter what).
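For reference, a hedged sketch of the kind of one-liner being discussed (the function itself is hypothetical; the `inline` modifier and the `{$OPTIMIZATION AUTOINLINE}` directive are standard FPC):

```pascal
program InlineDemo;
{$mode objfpc}{$H+}
{$OPTIMIZATION AUTOINLINE}  // or mark individual routines, as below

// A typical one-line wrapper; without `inline` (or AutoInline),
// FPC emits a real call here instead of folding it into the caller.
function Square(X: Integer): Integer; inline;
begin
  Result := X * X;
end;

begin
  WriteLn(Square(7));  // 49
end.
```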

TLDR: great compiler, unfortunately shipped with overall too many slow libraries that constantly amount to "four to five completely un-inlined one-line function calls where each one just calls the next one, and the last one probably conditionally raises some silly exception with a resourcestring borrowed nearly verbatim from Delphi".

Not to say that there aren't decent alternatives: e.g. while TFPList (in my opinion) basically should have the `deprecated` modifier applied to it as a whole (since it's not that well optimized at all and goes against all notions of type safety by way of requiring mandatory "void pointer to literally anything" casts while using it), you can at the very least find some good stuff to replace it with both in the `FGL` unit and `Generics.Collections` unit that ship with FPC too. Or if you're willing to use third-party stuff, I highly recommend this library, which I think is basically unparalleled in quality as far as all data structures ever written in FPC go: https://github.com/avk959/LGenerics (https://github.com/avk959/LGenerics)
Title: Re: How optimized is the FPC compiler
Post by: 440bx on January 02, 2022, 08:16:28 am
<snip> no function without the `inline` modifier will ever be inlined, no matter what).
I just wanted to say that is a good thing. It makes the resulting code completely predictable which is often useful when debugging.

IOW, a function should never be inlined unless the "inline" modifier is present (and, if present, the compiler has to find it "acceptable")
Title: Re: How optimized is the FPC compiler
Post by: Akira1364 on January 02, 2022, 10:28:12 am
<snip> no function without the `inline` modifier will ever be inlined, no matter what).
I just wanted to say that is a good thing. It makes the resulting code completely predictable which is often useful when debugging.

IOW, a function should never be inlined unless the "inline" modifier is present (and, if present, the compiler has to find it "acceptable")


The current situation has far more downsides than upsides. A proper automatic implementation, as found in compilers for nearly every language I can think of, would be vastly better as the endgame solution. It could still be turned off per function where necessary (with the fairly new `noinline` modifier that does the opposite of `inline`), and it would probably vary in aggressiveness between -O1, -O2, -O3 and so on. With that in place you wouldn't have to rely on various people, who may or may not understand why appropriately setting the `inline` flags actually matters, in code you don't necessarily have direct control over.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on January 02, 2022, 11:33:10 am
Obviously it'd be nice if FPC had a more advanced form of the "AutoInline" switch turned on at all times so that you could just rely on it to do the right thing, but as it stands in reality it does the opposite (which is to say, no function without the `inline` modifier will ever be inlined, no matter what).

Have you tried?
Code: Text
  -OoAUTOINLINE
or
Code: Text
  {$OPTIMIZATION AUTOINLINE}
https://www.freepascal.org/docs-html/prog/progsu58.html

It's not on by default, but you can add it to your fpc.cfg.
Title: Re: How optimized is the FPC compiler
Post by: Akira1364 on January 02, 2022, 11:51:54 am
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.
Title: Re: How optimized is the FPC compiler
Post by: Martin_fr on January 02, 2022, 01:12:24 pm
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.
Well yes.

(Afaik) Inlining does not just copy the already generated assembly code into the new location (replacing the call).

Inlining (re-)compiles (partly) the code from the called proc in the new location. (Afaik it copies the "nodes", representing the parsed, and maybe partly processed/compiled, code into the new place. The final assembler is generated from that.)

But that is only possible if those nodes exist. And they only do if a proc was compiled with the intent to be inlined.


If for example some commercially sold code is released as ppu/o only, then they may not want to distribute such extra info (which could be used to gain insight into their unpublished code).
Title: Re: How optimized is the FPC compiler
Post by: PascalDragon on January 03, 2022, 02:08:28 pm
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.

Of course it does not, because it needs to have the node tree available in the PPU which is only the case if the routine has the inline directive or was compiled while the AutoInline optimization was enabled.
Title: Re: How optimized is the FPC compiler
Post by: Akira1364 on January 09, 2022, 02:17:47 am
Yeah, that's what I was referring to in my comment. It works ok-ish as-is, but does not have any impact on pre-compiled units of course.

Of course it does not, because it needs to have the node tree available in the PPU which is only the case if the routine has the inline directive or was compiled while the AutoInline optimization was enabled.

Right, yeah, I'm aware of why it doesn't work that way currently. Again, my original comment was envisioning a world where FPC simply auto-inlined appropriately all the time when compiling code (as most compilers do nowadays), regardless of the presence of any particular switch.

If for example some commercially sold code is released as ppu/o only, then they may not want to distribute such extra info (which could be used to gain insight into their unpublished code).

I think if that sort of thing were actually a realistic concern in conjunction with inlining specifically, it would be a known issue to at least some extent in other languages. As far as I've ever heard, it's not.
Title: Re: How optimized is the FPC compiler
Post by: abouchez on February 04, 2022, 02:20:43 pm
After decades working with Delphi, and now with FPC as my main compiler, I don't think FPC is much slower. It is actually faster in some areas.
Especially since FPC 3.2, code generation has been much better: much less bloated asm than previous versions produced.

I agree that the RTL is sometimes non-optimal (missing "const" or "inline").
But its purpose was to be cross-platform and maintainable.
This is why with my mORMot I rewrote most of the RTL and bypassed it, writing some optimized asm for the most critical parts.
https://github.com/synopse/mORMot2

Honestly, the server side is where performance matters most, especially multi-threaded performance.
For LCL or client applications, compiler quality is less essential: you can have reactive apps written in Python. :)

When I run the mORMot 2 regression tests, I have a huge set of realistic benchmarks covering a lot of aspects: data structures, JSON, cryptography, HTTP/WebSockets client and server, database, multi-threading...
Then we can compare between compilers and systems.
The fastest is FPC on Linux x86_64, with our own x86_64 memory manager written in asm. https://github.com/synopse/mORMot2/blob/master/src/core/mormot.core.fpcx64mm.pas is worth a try: it gives a noticeable performance boost in server processing and memory usage, especially for multi-threaded workloads.
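As a sketch of how a replacement memory manager like this is typically enabled in FPC (my own illustration; the unit name is the one published in the mORMot 2 repository, and it only compiles on FPC x86_64 targets with the mORMot 2 sources on the unit path):

```pascal
program ServerApp;
{$mode objfpc}{$H+}

uses
  // An FPC replacement memory manager must be the very first unit
  // in the program's uses clause, so it is installed before any
  // heap allocation happens.
  mormot.core.fpcx64mm,
  SysUtils;

begin
  WriteLn('Running with the mORMot x86_64 memory manager');
end.
```

Putting the unit anywhere else in the uses clause would mean some allocations are already made by the default manager before the replacement takes over, which FPC rejects at runtime.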

From those regression tests, Delphi is behind, especially due to the Windows overhead, and also because the Pascal code is more tuned for FPC.
Delphi outside Win32/Win64 is simply not optimized: even though they use LLVM as the codegen backend, LLVM is not leveraged and the resulting asm is not very good.
So I stick with FPC as my main target for optimization. And even if the AArch64 codegen is not optimal, it is pretty stable, and we can get good numbers with a few bits of asm and a statically linked library - see https://blog.synopse.info/?post/2021/08/17/mORMot-2-on-Ampere-AARM64-CPU

Of course, this is only an opinion.
And I like FPC so much that I am officially biased.  O:-) But I have no wish to go back to Delphi. Lazarus is so much lighter and more stable for daily use, and with fpdebug it is starting to be good enough for debugging. And as a compiler, FPC is amazing and does a very good job.
Open Source rocks!