
Author Topic: Default and speed effect  (Read 1497 times)

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9209
  • Debugger - SynEdit - and more
    • wiki
Re: Default and speed effect
« Reply #15 on: May 31, 2023, 04:43:17 pm »
At this point you should tell us
- your fpc version
- exact settings to compile with

Not sure what you mean by "Initialize", but I can see the "FPC_COPY" call.


In both cases, fpc creates a temporary variable. This is needed because, for managed types (dyn array), the variable "result" is a "var param". That is, "result" is passed to the function as a parameter.
Because the actual variable MUST NOT be changed until the function returns (otherwise, in many other cases, you would get wrong data), a temp variable is passed to the function.
When the function returns, the temp var is assigned to the real var (in this case using FPC_COPY).
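
To make the temp variable concrete, here is a rough hand-written sketch of the pattern described above. TTest is a hypothetical record with one dynamic array field (a stand-in for the type from the first post), and MakeTestInto and Temp are made-up names used only for illustration. The code fpc actually generates will differ by version and settings.

```pascal
program TempResultSketch;
{$mode objfpc}

type
  // Hypothetical stand-in for the thread's TTest: a record with a managed field.
  TTest = record
    Data: array of Integer;
  end;

// The function as the programmer writes it.
function MakeTest(N: Integer): TTest;
begin
  Result := Default(TTest);
  SetLength(Result.Data, N);
end;

// Hand-written approximation: the result becomes an explicit "var param".
procedure MakeTestInto(N: Integer; var Dest: TTest);
begin
  Dest := Default(TTest);
  SetLength(Dest.Data, N);
end;

var
  A, Temp: TTest;
begin
  // What the source says:
  A := MakeTest(10);
  WriteLn(Length(A.Data));

  // Roughly what happens conceptually: fill a temp, then copy it to the target.
  MakeTestInto(10, Temp);
  A := Temp;  // for a managed record this copy involves ref count handling
  WriteLn(Length(A.Data));
end.
```

The second half is only an approximation of the idea: the target variable stays untouched until the call has finished, and the final assignment is where the extra copy cost comes from.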


Now I don't know fpc internals, but I can see this.
WITHOUT "A := Default(TTest);" the variable "A" is ONLY accessed in the loop => the value is NOT used anywhere else.
It *appears* fpc is actually noticing that. Because FPC knows that the value in "A" is never used, it does not copy the result to A.

Smart decision.  Screws your benchmark though ;)
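
One way to try to keep a benchmark honest is to actually read the result, both inside and after the loop. The sketch below assumes a hypothetical TTest and MakeTest (not the code from the first post), and whether reading A is enough to keep the copy in place is an assumption that may depend on fpc version and settings.

```pascal
program BenchmarkSketch;
{$mode objfpc}

uses
  SysUtils;  // GetTickCount64

type
  // Hypothetical stand-in for the thread's TTest.
  TTest = record
    Data: array of Integer;
  end;

function MakeTest: TTest;
begin
  Result := Default(TTest);
  SetLength(Result.Data, 16);
end;

procedure RunBenchmark;
var
  A: TTest;
  I: Integer;
  Start: QWord;
  Total: Int64;
begin
  Total := 0;
  Start := GetTickCount64;
  for I := 1 to 1000000 do
  begin
    A := MakeTest;
    Inc(Total, Length(A.Data));  // read A so its value is used in the loop
  end;
  WriteLn('Elapsed ms: ', GetTickCount64 - Start);
  // Use A and the checksum after the loop as well.
  WriteLn('Checksum: ', Total, ', last length: ', Length(A.Data));
end;

begin
  RunBenchmark;
end.
```

Comparing the generated assembly with and without the reads is still the only reliable way to see what the compiler kept.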

Martin_fr

« Reply #16 on: May 31, 2023, 04:45:25 pm »
Quote
If there are no references to the variable, the compiler optimizes and does not create a temporary variable, avoiding calling the fpc_copy_proc.

Actually here it does create a temp in the case of "not accessed elsewhere". Maybe different depending on fpc version and settings.

Okoba

  • Sr. Member
  • ****
  • Posts: 435
« Reply #17 on: May 31, 2023, 06:33:48 pm »
I noted the exact version in the first post, and it was run in Release mode.
By "Initialize" I meant the assembly code preparing a clean value (it was similar to the assembly of the A := Default(TTest) line).

Oh, so FPC knows the value is not used, so it reuses the value. Smart. I wish this feature had a name so I could comment on it or make a feature request about it, as the smarter way would be for it to look "outside of the loop" too. I reported such an issue before at https://gitlab.com/freepascal.org/fpc/source/-/issues/39725. Maybe they are the same thing. What do you think?
What is your version?



Martin_fr

« Reply #18 on: May 31, 2023, 06:51:41 pm »
Quote
I noted the exact version in the first post, and it was run in Release mode.

My bad, sorry.

Quote
Oh, so FPC knows the value is not used, so it reuses the value. Smart. I wish this feature had a name so I could comment on it or make a feature request about it, as the smarter way would be for it to look "outside of the loop" too. I reported such an issue before at https://gitlab.com/freepascal.org/fpc/source/-/issues/39725. Maybe they are the same thing. What do you think?
What is your version?

There probably is a name... "variable lifetime analysis", but there is some other term, I just can't recall. Some term describing the way code (and the data belonging to it) is analysed to be able to "rewrite" the order of operation for optimization.... Something that afaik is "work in progress" in FPC. (and has been for a long time).  But I may be wrong. I am not part of the fpc team.



Quote
Maybe they are the same thing.

Very hard question to answer. It may even depend on how broadly you define "the same".
E.g. "Yes, they are both optimizations", but that is obviously not what you meant.

But I don't think they overlap completely. Both need information about the "usage" of a variable. But apart from that, the inlining seems different.

I don't know, but I think I heard that FPC generates some form of pre-compiled code for "inlining".  (You can inline, even if you have only a ppu, and no source code)
=> That would mean this code wants a pointer for a "var param", and then the variable can't be kept in a register.

To actually use a register (the whole time, including in the inlined code), the compiler would have to rewrite the code for the function at the time when it is inlining it. And that is totally different from just determining, how a variable is used during its lifetime.

Okoba

« Reply #19 on: June 01, 2023, 11:13:49 am »
Thank you very much for the notes.
I added an issue and a comment to ask for FPC team opinion: https://gitlab.com/freepascal.org/fpc/source/-/issues/40303

Martin_fr

« Reply #20 on: June 01, 2023, 12:37:23 pm »
Well, your real issue is obviously that you are using that code, and that you believe it slows down your app. Question is still, how much it really affects your final app. (see last part of this post)

In any case, in your app, you will need the result to be used, therefore you will not have an otherwise unused local var as target (and non local vars are afaik always treated like being used / but there is a whole lot more to it).

So in real life, that final assignment (temp to your var) has to happen, and there is no easy way around it. And since it is a managed var, it takes extra time. (Managed vars need ref count checking, and that is done in a thread-safe way, to an extent.)

Even if the code is inlined, those temp var effects may still happen (and probably will). And if you are using a non-local var (class field, global), then they may not even be optimizable, because that would change the behaviour of the app (maybe not of yours, but in general, and that is outside the scope of what a compiler is able to know).




If you must have a managed var, then
Code: Pascal
  // a, b: array of Integer;
  b := a;                    // b shares a's data, only the ref count changes
  SetLength(b, Length(b));   // forces b to get its own unique copy
may be more efficient.
But you can't move it into a subroutine, because then you will have temp vars again.

Test case:
Code: Pascal
  program Project1;
  var
    a, b: array of Integer;
  begin
    a := nil;
    SetLength(a, 1000);
    b := a;                        // b shares a's data
    WriteLn(PtrUInt(Pointer(a)));
    WriteLn(PtrUInt(Pointer(b)));  // pointing to the same address

    SetLength(b, Length(b));       // forces b to get its own copy
    WriteLn(PtrUInt(Pointer(a)));
    WriteLn(PtrUInt(Pointer(b)));  // pointing to a different address
    ReadLn;
  end.

Well, maybe you can use a procedure, if you pass the destination as a "var param":
Code: Pascal
  procedure CopyData(const Source: TFoo; var Dest: TFoo);
Because then you don't get any ref count interference when calling the procedure. You do get a pointer, but that should have less impact. And you lose having the data in a register (if it ever was), which again can have an impact.
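
A minimal sketch of such a procedure could look like the following. TFoo is not defined in this thread, so the record with a single dynamic array field is an assumption for illustration, and the body simply reuses the assign-then-SetLength idiom shown earlier.

```pascal
program CopyDataSketch;
{$mode objfpc}

type
  // Hypothetical TFoo: the thread does not show its definition.
  TFoo = record
    Data: array of Integer;
  end;

// The destination is passed by reference, so no function result temp is needed.
procedure CopyData(const Source: TFoo; var Dest: TFoo);
begin
  Dest.Data := Source.Data;                 // share the data (ref count only)
  SetLength(Dest.Data, Length(Dest.Data));  // force a unique copy for Dest
end;

var
  A, B: TFoo;
begin
  SetLength(A.Data, 3);
  A.Data[0] := 1;

  CopyData(A, B);
  B.Data[0] := 99;  // must not affect A

  WriteLn(A.Data[0]);
  WriteLn(B.Data[0]);
end.
```

If the copy is really independent, changing B.Data[0] leaves A.Data[0] untouched.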

Though not sure if "SetLength" is register friendly. Or for that matter, any of the internally called function when managed types are involved.




The only faster option, then, would likely be to use pointers instead of managed types. (I assume you don't use threads.)
Code: Pascal
  var AData: ^Integer;
And use GetMem or AllocMem. Copy the data yourself.
Use pointer math ({$POINTERMATH ON}) to access the data as "AData[n]" (like an array).
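
As a rough illustration of the pointer approach, the sketch below allocates two raw buffers, copies one to the other with Move, and indexes them like arrays. The names PIntegerData, Src, Dst and Count are made up for this example, and no ref counting or automatic cleanup happens here, so every GetMem needs a matching FreeMem.

```pascal
program PointerCopySketch;
{$mode objfpc}
{$POINTERMATH ON}  // allow indexing a typed pointer like an array

type
  PIntegerData = ^Integer;  // hypothetical name for the raw data pointer

const
  Count = 1000;

var
  Src, Dst: PIntegerData;
  I: Integer;
begin
  // Unmanaged memory: nothing is initialized or ref counted.
  GetMem(Src, Count * SizeOf(Integer));
  GetMem(Dst, Count * SizeOf(Integer));

  for I := 0 to Count - 1 do
    Src[I] := I;

  // Copy the data yourself; Move takes a size in bytes.
  Move(Src^, Dst^, Count * SizeOf(Integer));

  WriteLn(Dst[Count - 1]);

  FreeMem(Dst);
  FreeMem(Src);
end.
```

The memory copy itself still costs time; what goes away is the ref count handling and the temp variable.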





But consider first, if it is really worth it. How much time are you going to save in your entire app?

Is your app currently doing a loop with 1,000,000 or more such assignments? Is it stuck for something like half a minute on that loop?
Even then, you still need to copy the memory, so it will still take time.

If your app is not stuck for that long on that loop, then the saving will likely not make a major difference to the overall speed of your app.



If you are about to optimize your app, then I recommend "valgrind --tool=callgrind" and kcachegrind.
Though that requires you to be able to run the app on Linux (a VM will do).

You get a nice chart showing where the time was spent, and how much of it.


Okoba

« Reply #21 on: June 01, 2023, 01:03:47 pm »
Very helpful, as always.
The Assign procedure is like what you said for CopyData. I wanted to use the operator to make life easier.
I should check using pointers too, and I sometimes do. My main concern was pointing out a potential issue and learning "why this weird thing happens", and you made it much more clear for me.

About optimization: this code is just a part of my app, and sure, if I optimize it, it will not even speed up my program by 10%, but it will speed up that task by an order of magnitude, so the user spends less time waiting on the progress display. The problem is that I need to benchmark each part of my project to find any mistakes, like cleaning up every part of an engine before assembling it. And that code was only a sample. I wanted to know why Default() affects my code, why an operator is making things slower, and whether I should avoid using them. But to answer your question: in my program, it is a loop going through records in the database.

Very interesting point about valgrind. So you use Linux or a VM, and configure valgrind with Lazarus to have a better look? I like this idea.

Martin_fr

« Reply #22 on: June 01, 2023, 01:18:20 pm »
As you may have learned from the video: An unrelated change => such as changing the order in which 2 procedures are declared, may add a 2 digit percentage speed change. (up or down).

In your case, it wasn't some side effect on the CPU, but just that some simplified samples can be optimized better.

I do recommend valgrind. Really. It is worth it. (You get the accumulated time of all calls to a method).


One other maybe interesting thing. Though I have last tested that years ago.

Because smaller code snippets can sometimes be better optimized (including register allocation, at least back then), I have seen that splitting a method into 2 procedures can (or "was once able to") gain speed.

This was in 2008 / So a very old compiler now. / May be different today
https://gitlab.com/freepascal.org/fpc/source/-/issues/10275#note_644215372

Okoba

« Reply #23 on: June 01, 2023, 01:39:38 pm »
You are right about changing the order. In such low-level code, any minor change may end up opening a new possibility, as these functions would be called billions of times.
On your note, yes, of course, splitting different bodies of code is a very good habit. It gives the compiler more registers to work with, in addition to giving you more maintainable code.
Regarding that note, I think it is an old sample. In my current working habits, I would split those functions, but not as subfunctions (FPC has a problem optimizing those), and I would inline them to let FPC know that they are using the same variables. Generally, this leads to more speed, as I avoid the cost of a function call. But if those functions were totally different, with different variables, they should not be inlined, as that would limit the compiler's available register count. All these thoughts are my feelings about how the compiler works, not actual facts, as I know nothing about its internals.
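
For illustration only, the habit described above (small top-level helpers marked inline rather than nested subfunctions) could be sketched like this. Scale and Offset are made-up helpers, and whether inlining them actually helps is something to measure, not something this sketch proves.

```pascal
program InlineSplitSketch;
{$mode objfpc}
{$inline on}  // let the compiler honour the inline directive

// Two small top-level helpers instead of one larger loop body.
function Scale(V: Integer): Integer; inline;
begin
  Result := V * 3;
end;

function Offset(V: Integer): Integer; inline;
begin
  Result := V + 7;
end;

var
  I: Integer;
  Sum: Int64;
begin
  Sum := 0;
  for I := 1 to 1000 do
    Sum := Sum + Offset(Scale(I));  // the calls may be expanded in place
  WriteLn(Sum);
end.
```

The inline directive is only a request; the compiler may still decide to emit a normal call.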

I will make a VM and test the valgrind way.

 
