
Author Topic: Procedure optimization problem with local variable  (Read 4640 times)

Okoba

  • Hero Member
  • *****
  • Posts: 533
Procedure optimization problem with local variable
« on: June 28, 2020, 12:16:47 am »
Can anyone point me to why these two loops have different times? They should be the same. Is there a missed optimization?
Delphi behaves the same way, but C does not.

Code: Pascal
program Project1;

uses
  SysUtils;

type
  TTest = record
    P: int64;
  end;

procedure Test;
var
  V: TTest;
  P: int64;
  T: UInt64;
  i, C: integer;
begin
  C := 1000 * 1000 * 1000;

  // Loop 1: increment a plain local variable
  T := GetTickCount64;
  P := 1;
  for i := 1 to C do
    Inc(P);
  WriteLn(GetTickCount64 - T); // 250

  // Loop 2: increment the record field directly
  T := GetTickCount64;
  V.P := 1;
  for i := 1 to C do
    Inc(V.P);
  WriteLn(GetTickCount64 - T);

  // Loop 3: copy the field into the local variable, then increment that
  T := GetTickCount64;
  V.P := 1;
  P := V.P;
  for i := 1 to C do
    Inc(P);
  P := V.P;
  WriteLn(GetTickCount64 - T); // 1400
end;

begin
  Test;
  ReadLn;
end.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9869
  • Debugger - SynEdit - and more
    • wiki
Re: Procedure optimization problem with local variable
« Reply #1 on: June 28, 2020, 12:53:05 am »
Assuming -O4?
And assuming the bigger time belongs to the loop in the middle, not the last one?

Use -al to see the assembler.

FPC (at least 3.0.4) optimizes the first loop by using a register for "P".

But it does not optimize the 2nd loop. I guess it's because it's a record: V.P remains in memory, so it is slower.
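
To make the difference easier to spot in the assembler listing, the two loops can be split into separate routines. This is only a minimal sketch, not code from the thread: the names `LocalLoop` and `RecordLoop` are hypothetical, and the compile command in the comment assumes the standard `fpc` command-line driver. Whether the split routines get exactly the same register allocation as the original `Test` procedure is not guaranteed.

```pascal
program AsmInspect;

{$mode objfpc}

// Assumed invocation (standard fpc driver):
//   fpc -O4 -al asminspect.pas
// This should leave an assembler listing (asminspect.s) next to the source,
// annotated with the Pascal source lines.

type
  TTest = record
    P: Int64;
  end;

// Hypothetical helper: the counter is a plain local variable.
function LocalLoop(C: Integer): Int64;
var
  P: Int64;
  i: Integer;
begin
  P := 1;
  for i := 1 to C do
    Inc(P);
  Result := P;
end;

// Hypothetical helper: the counter is a field of a local record.
function RecordLoop(C: Integer): Int64;
var
  V: TTest;
  i: Integer;
begin
  V.P := 1;
  for i := 1 to C do
    Inc(V.P);
  Result := V.P;
end;

begin
  WriteLn(LocalLoop(1000));
  WriteLn(RecordLoop(1000));
end.
```

In the listing, compare the loop bodies of the two functions: an increment of a register in one and an increment of a stack location in the other would match the behaviour described above.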

440bx

  • Hero Member
  • *****
  • Posts: 4030
Re: Procedure optimization problem with local variable
« Reply #2 on: June 28, 2020, 01:03:25 am »
I was going to post exactly what Martin_fr said above including his corrections about the loop timings you presented.


(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

Josh

  • Hero Member
  • *****
  • Posts: 1274
Re: Procedure optimization problem with local variable
« Reply #3 on: June 28, 2020, 01:03:47 am »
What is odd, though:
tested on the latest trunk64 on Windows, with the 32-bit cross compiler.

If you compile for 32-bit the values are very close,
but compiling for win64 the values are way off.



The best way to get accurate information on the forum is to post something wrong and wait for corrections.

440bx

  • Hero Member
  • *****
  • Posts: 4030
Re: Procedure optimization problem with local variable
« Reply #4 on: June 28, 2020, 01:07:21 am »
Quote
if you compile for 32-bit the values are very close,
but compiling for win64 the values are way off.
The reason is that in 64-bit the Int64 type fits in a register, so the variable can be placed in a register, whereas in 32-bit it cannot (too big). In 32-bit the compiler is therefore somewhat forced into a more "pedestrian" way of incrementing the variable.

IOW, in 64-bit there is a big difference between incrementing a register and incrementing the value at a memory location. In 32-bit, it is always incrementing the value at a memory location, which keeps the measurements within the margin of error.
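
A rough way to check the "fits in a register" argument for a given target is to compare the size of Int64 with the size of a pointer. This sketch assumes that the pointer size equals the width of a general-purpose register, which holds for the usual i386 and x86_64 targets but is not a general rule; the program name is made up for illustration.

```pascal
program RegisterWidth;

{$mode objfpc}

begin
  // Assumption: SizeOf(Pointer) matches the general-purpose register width
  // of the target (true for typical i386 and x86_64 builds).
  WriteLn('SizeOf(Int64)   = ', SizeOf(Int64), ' bytes');
  WriteLn('SizeOf(Pointer) = ', SizeOf(Pointer), ' bytes');

  if SizeOf(Int64) <= SizeOf(Pointer) then
    WriteLn('Int64 fits in a single register on this target')
  else
    WriteLn('Int64 does not fit in a single register on this target');
end.
```

Compiling the same source once for win32 and once for win64 should print a different last line for each target.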



Okoba

  • Hero Member
  • *****
  • Posts: 533
Re: Procedure optimization problem with local variable
« Reply #5 on: June 28, 2020, 01:20:24 am »
@Martin_fr and @440bx Yes, the bigger time is for the record, and yes, this is with -O4, 64-bit and FPC trunk.
So can it be an optimization like it is with 32-bit or in the C compiler (Clang)?

@josh interesting point, thanks for the input.

Updated code:
Code: Pascal
program Project1;

uses
  SysUtils;

type
  TTest = record
    P: int64;
  end;

procedure Test;
var
  V: TTest;
  P: int64;
  T: UInt64;
  i, C: integer;
begin
  C := 1000 * 1000 * 1000;

  // Loop 1: increment a plain local variable
  T := GetTickCount64;
  P := 1;
  for i := 1 to C do
    Inc(P);
  WriteLn(GetTickCount64 - T); // 266

  // Loop 2: increment the record field directly
  T := GetTickCount64;
  V.P := 1;
  for i := 1 to C do
    Inc(V.P);
  WriteLn(GetTickCount64 - T); // 1400

  // Loop 3: copy the field into the local variable, then increment that
  T := GetTickCount64;
  V.P := 1;
  P := V.P;
  for i := 1 to C do
    Inc(P);
  P := V.P;
  WriteLn(GetTickCount64 - T); // 250
end;

begin
  Test;
  ReadLn;
end.

Okoba

  • Hero Member
  • *****
  • Posts: 533
Re: Procedure optimization problem with local variable
« Reply #6 on: June 28, 2020, 03:01:59 am »
@Martin_fr I tried to export the assembly but couldn't find a solution.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9869
  • Debugger - SynEdit - and more
    • wiki
Re: Procedure optimization problem with local variable
« Reply #7 on: June 28, 2020, 04:05:16 am »
Quote
@Martin_fr I tried to export the assembly but couldn't find a solution.

The asm is only there to show what happens.

AFAIK there is no solution, other than hoping that FPC 3.4 (in 2 or 3 years?) will support this, or using trunk if and when this gets implemented.
From what I know (I am not on the FPC core team, so this is second-hand knowledge), the fact that there is a record blocks the register optimizer, even though the value would fit in a register in 64-bit.

NOT tested, but an idea that you could try:
Code: Pascal
var
  Accessor: Int64 absolute V.P;

May need some syntax fixes...
Maybe that way the compiler can ignore the record and use a register... Maybe.


Btw:
Quote
So it can be an optimization like it is with 32bit
The 32 bit compilation did not optimize any loop (from what I read).


Okoba

  • Hero Member
  • *****
  • Posts: 533
Re: Procedure optimization problem with local variable
« Reply #8 on: June 28, 2020, 04:32:42 am »
I tested the absolute approach and it did not help. Also, I am using trunk, so is this already a reported bug, or should I report it? How can I find out?

I tried changing the asm code to get a better result, but I would like more opinions on this, so that I can work around such problems with asm for now, until FPC supports this optimization.

440bx

  • Hero Member
  • *****
  • Posts: 4030
Re: Procedure optimization problem with local variable
« Reply #9 on: June 28, 2020, 04:46:36 am »
Quote
I tried changing the asm code to get a better result, but I would like more opinions on this, so that I can work around such problems with asm for now, until FPC supports this optimization.

You already found a reasonably good solution, which is the last test case in your test program. Just assign the record field to a variable, use that variable (which the compiler will place in a register) and, when the loop is done, move the value of the variable back into the record's field. It's a bit "pedestrian", but I believe it is preferable to using assembler.

Placing a comment stating the reason for the "acrobatics" with the record's field might be a good idea if someone other than yourself may need to maintain that code.
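
The workaround described above could look roughly like the sketch below, including the kind of comment suggested for maintainers. It is an illustration only: the variable name `Tmp` and the program name are made up, and the final assignment deliberately goes from the temporary back into the field (`V.P := Tmp`), which is the direction the description calls for.

```pascal
program WriteBackSketch;

{$mode objfpc}

uses
  SysUtils;

type
  TTest = record
    P: Int64;
  end;

procedure Test;
var
  V: TTest;
  Tmp: Int64; // hypothetical name for the temporary copy of V.P
  T: QWord;
  i, C: Integer;
begin
  C := 1000 * 1000 * 1000;

  T := GetTickCount64;
  V.P := 1;

  // Workaround: the compiler keeps record fields in memory, so the hot loop
  // works on a local copy (which can live in a register) and the result is
  // written back to the field afterwards.
  Tmp := V.P;
  for i := 1 to C do
    Inc(Tmp);
  V.P := Tmp;

  WriteLn(GetTickCount64 - T, ' ms, V.P = ', V.P);
end;

begin
  Test;
end.
```

The timing will vary by machine and compiler version; the point of the sketch is only the copy-in, loop, copy-out shape.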

HTH.

ASerge

  • Hero Member
  • *****
  • Posts: 2242
Re: Procedure optimization problem with local variable
« Reply #10 on: June 28, 2020, 05:51:02 am »
And you need to optimize small procedures. For example, let's add this procedure to our sample:
Code: Pascal
procedure Dummy(var Value: Int64); inline;
begin
end;
It's even inline! However, if you add a Dummy(P) to the end of the Test procedure, the compiler will stop putting the variable P in the register, even in version 3.3.1 with the optimization level -O4.

Okoba

  • Hero Member
  • *****
  • Posts: 533
Re: Procedure optimization problem with local variable
« Reply #11 on: June 28, 2020, 12:10:47 pm »
@440bx thanks for the suggestion, although I would like to find a cleaner way, as this loop will be used a lot.
@ASerge I'm afraid I didn't understand exactly your point.

ASerge

  • Hero Member
  • *****
  • Posts: 2242
Re: Procedure optimization problem with local variable
« Reply #12 on: June 28, 2020, 04:21:26 pm »
Quote
@ASerge I'm afraid I didn't understand exactly your point.

I took your last example and got these results:
Quote
343
2496
327
Then I added a call to the procedure I mentioned at the end of Test and got these results:
Quote
2496
2480
2325

PascalDragon

  • Hero Member
  • *****
  • Posts: 5481
  • Compiler Developer
Re: Procedure optimization problem with local variable
« Reply #13 on: June 28, 2020, 04:22:50 pm »
Quote
And you need to optimize small procedures. For example, let's add this procedure to our sample:
Code: Pascal
procedure Dummy(var Value: Int64); inline;
begin
end;
It's even inline! However, if you add a Dummy(P) to the end of the Test procedure, the compiler will stop putting the variable P in the register, even in version 3.3.1 with the optimization level -O4.

For passing the Value parameter the compiler needs a memory location. Apparently it doesn't correctly recognize that it doesn't need to handle it as a memory value. Would you please report this as a bug with a self-contained example?
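
A self-contained comparison along those lines might look like the following sketch. It is an assumption about how such an example could be laid out, not the code that was actually measured in this thread: the procedure names `WithoutCall` and `WithCall` are hypothetical, and the only difference between them is the trailing `Dummy(P)` call.

```pascal
program VarParamSketch;

{$mode objfpc}

uses
  SysUtils;

// Empty procedure taken from the thread; the var parameter is the point.
procedure Dummy(var Value: Int64); inline;
begin
end;

// Hypothetical baseline: P is never passed by reference.
procedure WithoutCall;
var
  P: Int64;
  T: QWord;
  i, C: Integer;
begin
  C := 1000 * 1000 * 1000;
  T := GetTickCount64;
  P := 1;
  for i := 1 to C do
    Inc(P);
  WriteLn('without Dummy(P): ', GetTickCount64 - T, ' ms');
end;

// Hypothetical variant: same loop, but P is passed as a var parameter
// afterwards, which is what reportedly stops it from being kept in a register.
procedure WithCall;
var
  P: Int64;
  T: QWord;
  i, C: Integer;
begin
  C := 1000 * 1000 * 1000;
  T := GetTickCount64;
  P := 1;
  for i := 1 to C do
    Inc(P);
  WriteLn('with Dummy(P):    ', GetTickCount64 - T, ' ms');
  Dummy(P);
end;

begin
  WithoutCall;
  WithCall;
end.
```

If the behaviour reported above applies to the compiler version in use, the second timing should be noticeably larger than the first when built with -O4 for a 64-bit target.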

ASerge

  • Hero Member
  • *****
  • Posts: 2242
Re: Procedure optimization problem with local variable
« Reply #14 on: June 28, 2020, 04:26:16 pm »
Quote
Would you please report this as a bug with a self-contained example?

In my opinion, this is not a bug. You can't force the compiler to optimize everywhere and always; something must remain for the developer.

 
