Author Topic: Optimization of nested procedures and functions  (Read 488 times)

lagprogramming

  • Sr. Member
  • ****
  • Posts: 404
Optimization of nested procedures and functions
« on: July 16, 2022, 12:17:20 pm »
According to the documentation:
Quote
When a routine is declared within the scope of a procedure or function, it is said to be nested. In this case, an additional invisible parameter is passed to the nested routine. This additional parameter is the frame pointer address of the parent routine. This permits the nested routine to access the local variables and parameters of the calling routine.

When the nested routine doesn't access any parameter or local variable of the calling routine, would there be a problem if the generated code were the same as for a routine that is not nested? The frame pointer parameter occupies space with no benefit, and it looks like it might block a useful CPU register (look at variable "f" in the following nested_yes and nested_no functions).
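
To make the two cases concrete, here is a minimal sketch. The names Parent, UsesParent, Standalone and scale are hypothetical and exist only for illustration. UsesParent reads a local of its parent, so it needs the hidden parent frame pointer described in the quote. Standalone touches nothing outside its own parameter, which is the case the question is about.

```pascal
program NestedDemo;
{$mode objfpc}

// Hypothetical parent routine with one local and two nested helpers.
function Parent(x: sizeint): sizeint;
var
  scale: sizeint;  // local variable of the parent

  // Reads the parent's local "scale", so it relies on the hidden
  // parent frame pointer to locate it.
  function UsesParent(v: sizeint): sizeint;
  begin
    result := v * scale;
  end;

  // Uses only its own parameter; nothing from the parent is accessed.
  function Standalone(v: sizeint): sizeint;
  begin
    result := v + 1;
  end;

begin
  scale := 3;
  result := UsesParent(x) + Standalone(x);
end;

begin
  writeln(Parent(2));  // 2*3 + (2+1) = 9
end.
```

Only UsesParent depends on the parent's stack frame, so in principle Standalone could be compiled like an ordinary top-level function.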

Here is an example where function1 and function2 do the same thing except for the fact that one of them calls a nested routine and the other one doesn't:

Code: Pascal
program Project1;

function nested_no(a,b,c,d,e,f:sizeint):sizeint;
begin
  if a+b+c+d+e+f>0 then result:=-1 else result:=0;
end;

function function1(a,b,c,d,e,f:sizeint):sizeint;
begin
  result:=nested_no(a,b,a,b,a,b);
end;

function function2(a,b,c,d,e,f:sizeint):sizeint;
  function nested_yes(a,b,c,d,e,f:sizeint):sizeint;
  begin
    if a+b+c+d+e+f>0 then result:=-1 else result:=0;
  end;
begin
  result:=nested_yes(a,b,a,b,a,b);
end;

begin
  writeln(function1(round(random),round(random),round(random),round(random),round(random),round(random)));
  writeln(function2(round(random),round(random),round(random),round(random),round(random),round(random)));
end.


I attach the code FPC produced (look at variable "f" in the nested_yes and nested_no functions).

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11351
  • FPC developer.
Re: Optimization of nested procedures and functions
« Reply #1 on: July 16, 2022, 12:56:35 pm »
If you mark nested_no as "inline", the assembler output for function1 looks like this:

Code:
pushl %ebp
.Lc9:
.Lc10:
# Var a located in register eax
# Var b located in register edx
# Var c located in register ecx
# Var d located in stack [ebp+16]
# Var e located in stack [ebp+12]
# Var f located in stack [ebp+8]
.Ll6:
# [10] result:=nested_no(a,b,a,b,a,b);
leal (%eax,%edx),%ecx
addl %eax,%ecx
addl %edx,%ecx
addl %ecx,%eax
addl %edx,%eax
testl %eax,%eax
jng .Lj11
movl $-1,%eax
jmp .Lj12
.p2align 4,,10
.p2align 3
.Lj11:
xorl %eax,%eax
.Lj12:
# Var $result located in register eax
.Lc11:
.Lc12:
.Ll7:
# [11] end;
popl %ebp
ret $12


And for function2:
.Ll10:
# [19] result:=nested_yes(a,b,a,b,a,b);
leal (%eax,%edx),%ecx
addl %eax,%ecx
addl %edx,%ecx
addl %ecx,%eax
addl %edx,%eax
testl %eax,%eax
jng .Lj18
movl $-1,%eax
jmp .Lj19
.p2align 4,,10
.p2align 3
.Lj18:
xorl %eax,%eax



Neither listing contains a call to nested_no or nested_yes any more. So mark the nested functions you want inlined with the inline directive, like this:

Code: Pascal
function nested_no(a,b,c,d,e,f:sizeint):sizeint; inline;

// and

function nested_yes(a,b,c,d,e,f:sizeint):sizeint; inline;

Note that inlining very large functions does not always produce a better result. But

PascalDragon

  • Hero Member
  • *****
  • Posts: 5444
  • Compiler Developer
Re: Optimization of nested procedures and functions
« Reply #2 on: July 16, 2022, 04:34:09 pm »
Quote
When the nested routine doesn't access any parameter or local variable of the calling routine, would there be a problem if the generated code were the same as for a routine that is not nested? The frame pointer parameter occupies space with no benefit, and it looks like it might block a useful CPU register (look at variable "f" in the following nested_yes and nested_no functions).

FPC main contains the UnusedPara optimization, which removes the parent frame pointer parameter in your example.

 
