Lazarus

Programming => General => Topic started by: Paolo on March 08, 2025, 04:30:22 pm

Title: nested for loop efficiency
Post by: Paolo on March 08, 2025, 04:30:22 pm
Hello,

For cycle of such
Code: Pascal  [Select][+][-]
  1.  
  2. For i:=F1 to F2 do begin
  3. .
  4. End;
  5.  

I know that F1 and F2 are evaluated one time, before foor loop started.

But I want to know about code like below if it is the same or not for F3, F4, F5 and F6 (where returned values of functions “Fn” are large).

Code: Pascal  [Select][+][-]
  1. For i:=F1 to F2 do begin
  2.   For j:=F3 to F4 do begin
  3.     For k:=F5 to F6 do begin
  4.     …
  5.     End;
  6.   End;
  7. End;
  8.  

Is the compiler optimising such evaluation in case of F3 and F4 are not function of “i” ? and so for F5 and F6 are not function of “i” or “j” ?

As another example:

Code: Pascal  [Select][+][-]
  1. For i:=0 to List1.Count-1 do begin
  2.   For j:=0 to List2.Count-1 do begin
  3.     For k:=0 to List3.Count-1 do begin
  4.     …
  5.     End;
  6.   End;
  7. End;
  8.  

Is the call to List3.Count executed List1.Count*List2.Count times ? if yes it should be reccomdable to write

Code: Pascal  [Select][+][-]
  1. ...
  2. cj:= List2.Count-1;
  3. ck:= List3.Count-1;
  4. For i:=0 to List1.Count-1 do begin
  5.   For j:=0 to cj do begin
  6.     For k:=0 to ck do begin
  7.     …
  8.     End;
  9.   End;
  10. End;
  11.  

thanks.
Title: Re: nested for loop efficiency
Post by: alpine on March 08, 2025, 05:15:47 pm
Is the compiler optimising such evaluation in case of F3 and F4 are not function of “i” ? and so for F5 and F6 are not function of “i” or “j” ?
Аnd how is the compiler supposed to know this?

Is the call to List3.Count executed List1.Count*List2.Count times ?
I would guess that the answer is yes.

Code: Pascal  [Select][+][-]
  1. ...
  2. cj:= List2.Count-1;
  3. ck:= List3.Count-1;
  4. For i:=0 to List1.Count-1 do begin
  5.   For j:=0 to cj do begin
  6.     For k:=0 to ck do begin
  7.     …
  8.     End;
  9.   End;
  10. End;
  11.  

thanks.

at least in the latter code you don't have to assume anything.
Title: Re: nested for loop efficiency
Post by: PascalDragon on March 08, 2025, 05:19:18 pm
In general the compiler will evaluate the expression the nested loops each time the loops are entered. So your last suggestion would be the best assuming the loops don't change the amount of elements in the list.

However depending on the optimization level (like enabling both DFA and Constant Propagation) the compiler might be able to optimize this by itself depending on the expressions involved (e.g. simple expressions without side effects like function/method calls or if the compiler is able to inline the function/method in question) though this is still subject to improvement...

Is the compiler optimising such evaluation in case of F3 and F4 are not function of “i” ? and so for F5 and F6 are not function of “i” or “j” ?
Аnd how is the compiler supposed to know this?

Using Data Flow Analysis it indeed can determine this. However it's not enabled by default, because its a rather expensive operation.
Title: Re: nested for loop efficiency
Post by: Paolo on March 08, 2025, 05:33:27 pm
thanks,

and what is the -Olevel that implement that functionality ?
Title: Re: nested for loop efficiency
Post by: PascalDragon on March 08, 2025, 05:43:35 pm
It's -O3. You can also enable them seperately using -Oodfa and -Ooconstprop. Though as said it will highly depend on the code in question and the compiler version.
Title: Re: nested for loop efficiency
Post by: Paolo on March 08, 2025, 06:12:27 pm
@Alpine, my question was for cases like

Code: Pascal  [Select][+][-]
  1. For i:=0 to List1.Count-1 do begin
  2.   For j:=0 to List2.Count-1 do begin
  3.     For k:=0 to List3.Count-1 do begin
  4.     …
  5.     End;
  6.   End;
  7. End;
  8.  

where maybe (just to simplify)

Code: Pascal  [Select][+][-]
  1.   property Count: integer read GetCount...
  2. ...
  3.   function GetCount: integer;
  4.   begin
  5.     Result:=FCount;
  6.   end;
  7.  

and maybe the compiler can resolve the nested "for" as constant "FCount" (avoiding huge calls of Count function), but leaving in the code the for loop written in more readable form.

or even a simpler case (where in the code vars are not touched in the inner loops)

Code: Pascal  [Select][+][-]
  1. varI, varJ, varK : integer;
  2. ...
  3. For i:=0 to varI-1 do begin
  4.   For j:=0 to varJ-1 do begin
  5.     For k:=0 to varK-1 do begin
  6.     …
  7.     End;
  8.   End;
  9. End;
  10.  
Title: Re: nested for loop efficiency
Post by: BrunoK on March 08, 2025, 06:34:15 pm
Will it be noticably faster ? What happens inside the   
Code: Pascal  [Select][+][-]
  1. For k:=0 to List3.Count-1 do begin
  2.     …
  3. End;
If you go for resolving the List.Count, at least presolve the varK-1 before its use.

Title: Re: nested for loop efficiency
Post by: alpine on March 08, 2025, 06:56:42 pm
@Paolo
Yes, I can guess what you're aiming for.

IMHO assembly level DFA is what you can count on, the HLL AST is much more complicated for practical analysis unless the language is adapted for that. I would expect FPC DFA to be effective in some locality, say the current scope or expression. The property accessors are a bit far away...

Anyway, why not just write it unambiguously?
Title: Re: nested for loop efficiency
Post by: Zvoni on April 01, 2025, 09:34:54 am
Resurrecting this:
If the bounderies of such nested loops are fixed (as discussed above), there might be an alternative to resolve the nested loops into a single one, and calculate the "indices"

sample for a 3D-Array (or 3 nested loops as above)
Code: Pascal  [Select][+][-]
  1. program Project1;
  2. Var
  3.   i,c,j,k,m:Integer;
  4.   L1,L2,L3:Integer;
  5. begin
  6.   L1:=3;           //1st dimension
  7.   L2:=5;           //2nd dimension
  8.   L3:=8;           //3rd dimension
  9.   c:=L1*L2*L3;     //total count of iterations
  10.   For i:=0 To c-1 Do
  11.     Begin
  12.       j:=i div (L2*L3);          //Index of Dim1
  13.       k:=(i div L3) mod L2;      //Index of Dim2
  14.       m:=i mod L3;               //Index of Dim3
  15.       Writeln('j=',j,' - k=',k,' - m=',m);
  16.     end;
  17.   Readln;
  18. end.
No idea about performance impact
Title: Re: nested for loop efficiency
Post by: Thaddy on April 01, 2025, 10:23:56 am
I grew up with constant propagation by hand and that is sometimes more efficient than the current compiler can do. It is also a good exercise to see if you really understand your own code  :P ::)

To explain this in a simple way:
If a variable evaluation in an inner loop can be shown to remain a constant value inside that loop it is more efficient to bring that value to the next outer loop and evaluate it there as a constant value, do that for every nested loop from last inner to last outer.

The current compiler does a good job, but can err on complex functions that determine the loop value. That's where DFA comes in, but in my experience it is not yet mature in all cases. Hence doing it by hand.
TinyPortal © 2005-2018