Recent

Author Topic: TTree depth-first traversal  (Read 2087 times)

440bx

  • Hero Member
  • *****
  • Posts: 1129
Re: TTree depth-first traversal
« Reply #15 on: September 22, 2019, 08:40:33 pm »
I have not revealed an absolute truth, but a typical situation:
I'm not sure if your post was a result of what I posted but, I agree.

in general recursive alghorithms are slightly slower than iterative ones, because they require as many 'calls with return' as the number of recursive cycles.
That's when one has to be very careful.  Some solutions can be implemented efficiently and elegantly using a recursive algorithm and, when such a recursive algorithm is restructured to avoid recursion, the implementation, as you pointed out in a recent post, tends to be more complicated.

Restructuring a simple, efficient and elegant recursive algorithm to eliminate recursion usually requires implementing a stack.  In such cases, all that is saved is the push of the return address needed in the recursive algorithm.  There is obviously a cost in pushing a return address to a memory location (the stack in this case) but, that operation isn't really very expensive. 

These types of operations are very expensive for the CPU.
There is obviously a cost but, pushing a return address on the stack isn't a very expensive operation but, in many cases, the added complexity resulting from converting a recursive algorithm into an iterative one can be considerable.  Personally, if the recursive algorithm performs well, I will forego the very small performance increase of an iterative algorithm because I don't think the small performance gain justifies the added complexity.

I mentioned authoritative sources in support of my thesis. If someone cites to me equally authoritative sources or shows me experimental evidence of the opposite sign, I am happy to change my mind, because I often use recursive algorithms. :D
Generally speaking, what you stated is true.  The keyword being "generally".

« Last Edit: September 22, 2019, 08:42:14 pm by 440bx »
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #16 on: September 22, 2019, 08:49:20 pm »
I wrote:

... I note only that it would have the two typical small disadvantages of recursive algorithms. First, it would be slightly slower. Second, since recursive calls allocate resources in the system stack, in the case of traversals of very large trees, a stack overflow could happen during execution.

It seems to me that you agree. 

avk

  • Full Member
  • ***
  • Posts: 129
    • my self-education project
Re: TTree depth-first traversal
« Reply #17 on: September 23, 2019, 03:34:58 am »
So, can anyone show an iterative DFS implementation(equivalent) that would be faster than the recursive version?

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #18 on: September 23, 2019, 07:50:29 am »
As soon as possible I will do some tests to confirm (or deny) what is reported in scientific literature and on many programming forums. In the meantime, can you refute this thesis with technical arguments or quote authoritative sources, as I did?

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #19 on: September 23, 2019, 02:31:04 pm »
I do a simple test. This is NOT a rigorous benchmark. I have adapted the code in order to generate a tree with a very simple topology (two levels) but with a big number of nodes N=100000000 nodes: 1 root and n-1 children. This topology should be the best case for recursion approach, because should reduce the nesting level of recursive calls (but this issue requires some further investigation). I wrote a Recursive DFT algorithm (DFS is a special case of DFT). Unlike the iterative case, the Node parameter is required for the procedure. At the first call this parameter must be the tree root, while in subsequent recursive calls acts as the root of the sub-trees. To avoid noise, the callback procedure does nothing.

Using Lazarus 2.0.4+FPC 3.0.4 (64bits) on a PC with a Intel Core i7, I have following results:

Recursive DFT
Start Timer
End Timer
Time Elapsed: 75345

Iterarive DFT
Start Timer
End Timer
Time Elapsed: 55202

However it's only a simple test. I need further confirmations.

This is the code:
Code: Pascal  [Select]
  1. program project1;
  2. {$mode objfpc}{$H+}
  3.  
  4. uses
  5.     gtree, gvector, sysutils;
  6.  
  7. type
  8.  
  9.     TExpression = class(TObject)
  10.     public
  11.         value: integer;
  12.         constructor Create(v: integer);
  13.     end;
  14.  
  15.     TExpressionNode = class(specialize TTreeNode<TExpression>)
  16.         destructor Destroy; override;
  17.     end;
  18.  
  19.     { TExpressionTree }
  20.  
  21.     TExpressionTree = class(specialize TTree<TExpression>)
  22.       procedure DepthFirstTraverseRecursive(Node : TTreeNodeType; Callback: TDepthFirstCallbackType);
  23.     end;
  24.  
  25. var
  26.     tree: TExpressionTree;
  27.     n,n1 : TExpressionNode;
  28.     e,e1 : TExpression;
  29.     T0,T1 : Comp;
  30.     ind : QWord;
  31.  
  32. procedure TimerOn;
  33. begin
  34.   Writeln('Start Timer');
  35.   T0:=TimestampToMSecs(DateTimeToTimestamp(Now));
  36. end;
  37.  
  38. procedure TimerOff;
  39. begin
  40.   T1:=TimestampToMSecs(DateTimeToTimestamp(Now));
  41.   Writeln('End Timer');
  42.   Writeln('Time Elapsed: '+IntToStr(QWord(T1-T0)));
  43. end;
  44.  
  45. { TExpressionTree }
  46.  
  47. procedure TExpressionTree.DepthFirstTraverseRecursive(Node : TTreeNodeType; Callback: TDepthFirstCallbackType);
  48. var
  49.   Child: TTreeNodeType;
  50. begin
  51.   if Assigned(Node) then
  52.     begin
  53.       Callback(Node.Data);
  54.       for Child in Node.Children do
  55.         DepthFirstTraverseRecursive(Child,Callback);
  56.     end;
  57. end;
  58.  
  59.  
  60. constructor TExpression.Create(v: integer);
  61. begin
  62.     self.value := v;
  63. end;
  64.  
  65.  
  66. destructor TExpressionNode.Destroy;
  67. begin
  68.     self.Data.Free;
  69.     inherited;
  70. end;
  71.  
  72.  
  73. procedure WriteCallback(const e: TExpression);
  74. begin
  75.   //write(e.value, ' ');
  76. end;
  77.  
  78.  
  79. begin
  80.     Tree:=TExpressionTree.Create;
  81.  
  82.     e1:=TExpression.Create(0);
  83.     n1:=TExpressionNode.Create;
  84.     n1.Data := e1;
  85.     tree.Root := n1;
  86.     for ind:=1 to 100000000 do
  87.       begin
  88.         e:= TExpression.Create(ind);
  89.         n:= TExpressionNode.Create;
  90.         n.Data:=e;
  91.         n1.Children.PushBack(n);
  92.       end;
  93.  
  94.     writeln('Recursive DFT');
  95.     TimerOn;
  96.     Tree.DepthFirstTraverseRecursive(n1,@WriteCallback);
  97.     TimerOff;
  98.     writeln;
  99.  
  100.     writeln('Iterarive DFT');
  101.     TimerOn;
  102.     Tree.DepthFirstTraverse(@WriteCallback);
  103.     TimerOff;
  104.     writeln;
  105.  
  106.     tree.Free;
  107.  
  108.     readln;
  109.  
  110. end.
  111.  

avk

  • Full Member
  • ***
  • Posts: 129
    • my self-education project
Re: TTree depth-first traversal
« Reply #20 on: September 23, 2019, 02:42:30 pm »
My English is certainly terrible, but you should have noticed that I was interested specifically in the DFS implementation for the reason that I already explained. And I'm not going to refute CS stars at all and I trust their opinion, I am interested in the practical aspect: is it possible to somehow improve the situation I described above.
As for DepthFirstTraverse, I have already said that it is not equivalent to recursive DFS (for example, yours).

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #21 on: September 23, 2019, 03:48:09 pm »
Your English is certainly better than mine.  I didn't understand exactly what you need.  A search (for example using a DFS algorithm) on a data structure such as a list, a tree or a graph, unless It is ordered, requires a complete traversal of the structure (for example using a DFT algorithm).  In our example the library that implements the DFT allows the use of a callback procedure that can help to perform a search.

avk

  • Full Member
  • ***
  • Posts: 129
    • my self-education project
Re: TTree depth-first traversal
« Reply #22 on: September 23, 2019, 05:01:44 pm »
I'm trying to say that if we want to compare the performance of an iterative and recursive version of some algorithm, these versions should be equivalent, otherwise the comparison does not make sense. Of couse, DepthFirstTraverse does a complete traversal of the tree, without any doubt, but this is a different algorithm.

440bx

  • Hero Member
  • *****
  • Posts: 1129
Re: TTree depth-first traversal
« Reply #23 on: September 23, 2019, 05:03:31 pm »
is it possible to somehow improve the situation I described above.
The answer to that is _very likely_ to be, yes.  The performance gain in most cases will be minimal to negligible and, the code to implement an iterative solution will be noticeably more complex.  It's simply not worth it.

Just FYI, Sedgewick, in the second edition of his book "Algorithms" presents both, a recursive and an iterative implementation (Pascal pseudocode) of a DFS (neither is optimized.)  As expected, the iterative implementation is more complex than the recursive one and, as presented in the book, it's unclear which one would actually perform better.

I _believe_ that, with careful optimization of both algorithms, the iterative algorithm would be a smidgen faster (most likely not humanly perceptible in the great majority of cases.)  I believe that because, in a good iterative implementation, a push of the return address on the stack will no longer be necessary (in most cases, that isn't much of a performance gain.)


It seems to me that you agree. 
In general, yes, I agree with what you've stated.

However, there is one statement you've made that I disagree with, which is:
These types of operations are very expensive for the CPU.
There is obviously a cost associated with pushing "items" on a stack but, they really aren't "very expensive". 


using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

Thaddy

  • Hero Member
  • *****
  • Posts: 8919
Re: TTree depth-first traversal
« Reply #24 on: September 23, 2019, 05:17:55 pm »
In general iterative algorithms have better time complexity over recursion. But it will only show on *huge* amounts of data.
If there is a noticeable discrepancy between the two, likely  the slowest one has an inferior implementation.

(More on topic: left and right are a contract, keep to the contract and you are safe..)

Note
Quote
There is obviously a cost associated with pushing "items" on a stack but, they really aren't "very expensive".
Well, the obvious amazes me, since stack/queue are linked list based.  You can't improve on that, currently..... :'( 8-)
You are right, btw... But you should have added the details. Also, recursion is a hornet's nest of problems in the hands of all programmers....

« Last Edit: September 23, 2019, 05:32:44 pm by Thaddy »
Most people that want to use threading should learn to patch their jeans first: use a needle.

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #25 on: September 23, 2019, 05:54:44 pm »
I'm trying to say that if we want to compare the performance of an iterative and recursive version of some algorithm, these versions should be equivalent, otherwise the comparison does not make sense. Of couse, DepthFirstTraverse does a complete traversal of the tree, without any doubt, but this is a different algorithm.

Indeed in my test I compared a ricorsive DFT algorithm with an iterative  one


It seems to me that you agree. 
In general, yes, I agree with what you've stated.

However, there is one statement you've made that I disagree with, which is:
These types of operations are very expensive for the CPU.
There is obviously a cost associated with pushing "items" on a stack but, they really aren't "very expensive". 

Recursion, as well known, is implemented using stack. Every recursive call allocates a new 'stack frame' on the top of the stack. Each frame contains, among others, following information, for each execution of recursive procedure: the parameters values passed to procedure; the return address of the caller procedure; all the values of local variables of the procedure.  When recursion terminates, all these frames are deallocated, in reverse order. These operation are very expensive, since consume space of stack and time of CPU.

440bx

  • Hero Member
  • *****
  • Posts: 1129
Re: TTree depth-first traversal
« Reply #26 on: September 23, 2019, 07:24:38 pm »
Recursion, as well known, is implemented using stack. Every recursive call allocates a new 'stack frame' on the top of the stack. Each frame contains, among others, following information, for each execution of recursive procedure: the parameters values passed to procedure; the return address of the caller procedure; all the values of local variables of the procedure.  When recursion terminates, all these frames are deallocated, in reverse order. These operation are very expensive, since consume space of stack and time of CPU.
What you stated above is correct except for the "very expensive" part.

A "very expensive" sequence of instructions is, for instance, servicing an exception, _that_ is very expensive.

I suggest you do the following to give yourself an idea of how relatively "expensive" some instruction sequences are: in a loop - say about a million times - call a procedure.  First with no parameters then, with an increasing number of parameters.  Obviously, time that.

You may also want to do the above with a function that returns an ordinal type (integer for instance) and a function that returns a structured type of 32 bytes.

Compare the results you obtained above (for the various cases), with a loop (same number of executions) that simply executes the expression "a := b + c;" (assign random values to b and c before the loop or the compiler may optimize the code causing it to be executed only once.)

It is obviously true that there is a cost in setting up a stack frame but, it is _not_ a very expensive operation.  The only time, setting up the stack frame is "very expensive" as you put it is when the total size of the local variables cause the stack pointer to "bump into" (or go past) the bottom of the stack.  In that case, one or more exceptions will need to be serviced by the O/S to commit whatever additional memory is needed on the stack to accommodate the local variables.  That is the only time when setting up a stack frame can legitimately be considered to be "very expensive".

It should also be noted that if a program is concerned about the amount of stack space it is going to consume during a time sensitive operation, it can specify a "stack commit size" in the PE file large enough to ensure that no exceptions will be necessary to ensure that enough memory has been committed to it.

HTH.


« Last Edit: September 23, 2019, 07:28:21 pm by 440bx »
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #27 on: September 23, 2019, 07:36:25 pm »
'Expensive' is a relative term… However I agree with your statements.

julkas

  • Sr. Member
  • ****
  • Posts: 382
  • KISS principle / Lazarus 2.0.0 / FPC 3.0.4
Re: TTree depth-first traversal
« Reply #28 on: September 23, 2019, 08:20:35 pm »
What about {$optimization tailrec}? Any practical example?
procedure mulu64(a, b: QWORD; out clo, chi: QWORD); assembler;
asm
  mov rax, a
  mov rdx, b
  mul rdx
  mov [clo], rax
  mov [chi], rdx
end;

simone

  • Full Member
  • ***
  • Posts: 245
Re: TTree depth-first traversal
« Reply #29 on: September 23, 2019, 10:00:06 pm »
I did not know this switch. Thanks for pointing this out to me. I will try it.
« Last Edit: September 23, 2019, 10:09:16 pm by simone »