Thursday, May 14, 2015

Recursion

I recently got homework to do as part of an interview process; I already described it here. After I provided them with a working, thoroughly tested solution, they decided not to speak with me. My conclusion is that this was not such a bad outcome.
While searching the web for clues I discovered this: Find all paths from source to destination.
A guy submits his code, peers review it, everything sounds nice.
I will refrain from reviewing that particular solution, but after looking at it I decided to write about recursion. My impression is that young generations of programmers put great effort into mastering frameworks and new features of programming languages, but at the same time somehow miss the basics. This also coincides with my exploration of the new functional features of Java 8.
Recursion is when a method (in Java) or a function (in C) calls itself during execution. Execution of the caller is suspended until the callee finishes, then it proceeds. So there must be some natural way for the recursive calls to stop propagating themselves into infinity. If we are not traversing some kind of tree structure, or marking vertices as visited while traversing a graph, we typically pass a variable that controls the depth of the recursion. A trivial example of recursion is the calculation of the factorial:

n! = n*(n-1)*(n-2)*...*2*1

It is defined for positive integers. Here is a natural and naive implementation, together with a test:
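
A minimal sketch in Java could look like this (class and method names are my choice, not necessarily the original ones):

public class Factorial {

    static long factorial(int n) {
        // base case: stops the recursion
        if (n <= 1) {
            return 1;
        }
        // recursive case: n! = n * (n-1)!
        return n * factorial(n - 1);
    }

    public static void main(String[] args) {
        System.out.println("9! = " + factorial(9));
    }
}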


I could throw an exception on a negative n and assert that 9! equals 362880. What stops the recursion here from proceeding forever is the if statement, in combination with the decreasing n in the recursive calls. Now, in order to visualize what the execution looks like, we will add some debugging code.
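
One way to do it, sketched here with a depth parameter and a small print helper of my own, is to pass the current recursion depth along and print it as a line of dashes:

static void print(int depth, int n) {
    StringBuilder line = new StringBuilder();
    for (int i = 0; i < depth; i++) {
        line.append('-');
    }
    System.out.println(line.append(n));
}

static long factorial(int n, int depth) {
    print(depth, n); // on the way down
    if (n <= 1) {
        return 1;
    }
    long result = n * factorial(n - 1, depth + 1);
    print(depth, n); // on the way back up
    return result;
}

The test now calls factorial(9, 0).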


The code now prints the current value of n, and the number of dashes shows how deep we are into the recursion. We could also watch the stack of frames grow in a debugger, but this is nicer. The output of the execution is:

9
-8
--7
---6
----5
-----4
------3
-------2
--------1
-------2
------3
-----4
----5
---6
--7
-8
9
9! = 362880


As expected, it behaves in a linear fashion. It can be much more interesting than linear; to see that, we will calculate Fibonacci numbers. In the case of Fibonacci numbers we have the following recursive definition:

F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1

A trivial and naive implementation looks like this:
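
Again as a sketch, under the same assumptions about naming:

static long fibonacci(int n) {
    // base cases: F(0) = 0, F(1) = 1
    if (n < 2) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

public static void main(String[] args) {
    int n = 42;
    System.out.println("F" + n + " = " + fibonacci(n));
}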


Execution of this code takes about a second or two. Using 42 as the function argument illustrates the point that the naive implementation is not really the most efficient one. The mechanics of stopping the recursive calls are identical to those in the case of the factorial. Let us insert the debugging code and see what is going on.
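
Reusing the print helper from the factorial example, the instrumented version could look like this:

static long fibonacci(int n, int depth) {
    print(depth, n); // on the way down
    if (n < 2) {
        return n;
    }
    long left = fibonacci(n - 1, depth + 1);
    print(depth, n); // after the n-1 branch returns
    long right = fibonacci(n - 2, depth + 1);
    print(depth, n); // after the n-2 branch returns
    return left + right;
}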


For the reasonably sized argument of 2 we have this output:

2
-1
2
-0
2
F2 = 1


Argument 2 is decreased and a recursive call is made with argument 1. When it returns to level 0 with result 1, the next recursive call is made with argument 0. The second call also returns to level 0, with result 0, and we have 1+0=1 as the result.
We now try 3 as the argument:

3
-2
--1
-2
--0
-2
3
-1
3
F3 = 2


We start on level 0 with the 3-1 call; then we have the pattern from the last run, elevated one level up; a return to level 0 with result 1; and finally the 3-2 recursive call and its return with result 1 to level 0. We can conclude that the recursive calls are building a binary tree. We can try argument 4 to recognize the patterns for 3 and 2, and so on.
We could achieve the same for the factorial if we split the chain multiplication in half; I wrote about that here.
A recursive call does not have to return anything; work can be performed on one of the function arguments. For example, we can reverse an array by swapping elements.
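
A sketch of that idea (the method name and bounds handling are my choice); it would be called as reverse(array, 0, array.length - 1):

static void reverse(int[] array, int lo, int hi) {
    if (lo >= hi) {
        return; // zero or one element left, nothing to swap
    }
    int tmp = array[lo];
    array[lo] = array[hi];
    array[hi] = tmp;
    reverse(array, lo + 1, hi - 1);
}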


Here the recursion stops as soon as lo is no longer smaller than hi.

Optimization

We can improve on the naive implementation using tail call elimination. We write the method in such a way that the return statement is a pure function call; that allows a compiler to perform the optimization and practically discard frames instead of placing them onto the stack (the JVM notably does not do this, but the form itself translates directly into a loop). For example, the factorial can be rewritten like this:
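
A sketch with an explicit accumulator parameter (the name is mine), called as factorial(9, 1):

static long factorial(int n, long accumulator) {
    if (n <= 1) {
        return accumulator;
    }
    // the recursive call is the very last operation: a tail call
    return factorial(n - 1, n * accumulator);
}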


We provide 1 as the initial value for the accumulator; n is the same as before. For Fibonacci we will need two accumulators:
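
Something along these lines, where a and b carry two consecutive Fibonacci numbers:

static long fibonacci(int n, long a, long b) {
    if (n == 0) {
        return a;
    }
    // shift the pair one position along the sequence
    return fibonacci(n - 1, b, a + b);
}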


The initial values are a = 0 and b = 1, with n as before. From this form it is quite easy to switch to the iterative form.
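
The tail call simply becomes a loop iteration, something like:

static long fibonacci(int n) {
    long a = 0;
    long b = 1;
    while (n-- > 0) {
        long next = a + b;
        a = b;
        b = next;
    }
    return a;
}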


About other forms of optimization, like memoization, repeated squaring and lambdas: next time.
