grumpy old programmer: Recursion in Python

And now something completely different. During last week I was busy polishing my Python skills or rather removing rust from them. I went through book Think Python: How to Think Like a Computer Scientist by Allen Downey. Mentioned book is available for download from http://www.thinkpython.com. My impression is that book is intended for army of different bookkeepers and alike trying to learn programming. So author is threading rather gently and slowly. It reminds me of joke from Monty Python where accountant, Palin, wants to go directly into taming lions, but career consultant, Cleese won’t let him, he must do that gradually, first investment banking. Now I will be kind and let accountant do lion taming without going first into investment banking.
In one of introductory chapters recursion is introduced and stack transition is explained. For example we can use factorial:

def fac(n):
 if n <= 1 :
  return 1
 else:
  return n*fac(n-1)

It is easily readable and looks good. If we call it with n=6 this will happen:

6*(5*(4*(3*(2*1))))

CPU pushes on stack value for n and * operation and dispatches call to the same function but with n-1 parameter, when right brace is closed, pops frame from stack and finishes multiplication. Problem with recursion and Python is that depth of stack is rather small, only thousand frames. If you try to calculate factorial for 2^10 using this function it will cause stack overflow. So instead of doing recursive calls one can rewrite factorial and use iterations:

def fac(n):
 res = 1
 while n>1:
  res *= n
  n-=1
 return res

Still simple and shallow stack problems are avoided, but that is so investment banking. What else could be done? We can use binary splitting. Story about binary splitting comes from Xavier Gourdon and Pascal Sebah and you can find it here. Instead of multiplying accumulator with all numbers in range we do recursive preparation similar to binary search and build tree like flow to multiply partial products between themselves. We define product like this:

P(a, b) = P(a, (a+b)/2)*P((a+b)/2, b)

Then we divide each range in half until distance between upper and lower bound is small. Here is the code:

def product(a, b):
 d = b - a
 m = (a+b)/2
 result = 0
 if d > 1 :
  result = product(a,m)*product(m,b)
 elif d == 1 :
  result = b
 else :
  result = 1
 return result

If we say a is 0 we will get factorial b as result. I tested it for b up to 2^16 and it went through without stack overflow as expected. While it theoretically could be used for values up to 2^1000, try not to exceed 2^16 unless your box have really good cooling and plenty of time to wait for result. Tree is not tall, but there is quite few branches. Python is partly functional language and that bring us to yet another idea. Functional languages have compilers which will optimize recursive calls into so called tail calls if function is written in certain way. For example this could be optimized:

def fac(n, a=1):
 if n<2 :
  return a
 else:
  return fac(n-1, n*a)

But Python compiler is not that much functional and it will not generate tail calls. Through the years people are bothering Guido with tail call optimisations but he just doesn’t want to implement it. When gccpy front-end is released, there will be tail call optimisation since GCC is doing it anyway. Instead of waiting for gccpy we can help ourselves with generators and so called trampolining. We should also switch from factorial to Fibonacci numbers, due to size of result. Plain Fibonacci looks like this:

def fibo(n):
    if n<2:
        return n
    else:
        return fibo(n-1)+fibo(n-2)

In order to get generators we will replace return with yield and rewrite function so it ends with single function call:

def fibo(a, b, c):
    if c==0:
        yield b
    else:
        yield fibo(a+b, a, c-1)

and one would like to call it like this fibo(1, 0, n). Now we just need trampoline to execute those tail calls and we can calculate first thousand or so Fibonacci numbers:

import types

def trampoline(func, *args, **kwargs):
    f = func(*args, **kwargs)
    while isinstance(f, types.GeneratorType):
        f=f.next()
    return f

def fibo(a, b, c):
    if c==0:
        yield b
    else:
        yield fibo(a+b, a, c-1)

for x in range(1024):
    print "F(", x, ") =", trampoline(fibo, 1, 0, x)

There is user friendly version of trampoline where decorator is used, so no need to replace return with yield and normal function call is used, but is more difficult to understand, you can download it from here.

grumpy old programmer

Saturday, March 16, 2013

Recursion in Python

No comments:

Post a Comment

About Me

Blog Archive