In one of introductory chapters recursion is introduced and stack transition is explained. For example we can use factorial:
It is easily readable and looks good. If we call it with n=6 this will happen:
6*(5*(4*(3*(2*1))))
CPU pushes on stack value for n and * operation and dispatches call to the same function but with n-1 parameter, when right brace is closed, pops frame from stack and finishes multiplication. Problem with recursion and Python is that depth of stack is rather small, only thousand frames. If you try to calculate factorial for 2^10 using this function it will cause stack overflow. So instead of doing recursive calls one can rewrite factorial and use iterations:
Still simple and shallow stack problems are avoided, but that is so investment banking. What else could be done? We can use binary splitting. Story about binary splitting comes from Xavier Gourdon and Pascal Sebah and you can find it here. Instead of multiplying accumulator with all numbers in range we do recursive preparation similar to binary search and build tree like flow to multiply partial products between themselves. We define product like this:
P(a, b) = P(a, (a+b)/2)*P((a+b)/2, b)
Then we divide each range in half until distance between upper and lower bound is small. Here is the code:
If we say a is 0 we will get factorial b as result. I tested it for b up to 2^16 and it went through without stack overflow as expected. While it theoretically could be used for values up to 2^1000, try not to exceed 2^16 unless your box have really good cooling and plenty of time to wait for result. Tree is not tall, but there is quite few branches. Python is partly functional language and that bring us to yet another idea. Functional languages have compilers which will optimize recursive calls into so called tail calls if function is written in certain way. For example this could be optimized:
But Python compiler is not that much functional and it will not generate tail calls. Through the years people are bothering Guido with tail call optimisations but he just doesn’t want to implement it. When gccpy front-end is released, there will be tail call optimisation since GCC is doing it anyway. Instead of waiting for gccpy we can help ourselves with generators and so called trampolining. We should also switch from factorial to Fibonacci numbers, due to size of result. Plain Fibonacci looks like this:
In order to get generators we will replace return with yield and rewrite function so it ends with single function call:
def fibo(a, b, c):
if c==0:
yield b
else:
yield fibo(a+b, a, c-1)
and one would like to call it like this fibo(1, 0, n). Now we just need trampoline to execute those tail calls and we can calculate first thousand or so Fibonacci numbers:
import types
def trampoline(func, *args, **kwargs):
f = func(*args, **kwargs)
while isinstance(f, types.GeneratorType):
f=f.next()
return f
def fibo(a, b, c):
if c==0:
yield b
else:
yield fibo(a+b, a, c-1)
for x in range(1024):
print "F(", x, ") =", trampoline(fibo, 1, 0, x)
There is user friendly version of trampoline where decorator is used, so no need to replace return with yield and normal function call is used, but is more difficult to understand, you can download it from here.
No comments:
Post a Comment