
Call me maybe

Do you remember what happens when you call a function in your program?

What happens when you make too many nested calls?1

When you call a function, there is some stuff going on in memory (some variables, the program counter and all that) that must be stored somewhere so the program can come back to the place where you called the function when the function ends. Right?

The place where all that is stored is the stack. You already know all this. When you call many nested functions, the stack keeps pushing more and more data, with no chance to pop any of it, so it overflows.

This can happen at any time, but the risk is higher when you call functions recursively, because by definition they call themselves many times. It can happen in non-recursive programs too, but most devices can handle deep levels of nesting, so it’s less likely (although on small devices like microcontrollers you do have to take care of this).

This doesn’t mean recursive functions will always result in a stack overflow. That only happens when the nesting depth is bigger than the stack size.

You are so stupid the recursive function that calculates your stupidity causes a stack overflow.
— Heard in a computer science class

But this is not always true. There are some optimizations that can change this behaviour and allow you to create stack-safe recursions. Let’s talk about tail-call optimization.

Some programming languages implement tail-call optimization, which, if used correctly, avoids stack overflows in recursive calls and increases performance. First of all, in order for tail-call optimization to be possible, the function must have a call as its last action (a tail call). This means the code needs to be structured in a specific way. Let’s see it with an (oversimplified) example (in Python, but don’t pay attention to the language):

def factorial(a):
    """This function does not end in a tail call, because the last thing
    it executes is a multiplication, not a call."""
    if a == 1:
        return 1
    return a * factorial(a - 1)

def factorial_tail(a, acc=1):
    """This function ends in a tail call: the last thing happening in it
    is a function call."""
    if a == 1:
        return acc
    return factorial_tail(a - 1, acc=acc * a)

As the comments say, the first function is not performing tail recursion, but the second one is. So, what’s the difference?

The main point is that the first function, factorial, needs to go back through the call stack to retrieve each previous step’s a value and multiply by it, while the second function doesn’t. That’s why the second can be optimized and the first can’t.

The optimization exploits this behaviour in a really clever way to avoid the stack overflows I told you about before. Tail-call optimization just changes the input arguments of the function and calls it again, replacing the original call with the new call with different input arguments. This can be done because the function is written in a way that doesn’t need anything from the previous step.

Imagine we pass a 3 to both functions and compare the executions. Let’s check factorial first:

  • Call factorial(3)
    • Call factorial(2)
      • Call factorial(1)
      • Return 1
    • Return 2 * 1
  • Return 3 * 2

Now with the factorial_tail function, but without any optimization:

  • Call factorial_tail(3)
    • Call factorial_tail(2, acc=3)
      • Call factorial_tail(1, acc=6)
      • Return 6
    • Return 6
  • Return 6

See the difference?

The factorial_tail call doesn’t need anything from the previous step: the result of the last factorial_tail(1, acc=6) call is the same as the result of the factorial_tail(3) call. That changes everything!

What tail-call optimization does is just change the call arguments and keep running the same code. There’s no need to store anything on the stack: just replace the original function call with the tail call.

Let’s trace the second function again, this time with the optimization:

  • Call factorial_tail(3)
  • Replace the call with factorial_tail(2, acc=3)
  • Replace the call with factorial_tail(1, acc=6)
  • Return 6
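
In effect, the optimizer turns the tail recursion into a loop that rebinds the arguments and jumps back to the start. Here’s a minimal sketch of what the optimized factorial_tail amounts to (the function name is mine, just for illustration):

def factorial_tail_as_loop(a, acc=1):
    """What tail-call optimization effectively produces: instead of
    pushing a new stack frame, rebind the arguments and jump back."""
    while a != 1:
        a, acc = a - 1, acc * a  # "replace the call" with new arguments
    return acc

Calling factorial_tail_as_loop(3) runs exactly the steps of the optimized trace above, in a single stack frame.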

This can be stretched further! It can involve different functions! Anywhere a tail call is made, even if the called function is a different one, this kind of optimization can be applied, reducing the stack usage and increasing performance, as the sketch below shows.
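
For instance, here’s a pair of mutually recursive functions (a classic even/odd example of mine, not from the text above) where every call is a tail call. A language with proper tail calls can run them in constant stack space:

def is_even(n):
    """Tail-calls is_odd: the call is the last thing that happens."""
    if n == 0:
        return True
    return is_odd(n - 1)

def is_odd(n):
    """Tail-calls is_even, forming a mutually recursive pair."""
    if n == 0:
        return False
    return is_even(n - 1)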

If you want to read more about this, there’s a great Wikipedia page on the subject, and there’s a really good explanation in the book Programming in Lua.

But how is all this handled by programming languages, you may ask?

The answer is that there’s no single answer: each language has its own style of dealing with this. Let me give you some examples.

Python (just to point out that the language I chose for the example is not the best example of this) has no tail-recursion elimination. Guido and the Pythonists2 argue that tail-call optimization alters stack traces (which is true) and that they don’t like recursion as a basis for programming, so they try to avoid it. CPython has no tail-call optimization, but they don’t forbid (they can’t!) any other Python implementation from implementing that particular optimization. There’s a really interesting post by Guido van Rossum about this.
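
You can check this yourself in CPython: even a perfectly tail-recursive function hits the interpreter’s recursion limit (1000 frames by default), because every call still pushes a frame:

import sys

def factorial_tail(a, acc=1):
    # The same tail-recursive factorial from the example above.
    if a == 1:
        return acc
    return factorial_tail(a - 1, acc=acc * a)

print(sys.getrecursionlimit())  # 1000 by default in CPython

try:
    factorial_tail(100000)  # tail call or not, every call pushes a frame
except RecursionError as err:
    print("no tail-call optimization here:", err)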

Lua, as you’ve seen in the previous link, implements proper tail calls (as they are called there), and there’s nothing the programmer needs to do to make sure they are optimized. The only requirement is to write the tail calls correctly.

Scala implements tail-recursion optimization at compile time: the compiler transforms the recursive call into a loop. What’s interesting is that there’s a compile-time check too. There’s an annotation called @tailrec that can be used to make sure your function is going to be optimized: if the compiler is not able to optimize a function that carries the @tailrec annotation, it throws an error. Without the annotation, the compiler still applies the optimization when it can, but it won’t complain when it can’t. The annotations tour of the Scala language has some words about @tailrec.

Clojure is a really interesting case too. Clojure doesn’t implement tail-call optimization, but it has one (and only one) special form for non-stack-consuming looping: recur. This special form rebinds the recursion point’s bindings or arguments and jumps back to the recursion point. The recursion point can be a loop special form or a function definition. So it’s like an explicit request for tail-call optimization. The tail call must be written correctly too: recur is only allowed in tail position, and the compiler checks that it’s located in the correct place. It also has some specific rules that must be taken into consideration (multiple-arity functions and so on) that are better read in the documentation.
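
Clojure syntax aside, here’s a rough Python sketch of the idea (recur, tail_loop and factorial_step are hypothetical names of mine, not Clojure’s API): a driver loop rebinds the arguments and jumps back instead of growing the stack:

def recur(*args):
    # Marker meaning "call me again with these arguments" (hypothetical).
    return ("recur", args)

def tail_loop(f, *args):
    # The recursion point: keeps rebinding the arguments and calling f
    # until it returns a real value instead of a recur marker.
    while True:
        result = f(*args)
        if isinstance(result, tuple) and result[:1] == ("recur",):
            args = result[1]
        else:
            return result

def factorial_step(a, acc=1):
    if a == 1:
        return acc
    return recur(a - 1, acc * a)  # must be in tail position, like recur

print(tail_loop(factorial_step, 5))  # 120, without consuming stack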

Edited 2019-01-25: Thanks to a discussion in the fediverse about the topic, I found the moment when Emacs Lisp got its tail-call optimization, with everything explained in the author’s blog. It’s really interesting.


  1. Some help: what’s the name of the website you check when you don’t know how to solve your programming problem? 

  2. My next music band name.