Body & tail recursion in Erlang
Old habits die hard. Some things can suprise you.
I was writing a couple small libraries in Erlang over Veterans Day. Random access lists, my favorite. I’d already written them in JavaScript, but why not give it a whirl. This time I called them binary_forest & skew_forest.
As a data structure built largely with lists, I wasn’t surprised to use some recursion. Due to the way I learned functional programming, I tend to use tail recursion. I mean, who wouldn’t, right? What this entails is usually breaking one function up into two: a non-recursive wrapper that gets exposed/exported & a helper that uses tail recursion.
This looks something like:
let my_double l =
let rec helper l acc = match l with
| [] -> List.rev acc
| h::t -> helper t ((2*h)::acc) in
helper l [];;
my_double [1; 2; 3];;
- : int list = [2; 4; 6]
That’s OCaml, not Erlang, but the idea remains the same. The actual implementation of my_double
is just a single call to helper
. helper
usually takes an additional argument, often an accumulator that it builds up as it recurses. The most efficient way to build up a list is to add to the beginning, not the end, so the first element processed actually ends up at the end of the accumulated list; one call to reverse
and we’re set. This is pretty common in Lisp/Scheme, too; not just OCaml. This seemed like the natural way to write Erlang; functional programming is functional programming, right?
Body recursion, enter stage left
I came across the Eight Myths of Erlang Performance. The third myth stood out to me: Myth: Tail-Recursive Functions are Much Faster Than Recursive Functions. Why would that be? How could that be?
I read a StackOverflow answer by Hynek -Pichi- Vychodil, and a blog post by Fred Hebert. That pretty much sealed the deal. The mailing list Fred links to is a good read. Body recursion is well-optimized & avoids the final list reversal (which takes time and memory).
I tested it myself, read what others have written, and I’ve got to agree: body recursion is fast. Body recursion does have the distinct benefit of being easier to understand (no more helper functions, no more reversing), so I rewrote my code.
Here’s what it looked like before (this from binary_forest):
update_trees(N, Value, Trees) -> update_trees(N, Value, Trees, []).
update_trees(N, Value, [Tree = #tree{size=Size} | Trees], Out) when N > Size ->
update_trees(N - Size, Value, Trees, [Tree | Out]);
update_trees(N, Value, [Tree | Trees], Out) ->
lists:reverse([update_tree(N, Value, Tree) | Out]) ++ Trees.
And after:
update_trees(N, Value, [Tree = #tree{size=Size} | Trees]) when N =< Size ->
[update_tree(N, Value, Tree) | Trees];
update_trees(N, Value, [Tree = #tree{size=Size} | Trees]) ->
[Tree | update_trees(N - Size, Value, Trees)].
The second version just seems much more obvious. Here are the commits: one for binary_forest, one for skew_forest. I think I’ll stick to body recursion if I can help it.