Deep dive into Fibonacci computation in Ruby
Problem statement
Computing Fibonacci numbers ain’t exactly rocket surgery, and virtually anyone and their brother can come up with the naïve implementation of:
def fib(n)
raise "n negative" if n < 0
return n if n < 2
fib(n-1) + fib(n-2)
end
And anyone even slightly versed in complexity theory immediately knows
the issue with that naïveté1, the dreaded O(2^n)
.
And we can do better. Significantly better.
But, can we – at the same time – make the result:
- fast (
O(n)
) - prettier (easy/-ier to understand, idiomatic)
- more robust
?
Let’s dig in.
Exploration
Speed / complexity
A simple way to make the computation faster is to go from recursion to iterative algo:
def fib(n)
raise "n negative" if n < 0
return n if n < 2
a = b = 1
(n-2).times do
a, b = b, a + b
end
b
end
which is clearly linear – O(n)
.
But honestly, that’s still a whole lotta code, and it’s not very transparent.
You could argue that choosing better naming than a
, b
might help. But how
much? And there’s still the whole why do I run the iteration n-2
times, etc.
It’s a headache to think through because of all the magic constants.
Prettier
There’s a very powerful feature of Hash
I like to (ab)use whenever convenient:
The Hash.default_proc
, also known as the block given to Hash.new
:
# Old school Ruby hashing:
o = Hash.new
o['foo'] << :bar
# NoMethodError (undefined method `<<' for nil:NilClass)
(o['foo'] ||= []) << :bar # => [:bar], but I mean... yuck!
o # => {"foo"=>[:bar]}
# When the #default_proc came along:
a = Hash.new { |h, k| h[k] = [] }
a['foo'] << :bar # => [:bar]
a # => {"foo"=>[:bar]}
a['foo'] << :bar # => [:bar, :bar]
a # => {"foo"=>[:bar, :bar]}
and you might see where this is going:
def fib(n)
raise "n negative" if n < 0
fib_hash = Hash.new do |h, k|
h[k] = k < 2 ? k : h[k-1] + h[k-2]
end
fib_hash[n]
end
Which is much less code, almost no magic constants, and same speed.
In fact, since this code uses memoization, refactoring this a little further would get you a fibonacci that’s faster on subsequent uses; by trading (more) memory for (better) execution speed.
Of course, there’s one additional downside of the (implicitly) recursive setup:
fib(7000) # => 36643483050372328322763589672816049218...
fib(8000) # => SystemStackError (stack level too deep)
Which brings me to the…
Final form
In the finest of Ruby traditions, let’s make this memoized fibonacci into a class, and get some robustness benefits out of that.
class Fibonacci
# Default fibonacci values
DEFAULT_VALUES = {0 => 0, 1 => 1}
# Safe recursion threshold (max stack depth)
SAFE_THRESH = 5_000
def initialize
@h = Hash.new do |h, k|
h[k] = h[k-1] + h[k-2]
end
@h.merge!(DEFAULT_VALUES)
end
def [](n)
raise "n negative" if n < 0
# Prepopulate in SAFE_THRESH steps
@h.size.step(n, SAFE_THRESH) { |i| @h[i] }
# Final solution
@h[n]
end
end
def fib(n)
Fibonacci.new[n]
end
And with that one either runs out of memory2, or gets the answer. And when
you hold an instance of Fibonacci
class, then subsequent computations are
faster (or sometimes3 instant – O(1)
).
Benchmarking
Let’s benchmark this a little bit, just to see what’s what.
To that end I whipped up a benchmark script, including some variants that I didn’t even bother to discuss in the post4:
benchmark.rb (click to expand)
require 'benchmark'
def fib_iter(n)
raise "n negative" if n < 0
return n if n < 2
a = b = 1
until n < 3
a, b = b, a + b
n -= 1
end
b
end
def fib_iter2(n)
raise "n negative" if n < 0
return n if n < 2
a = b = 1
(n-2).times do
a, b = b, a + b
end
b
end
def fib_hash(n)
raise "n negative" if n < 0
fib_hash = Hash.new do |h, k|
h[k] = k < 2 ? k : h[k-1] + h[k-2]
end
fib_hash[n]
end
def fib_hash2(n)
raise "n negative" if n < 0
fib_hash = Hash.new do |h, k|
h[k] = h[k-1] + h[k-2]
end
fib_hash.merge!({0 => 0, 1 => 1})
fib_hash[n]
end
class Fibonacci
# Default fibonacci values
DEFAULT_VALUES = {0 => 0, 1 => 1}
# Safe recursion threshold (max stack depth)
SAFE_THRESH = 5_000
def initialize
@h = Hash.new do |h, k|
h[k] = h[k-1] + h[k-2]
end
@h.merge!(DEFAULT_VALUES)
end
def [](n)
raise "n negative" if n < 0
# Prepopulate in SAFE_THRESH steps
@h.size.step(n, SAFE_THRESH) { |i| @h[i] }
# Final solution
@h[n]
end
def self.[](n)
@@f ||= Fibonacci.new
@@f[n]
end
end
def fib_memo(n)
Fibonacci.new[n]
end
# This wouldn't even be a fair fight...
def fib_memo2(n)
Fibonacci[n]
end
for n in [1_000, 10_000, 100_000]
puts "Size: #{n}"
Benchmark.bmbm do |x|
[:fib_iter, :fib_iter2, :fib_hash, :fib_hash2, :fib_memo].each do |c|
x.report(c) { send(c, n) }
end
end
puts
end
And the results are not exactly surprising:
# We don't want Ruby to die because of stack size
$ ulimit -s unlimited
$ export RUBY_THREAD_VM_STACK_SIZE=100000000
# For posterity
$ ruby --version
ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]
# Run + strip away the "rehearsal"
$ ruby benchmark.rb | ruby -pe 'next if /-$/../^$/'
Size: 1000
user system total real
fib_iter 0.000076 0.000023 0.000099 ( 0.000098)
fib_iter2 0.000155 0.000000 0.000155 ( 0.000154)
fib_hash 0.000359 0.000000 0.000359 ( 0.000358)
fib_hash2 0.000354 0.000000 0.000354 ( 0.000353)
fib_memo 0.000320 0.000000 0.000320 ( 0.000320)
Size: 10000
user system total real
fib_iter 0.001435 0.000306 0.001741 ( 0.001730)
fib_iter2 0.002335 0.000000 0.002335 ( 0.002335)
fib_hash 0.004508 0.000000 0.004508 ( 0.004512)
fib_hash2 0.004507 0.000000 0.004507 ( 0.004512)
fib_memo 0.004126 0.000000 0.004126 ( 0.004137)
Size: 100000
user system total real
fib_iter 0.094116 0.000000 0.094116 ( 0.094280)
fib_iter2 0.089830 0.003735 0.093565 ( 0.093710)
fib_hash 0.377110 0.000175 0.377285 ( 0.377357)
fib_hash2 0.356803 0.000085 0.356888 ( 0.356990)
fib_memo 0.152263 0.004034 0.156297 ( 0.156368)
The final form (fib_memo
) starts out ~2x slower on small sizes
to the iterative approach (fib_iter2
)5, but with increased size
the difference shrinks.
What is also surprising is how fib_memo
ends up faster than fib_hash
even though they both use roughly the same thing. I guess memory management
due to deep recursion is expensive, and the SAFE_THRESH
-stepping avoids
some of that cost. Yay?
Anyway, the end result is —
If you want raw one-off speed, you still want to go with fib_iter
6.
But if you want to compute more than one Fibonacci number (and have the memory
for it), then fib_memo
with the memoization will eventually beat the hell out
of the iterative approach.
Closing words
There you go, something that was racking around in my brain for a while and [for] some reason, and I felt compelled to write it down, to make it stop.
-
If the last line of the function doesn’t make you convulse in terror, you might be very green. Or, perhaps, you’ve seen similar horrors so many times that you’ve become numb to them. I wonder which is worse… ↩
-
On my machine,
irb
after evaluation offib(500_000)
consumes about 20.8 GB RAM. While running viaruby
,time
tells me that the maxrss is about 10 GB. Go figure. ↩ -
Whenever you ask for something that was already computed previously. ↩
-
You’re welcome to explore them on your own, and figure out what I was driving at. You’re smart, you got this. ↩
-
I did not discuss the
fib_iter
version originally; that’s iterative but without the use of thetimes
block, which is initially faster, probably because it avoids new object creation. ↩ -
Actually, you don’t go with Ruby (but then you probably need a bignum library). Or you precompute a table. But anyway, humor me here. ↩