Memoize This

I was watching the WWDC 2014 videos, and in the Advanced Swift session they illustrated the language's generic programming power by implementing a memoize function that can take any (A) -> B function and memoize it, even if that function is recursive.

That inspired me to quickly try my hand at converting their memoize function to C++14 to see how close I could get. The Swift version is quite elegant in that the syntax for creating a memoized function is, as they say, very cruft-free.

In their example they use factorial, but I'll use the Fibonacci function here instead. First, the Swift version:

func memoize<T: Hashable, U>(body: ((T)->U, T) -> U) -> (T)->U {
    var memo = Dictionary<T, U>()
    var result: ((T)->U)!
    result = { x in
        if let q = memo[x] { return q }
        let r = body(result, x)
        memo[x] = r
        return r
    }
    return result
}

let fib = memoize {
    fib, n in
    return n < 2 ? Double(n) : fib(n - 2) + fib(n - 1)
}

The memoize function is taken straight from the presentation. The code is by Dave Abrahams; watch the whole video, it's excellent.[1]

The only weird bit here is that he had to declare the result variable before he could assign the closure to it; otherwise it's pretty clean. I like the if let construct to test and unwrap an optional value a lot; it's a concise and safe idiom.

As you can see, the fib closure can reference itself through its first parameter, so the recursive calls also use the memoized version of the function instead of the original one. Additionally, you do not have to explicitly specify the types used by the function anywhere. On the caller's end it's quite clean.

Without further ado, here's my current go at the C++14 version:

#include <functional>
#include <unordered_map>

template <typename T, typename U>
auto memoize(std::function<U(T)>&& f) {
    // memo lives inside the returned closure thanks to the init-capture below
    return [memo = std::unordered_map<T, U>(), f](const T& t) mutable {
        auto have = memo.find(t);
        if (have != memo.end())
            return have->second;   // reuse the iterator instead of a second lookup
        U val = f(t);
        memo[t] = val;
        return val;
    };
}

std::function<double(int)> fib = memoize<int, double>([&fib](int n) {
    return n < 2 ? double(n) : fib(n - 2) + fib(n - 1);
});

Implementation Notes

In the Swift version the closure receives itself as an argument, whereas in C++14 the lambda has to be stored in a std::function so it can refer to itself. I've read some articles about lambdas referring to themselves and it's a major pain, so I decided not to spend a lot of time "fixing" this. std::function is sometimes avoided because of its feared extra overhead over a plain lambda, but in the tests I've done so far the compiler (clang 3.4) completely eliminates this calling overhead in optimized builds.
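
To make that claim checkable, this is the kind of micro-comparison I have in mind (a sketch of mine, not the actual test I ran): time the same callable once as a plain lambda and once wrapped in a std::function, in an optimized build.

#include <chrono>
#include <functional>
#include <iostream>

int main() {
    auto add = [](int a, int b) { return a + b; };
    std::function<int(int, int)> addFn = add;   // the same callable, wrapped

    volatile int sink = 0;   // keeps the loops from being optimized away entirely
    auto run = [&](auto&& f, const char* label) {
        namespace k = std::chrono;
        auto t0 = k::high_resolution_clock::now();
        for (int i = 0; i < 10'000'000; ++i)
            sink += f(i, i);
        auto t1 = k::high_resolution_clock::now();
        std::cout << label << ": " << k::duration_cast<k::microseconds>(t1 - t0).count() << "µs\n";
    };

    run(add,   "lambda       ");
    run(addFn, "std::function");
}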

Consequently, I chose to make the parameter to memoize a std::function<U(T)> instead of adding another type parameter F and taking an F&&. The optimizer completely eliminates any overhead, and until we have concepts this lets the compiler check the parameter and give better error messages; it is also clearer in general.
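
For contrast, the alternative I decided against would look roughly like this (a sketch, not code from the gist): it skips the std::function conversion, but T and U still have to be spelled out, and a mismatched callable only errors out deep inside the lambda body at instantiation time.

#include <unordered_map>
#include <utility>

// Hypothetical alternative: accept any callable F instead of a std::function.
template <typename T, typename U, typename F>
auto memoize_generic(F&& f) {
    return [memo = std::unordered_map<T, U>(), f = std::forward<F>(f)](const T& t) mutable {
        auto have = memo.find(t);
        if (have != memo.end())
            return have->second;
        U val = f(t);
        memo[t] = val;
        return val;
    };
}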

The lambda is mutable because it has to modify the captured memo map and a lambda's call operator is const by default. I also make use of the extended capture syntax for lambdas (init-capture) to directly initialize the memo member inside the lambda.

The fib result has to be typed explicitly because I'm capturing it in its own definition. The memoize function returns the generated function as a lambda, but here it is stored in a std::function.

C++ can't automatically deduce T and U from the invocation as those two types are only exposed in memoize inside the std::function<U(T)> parameter. Perhaps it is possible to extract these in some clever way but I haven't tried that hard yet.
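
One possibly "clever way" that I think would work in C++14 is to peek at the callable's operator() with a small traits helper. The sketch below is mine (hypothetical names, and it only handles non-generic, non-overloaded call operators); it would let the explicit <int, double> be dropped, though fib itself would still need its std::function<double(int)> declaration because of the self-capture.

#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical helper: pull the argument and result types out of a lambda's
// call operator. Only works for non-generic, non-overloaded operator().
template <typename F>
struct call_traits : call_traits<decltype(&F::operator())> {};

template <typename C, typename R, typename A>
struct call_traits<R (C::*)(A) const> {
    using arg_type = A;
    using result_type = R;
};

// Wrap the memoize from above so T and U are deduced instead of spelled out.
template <typename F>
auto memoize_auto(F&& f) {
    using FD = std::decay_t<F>;
    using T = std::decay_t<typename call_traits<FD>::arg_type>;
    using U = typename call_traits<FD>::result_type;
    return memoize<T, U>(std::function<U(T)>(std::forward<F>(f)));
}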

The end result is that the C++14 version of the fib definition suffers not only from having to declare the input and output types explicitly, but also from having to do so twice. But…

Speed

In the presentation he also uses Fibonacci as an example of a function to memoize. The constant φ (the Golden Ratio) is approximated as follows:

let φ = fib(45) / fib(44)

The running time to calculate φ is mentioned to be 11 seconds for the naive recursive version and 0.1 seconds for the memoized version. That by itself is pretty good, a ~100x speedup. I timed my C++14 version with the following function:

#include <chrono>
#include <iostream>

template <typename F>
void timeTest(F&& f) {
    namespace k = std::chrono;
    auto t0 = k::high_resolution_clock::now();
    f();
    auto t1 = k::high_resolution_clock::now();
    std::cout << "time: " << k::duration_cast<k::microseconds>(t1 - t0).count() << "µs\n";
}

I then did a few quick runs of both methods and averaged the results. Like so:

double fibslow(int n) {
    return n < 2 ? double(n) : fibslow(n - 2) + fibslow(n - 1);
}

...

timeTest([] {
    std::cout << "φ = " << fibslow(45) / fibslow(44) << "\n";
});

And, using the fib lambda above:

timeTest([&fib] {
    std::cout << "φ = " << fib(45) / fib(44) << "\n";
});

Note that these time tests were far from exhaustive or scientific, but given the low precision of the times reported in the slide, this would suffice for now. All tests were run with -O3 -fno-exceptions settings.

My home machine took about 13 seconds to run the non-memoized test, but the memoized version completed in ~55µs (that's microseconds, not milliseconds), a ~235,000x speedup. Now, when I see timings this low I get suspicious that the compiler just eliminated the whole function, but using the excellent Hopper Disassembler I confirmed that it was really still running. It makes sense: the memoized version is really not much more than a couple of dictionary inserts and lookups.

So, though more verbose, the C++14 memoized version is on the order of 1500-2000x faster than Swift's version. FACT[2]

Also, more hilariously, when I tried the Swift version in an Xcode 6 beta playground, just the presence of the let fib = ... statement made it very unhappy. Beta IDE is beta.

Adding Constraints

I think that with more finagling, and probably by uglifying the memoize function, the call-site code can be simplified, but I'm actually quite content with this version. With C++11 and now 14, the language has gotten terser. When the Concepts TS is ready, we will also be able to constrain the T type parameter, like so:

template <Hashable T, typename U>
auto memoize(std::function<U(T)>&& f);

Or something equivalent; I'm assuming Hashable will be a standard constraint.
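
As a guess at what such a constraint could look like, one could be hand-rolled with the requires-expression syntax from the Concepts work, roughly like this (pure speculation on my part, built on std::hash; not something the TS defines):

// Speculative hand-rolled Hashable; the exact spelling of concepts is still
// in flux, so treat this as a sketch only.
#include <cstddef>
#include <functional>

template <typename T>
concept bool Hashable = requires(T t) {
    { std::hash<T>{}(t) } -> std::size_t;
};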

I'm also wondering if a concept like Callable could exist such that the argument type and return type are retrievable from the concept itself so that the explicit passing of the type parameters becomes unnecessary. Right now, I lack the knowledge to really investigate this.

Source

I uploaded a simple source file with the code as a gist, so if I've piqued your interest I encourage you to play around with it.

  1. In fact, watch all 3 Swift videos to get a good feel for the whole language from the basics to generic functional features.
  2. Not actually a fact. The level of rigorousness and overall scienceworthiness of this article is low to very low. But no one reads footnotes anyway. Yay C++!!

Reply @zenmumbler to discuss this article.