Visit Club Stardazed

March 3, 2016

Stardazed has matured a bit, grown a twin engine and is now a real organisation on GitHub. In addition, I figured I like speaking more in the voice of a faceless organisation than from my own perspective.

Please visit the Stardazed Blog for the latest updates on the framework and follow SD on Twitter for your daily enlightenment.

And really, who thought that making a game a week was a good idea? Not me, I tell ya!

One Game A Week

September 7, 2015

I participated recently in the 33rd edition of Ludum Dare, an online simultaneous game compo and game jam. This was the second time I entered, first time being for the 25th edition, and as chance would have it the themes were very similar. Nr. 25 was You are the Villain and Nr. 33 was You are the Monster. This meant that for both editions the effective overall theme was "role-reversal". You play as an enemy and the normal player characters in the games are AI controlled.

My entry followed that trend as I couldn't come up with anything better in the morning. In Pinky's Nightmare you play as Pinky, the ghost of Pac-Man fame. Pinky is trapped in the pac maze, locked out of the ghost base, and has to find keys to open it up while being chased by undead Pac-abominations. You can play it online in any recent browser. Just be warned that these jam games tend to be less accessible than what you may be used to.

So, while mine was not the most original of ideas, one of the main goals of LD is to create a game that has a beginning, middle and end in a very short amount of time (3 days in this case). And while they recommend not learning new things, I took both editions as an opportunity to use at least some kind of technology I hadn't used before. This time that was WebGL.

After learning and reading about "normal" OpenGL 4 APIs, using WebGL is kind of refreshing. Based on the OpenGL ES 2 spec, there's really not that much there. While this precludes using more advanced techniques, it's also interesting because you don't have to worry about which "optimal" combination of APIs to use; there really is only one way to do many things. Like Metal and Vulkan, WebGL removes a lot of esoteric features, but unlike those other APIs, you don't get anything in return :) And of course, since you run in a browser context, you are more removed from the hardware than in any native GL implementation. But it sure is easy to work with. Once you have some core code in place it is great for quick prototypes.

The whole process reminded me again of the importance of shipping, so for the next several weeks I'm going to build one small game or app each week, each time revolving around some (technical) theme of my choosing. I'll decide upon the exact theme at the beginning of each week, but they will very likely include one or more of the following, in no particular order:

Etcetera. Basically, a framework or technology combined with a game theme each time. Native and web. I might repeat one or more themes if I feel like it, but I will have a functional thing ready at the end of each week. Some might suck and some might be interesting enough to pursue in more detail. It's all about gaining experience and having fun.

For this first week I'm starting off with WebGL and Arcade Racing. Once I've got some info or pages up I'll post those here and on my Twitter.

A Struct of Arrays Type in C++14

August 3, 2015

In data-oriented design the efficient layout of data in memory is the primary driver for designing software. Main memory accesses are now so slow compared to L1/2/3 caches that cache misses can become an overriding factor in the speed of any system that deals with large arrays of items.

I was reading the excellent articles by BitSquid and was motivated to change my own component design in my stardazed engine. Now, unlike some people in game dev, I'm not averse to C++ and/or templates. I do use them sparingly and not to the heights found in some Boost libraries. I've got another post brewing about this, but I digress.

When I did some prototype test conversions of my existing types, I noticed a usage pattern that kept recurring. It relates to one of the main tenets of data-oriented design, namely that instead of an Array of Structs, you use a Struct of Arrays. For example, given an example struct like:

struct Transform {
    Vec3 position;
    Quat rotation;
    Vec3 scale;
    Mat4 modelMatrix;
};

If you look at the data layout of an array of Transforms in memory it would be this:

[position] [rotation] [scale] [modelMatrix] [position] [rotation] [scale] [modelMatrix] ...

and when you iterate over the contents of the array and are only looking at the matrices, the CPU will still load each entire struct into the cache; the other fields might as well not be there. Because you're wasting cache space on data that is not needed at the moment, you will likely incur a cache miss, and thus a stall, on every iteration.

Conversely, in a Struct of Arrays setup the layout is as follows:

struct Transform {
    Vec3 positions[];
    Quat rotations[];
    Vec3 scales[];
    Mat4 modelMatrices[];
}; // conceptual layout, not literal C++

Each field is separated out into its own contiguous array. Where things get a bit tricky is in the requirement to have all the arrays allocated together in a single allocation. The principle is not hard: you allocate a single buffer that can hold N items of each array, then calculate the base pointers for each array and from there on just use those. E.g.:

size_t numItems = 16384;
size_t combinedSize = sizeof(Vec3) + sizeof(Quat) + sizeof(Vec3) + sizeof(Mat4);
void* instanceData = allocator.alloc(combinedSize * numItems);

auto positionBase = static_cast<Vec3*>(instanceData);
auto rotationBase = reinterpret_cast<Quat*>(positionBase + numItems);
auto scaleBase = reinterpret_cast<Vec3*>(rotationBase + numItems);
auto modelMatrixBase = reinterpret_cast<Mat4*>(scaleBase + numItems);

// ... read/write data to offsets from base pointers
// somewhere down the line implement resizing the block with multiple memcpys, etc.

Additionally, with dynamic data blocks you will eventually run out of array space and will need to expand all the arrays together. The setup and resizing of this data is grunt work, and when done manually in every component that needs it, it becomes error-prone.

Enter the inventively named MultiArrayBuffer, a variadic template type that will allocate and resize the buffer and give you properly typed base pointers on demand. Like I said, going overboard with template stuff is not what I usually do, but I value the service MAB gives me in code that uses it over the chicanery I had to do to make it work, which is all contained in the implementation of a very low-level helper type. This is how the above setup code is done using MAB:

MultiArrayBuffer<Vec3, Quat, Vec3, Mat4> instanceData{ allocator, 16384 };

auto positionBase = instanceData.elementsBasePtr<0>();
auto rotationBase = instanceData.elementsBasePtr<1>();
auto scaleBase = instanceData.elementsBasePtr<2>();
auto modelMatrixBase = instanceData.elementsBasePtr<3>();

// ... read/write data to offsets from base pointers
// somewhere down the line:

The elementsBasePtr<N>() method returns a typed pointer to the Nth array in the buffer. This method is also trivial enough that you don't need to cache the start pointers per se, potentially simplifying usage code.


MAB being a variadic template means having to deal with the somewhat awkward recursion of types using templated helpers1. I will highlight a few things in this article but read the full implementation if you're interested.

The first thing I had to calculate was the total byte size of each parameterised type to mirror the sequence of sizeof(T)s in the manual version. Here's how I did it2:

template <typename T>
constexpr size_t elementSumSize(T* = nullptr) {
    return sizeof(T);
}

template <typename T, typename... Ts>
constexpr size_t elementSumSize(T* = nullptr, Ts*... ts = nullptr) {
    return sizeof(T) + elementSumSize<Ts...>(std::forward<Ts*>(ts)...);
}

// ... in the class definition:

template <typename... Ts>
class MAB {
    static constexpr size_t sumSize = elementSumSize<Ts...>();
    // ...
};

Quite straightforward, with the notable exception of the annoying (defaulted) parameters I had to pass and also forward to make things compile. Omitting the parameters is not possible using this method as the two function overloads become ambiguous; I also tried this with structs but hit the same limitation. I use functions because they are less verbose than structs.

The main challenge of this type came when I needed to implement resizing the buffer. To do so, I have to iterate over each base pointer and copy data to a new buffer; this combines the compile-time templated functions with run-time actions in a not entirely intuitive way. Here is the code I used to handle it:

// -- Call a callback with base pointer and element size info for each T

using ItFn = std::function<void(void*, size_t)>;

template <typename T>
void eachArrayBasePtr(void* basePtr, size_t /*capacity*/, const ItFn& fn, T* = nullptr) {
    fn(basePtr, sizeof(T));
}

template <typename T, typename... Ts>
void eachArrayBasePtr(void* basePtr, size_t capacity, const ItFn& fn, T* = nullptr, Ts*... ts = nullptr) {
    fn(basePtr, sizeof(T));

    auto bytePtr = static_cast<uint8_t*>(basePtr);
    eachArrayBasePtr<Ts...>(bytePtr + (capacity * sizeof(T)), capacity, fn, std::forward<Ts*>(ts)...);
}

// ... at the usage site:

newData = /* newly allocated buffer */;
newCapacity = /* number of items in the new arrays */;
data_ = /* current buffer */;
capacity_ = /* current number of items per array */;
count_ = /* count of used elements in the arrays */;

eachArrayBasePtr<Ts...>(data_, capacity_,
    [newDataPtr = (uint8_t*)newData, newCapacity, usedCount = count_]
    (void* basePtr, size_t elementSizeBytes) mutable {
        memcpy(newDataPtr, basePtr, usedCount * elementSizeBytes);
        newDataPtr += elementSizeBytes * newCapacity;
    });

The eachArrayBasePtr<Ts...>() helper essentially recurses over the Ts and bumps the base pointer to the next array using sizeof(T) and the capacity that the caller has to supply. At each step, the callback function is called with the current array base pointer and the size of the currently pointed-at element. Concrete types are eschewed in this setup as they were not necessary and would complicate things quite a bit. What that does mean is that MultiArrayBuffer can only be safely used for trivially copyable types. In fact, ideally you would use (almost) completely trivial types.

Other containers in my library properly accommodate non-trivial types, but given the intended usage of this container, it is reasonable not to support them. If you have a large array of tens or hundreds of thousands of elements, it is simply not wise to use types with user-specified destructors. POD types are the way to go for this; put the logic in the manager class maintaining the instance data instead. It also follows the philosophical idea of SoA: these are trivial types that are initialized and processed en masse by other functions. Individual constructors and destructors at scale will annihilate the efficiency gains of this data layout.

If you have comments or see horrible mistakes or misconceptions, drop me a line on Twitter.

  1. Haters, I understand why you don't like this, I really do. I'm just not that bothered by it. Each tool has its use.
  2. In the real type, I use 32-bit sizes and some other helper functions, but the essence is the same.


August 9, 2014

For several weeks now I've been working on a 3D game library called stardazed.

“Yet another 3D game engine? Just use Unity or subscribe to the newly open-sourced Unreal Engine 4! Dozens of programmers working on it. You can't keep up with that. Etc. etc.” nobody in particular said, but they would if I communicated with other humans. The bastards.

No, the why of this project is not an easy way to make Yet Another Angry Yet Flappy Birds clone-game (but now in 3D!), but rather to have fun. As I mention in the repository readme, this is my raison d'être:

Learning new skills by implementing concepts

I've been programming for about 30 years now — essentially my entire life — but there is always something new or unfamiliar. In this case:

Modern 3D engine math and physics

Current high-end game engines not only wrangle tons of vertexes, normals, textures and quaternions, but also employ lighting techniques that approach full ray-tracing complexity with all the lighting fun that accompanies it.

New GPU access APIs

New, “thinner” APIs are becoming the norm, yielding more control and thus more work to the engine implementers and opening up more of the full rendering and compute power of modern GPUs.

Complex physics

If you've played 3D games in the last couple of years, chances are you've seen a lot of boxes. Boxes being stacked, boxes falling off other boxes, boxes serving as make-shift platforms, boxes being shot and shifting position slightly, you name it. Game engines these days have highly sophisticated physics engines that enable boxes to interact in the most realistic way possible. Boxes1, the building blocks of life.

If I sound a bit hand-wavy, that would be exactly right! I'm learning all this stuff as I go, that's the whole point, and again, the fun of this project.

And then of course there is the library itself, which must provide access to all these features, and quite a few more, to the app. Building the library, with many re-revisions along the way, is just plain fun!

Keeping it real

Now, I indeed cannot keep up with a full team of programmers who have been writing 3D engines since 1996, so I'm keeping things as exclusive as possible:

Basically, I only make a part of the library if it tickles my fancy and I focus on current-day techniques and hardware.

Fly away with me

I really, honestly, fo-realz this time, plan to keep this blog updated as I fall down this rabbit hole. If you are also busy in this area or if you are an expert and read me spout total nonsense, please please please send me a tweet and we can all come out of this as better people.

I'm at GDC Europe this year and hope to meet some new people there too. And if it's the future and I've already met you there: Hi, welcome to my blog, stay in touch!

  1. and crates.

Memoize This

June 4, 2014

I was watching the WWDC 2014 videos and in the Advanced Swift session they illustrated its generic programming power by implementing a memoize function that can take any (A) -> B function and then memoize it, even if that function is recursive.

I got inspired to quickly try my hand at converting their memoize function to C++14 to see how close I could get. The Swift version is quite elegant in that the syntax of creating a memoized function is very cruft-free as they say.

In their example they use factorial but I'll use the fibonacci function here instead. First the Swift version:

func memoize<T: Hashable, U>(body: ((T)->U, T) -> U) -> (T)->U {
    var memo = Dictionary<T, U>()
    var result: ((T)->U)!
    result = { x in
        if let q = memo[x] { return q }
        let r = body(result, x)
        memo[x] = r
        return r
    }
    return result
}

let fib = memoize {
    fib, n in
    return n < 2 ? Double(n) : fib(n - 2) + fib(n - 1)
}

The memoize function is taken straight from the presentation. The code is by Dave Abrahams; watch the whole video, it's excellent1.

The only weird bit here is that he had to declare the result variable before he could assign the method to it, otherwise it's pretty clean. I like the if let construct to test and unwrap an optional value a lot, it's a concise and safe idiom.

As you can see, the fib closure can reference itself in its own parameters to allow the recursive calls to also use the memoized version of the function instead of the original one. Additionally, you do not have to explicitly specify the types used by the function anywhere. On the user end it's quite clean.

Without further ado, here's my current go at the C++14 version:

template <typename T, typename U>
auto memoize(std::function<U(T)>&& f) {
    return [memo = std::unordered_map<T, U>(), f](const T& t) mutable {
        auto have = memo.find(t);
        if (have != memo.end())
            return memo[t];
        U val = f(t);
        memo[t] = val;
        return val;
    };
}

std::function<double(int)> fib = memoize<int, double>([&fib](int n) {
    return n < 2 ? double(n) : fib(n - 2) + fib(n - 1);
});

Implementation Notes

The Swift version takes the closure itself as an argument whereas in C++14 the lambda is captured as a std::function to refer to itself. I've read some articles about lambdas referring to themselves and it's a major pain so I decided not to spend a lot of time "fixing" this. std::function is sometimes avoided because of the feared extra overhead over a lambda, but the compiler (clang 3.4) completely eliminates this calling overhead in optimized builds in the tests I've done so far.

Consequently, I chose to make the parameter to memoize a std::function<U(T)> instead of having another type parameter F and taking an F&&. The optimizer completely eliminates any overhead, and until we have concepts this allows the compiler to check the parameter and give better error messages; it is also clearer in general.

The lambda is mutable because the memo unordered_map can't be const, of course. I also make use of the extended capture syntax for lambdas to directly initialize the memo member in the lambda.

The fib result has to be typed explicitly because I'm capturing it in its own definition. The memoize function returns the generated function as a lambda but it will be captured here in a std::function.

C++ can't automatically deduce T and U from the invocation as those two types are only exposed in memoize inside the std::function<U(T)> parameter. Perhaps it is possible to extract these in some clever way but I haven't tried that hard yet.

The end result is that the syntax for the C++14 version of the fib definition suffers from not only having to declare the in and out type explicitly, but also from having to do so twice, but…


In the presentation he also uses fibonacci as an example of a function to memoize. The constant φ (the Golden Ratio) is calculated as follows:

let φ = fib(45) / fib(44)

The running time to calculate φ is mentioned to be 11 seconds for the naive recursive version and 0.1 seconds for the memoized version. That by itself is pretty good, a ~100x speedup. I timed my C++14 version with the following function:

template <typename F>
void timeTest(F&& f) {
    namespace k = std::chrono;
    auto t0 = k::high_resolution_clock::now();
    f();
    auto t1 = k::high_resolution_clock::now();
    std::cout << "time: " << k::duration_cast<k::microseconds>(t1 - t0).count() << "µs\n";
}

I then did a quick run of both methods and took an average of a couple of runs. Like so:

double fibslow(int n) {
    return n < 2 ? double(n) : fibslow(n - 2) + fibslow(n - 1);
}


timeTest([] {
    std::cout << "φ = " << fibslow(45) / fibslow(44) << "\n";
});

And, using the fib lambda above:

timeTest([&fib] {
    std::cout << "φ = " << fib(45) / fib(44) << "\n";
});

Note that these time tests were far from exhaustive or scientific, but given the low precision of the times reported in the slide, this would suffice for now. All tests were run with -O3 -fno-exceptions settings.

My home machine took about 13 seconds to run the non-memoized test, but the memoized version completed in ~55µs — and that's micro, not milliseconds — or a ~235,000x speedup. Now, when I see low timings like this I get suspicious that the compiler just eliminated the whole function but using the excellent Hopper Disassembler I confirmed that it was really still running. It does make sense as the memoized version really is not a lot more than a couple of dictionary inserts and lookups.

So, though more verbose, the C++14 memoized version is in the order of 1500-2000x faster than Swift's version. FACT2

Also, more hilariously, when I tried the Swift version in an Xcode 6 beta playground, just the presence of the let fib = ... statement made it very unhappy. Beta IDE is beta.

Adding Restraints

I think with more finagling, and probably by uglifying the memoize function, the call site code can be simplified, but I'm actually quite content with this version. With C++11 and now 14, the language has gotten more concise. When the Concepts TS is ready, we can also restrict the T typename like so:

template <Hashable T, typename U>
auto memoize(std::function<U(T)>&& f);

Or something equivalent, I'm assuming Hashable will be a standard type restraint.

I'm also wondering if a concept like Callable could exist such that the argument type and return type are retrievable from the concept itself so that the explicit passing of the type parameters becomes unnecessary. Right now, I lack the knowledge to really investigate this.


I uploaded a simple source file as a gist, so if I've piqued your interest I encourage you to play around with it.

  1. In fact, watch all 3 Swift videos to get a good feel for the whole language from the basics to generic functional features.
  2. Not actually a fact. The level of rigorousness and overall scienceworthyness of this article is low to very low. But no one reads footnotes anyway. Yay C++!!


July 5, 2013

Through The Loop, I found an article on simplicity and getting rid of a lot of physical media and clutter, aka crap, from your house that may have once served a purpose but no longer does.

In a fit of madness I glanced at the comments and they started off with someone blaming Oprah for brainwashing the masses to get rid of their things… Anyway, it's always fun to hear the putdowns when you decide to get rid of your crap. Words like "Spartan", "cold", "empty" and such are thrown around, the whole notion is dismissed and done!

Thing is, what you throw out is the crap in your life.
Piles of papers that have long since lost their usefulness, obsolete containers for media of all kinds, random crap you just have lying around because you don't want to think about what to do with them. If you love something dearly then it's not crap so you keep it, not that hard.

When you take the time and actually clean stuff up by eliminating the things you don't need you have a choice: leave the place emptier or put other things in there that do have value and now actually can fit where you want to put them.

Getting rid of Stuff1 is not about creating a zen temple in your house, it is about moving forward. How you move forward is up to you, but don't let things that are there just because they're there slowly take over your life. Consider, if you're afraid of even thinking about cleaning up, what a vice-grip your crap has on your mind.

Exactly how much Stuff you want in your life is entirely up to you, but everyone has things they don't need anymore. Take a weekend and toss 'em away and put some flowers in the place they took up. Enjoy.

  1. Be sure to watch George Carlin's take on Stuff


May 20, 2013

At the local casino the roulette tables have a big sign next to them showing the last 20 winning numbers, colour-coded black and red. It entices people to join, because with all that knowledge of past wins, your odds must be so much better! "8 black numbers in a row…, why not try your hand on red?"

In reality of course, the effect of those numbers on your chance to win is negligible, but I am not xkcd and will not try to prove that here. One thing is certain though: if that sign gave people any significant advantage, it would not be there. The casino is not your friend.

Tricks like these play on people's notion of random numbers. True random is not what most people consider random at all. If I were to say I'd pick a random number between 1 and 1000000 inclusive, most people will give me a funny look when I pick 1 and not something like 838417.
My “flawed” method of picking the number is the point: people expect “random numbers” to conform to how a human mind would come up with an arbitrary number, which must be “interesting” and have multiple digits, preferably all unique.

1 -> not random
123321 -> kinda random
486137 -> random
3871264059 -> really random
1234567890 -> not random

In my first real job I worked at a company making made-to-order screen savers for Windows. Many of these were made up of a series of animated sequences, played continuously in random order. It wasn't long until the comments came in that the scene selection wasn't really random. A scene had played twice in a row; clearly something was wrong.

After some sniggering about this honest, but just plain wrong statement, we added the following function1:

int customerRandom(int min, int max) {
    static int lastVal = 0;
    int val = lastVal;

    while (val == lastVal)
        val = rangeRandom(min, max);

    lastVal = val;
    return val;
}

It was a minimal solution to the problem — a scene could still occur twice in a three-scene time span — but customers all agreed that this was random enough.

A better solution would be a series of permutations of the possible items. This just so happens to be the algorithm newer Tetris games use, as mandated by The Tetris Company a.k.a. Henk Rogers.

The original Tetris games used simple pseudo-RNG2s that did little, if anything, to make things easier for the player. In Ecstasy of Order they call a long period of time without a long bar a “drought”, which is often fatal for less experienced players3. The simple RNG is the cause; droughts are inevitable.

So Henk's team sat down one day and came up with a better way to dispense Tetris blocks, which they creatively titled the “Random Generator”. It conceptually works like this:

A. There are 7 distinct pieces in Tetris, number them 1 through 7 and put one of each in a metaphorical bag.

1 2 3 4 5 6 7

B. One by one, pick a random number out of the bag and add it to a list that we'll call the permuted list.

1 2 3 4 5 6 7 -> 1 2 3 4 6 7 -> 1 3 4 6 7 -> 1 4 6 7 -> 1 6 7 -> 1 7 -> 7 ->

Permuted list:
-> 5 -> 5 2 -> 5 2 3 -> 5 2 3 4 -> 5 2 3 4 6 -> 5 2 3 4 6 1 -> 5 2 3 4 6 1 7

In the diagram, the arrows -> indicate a number being picked, taken out of the bag and being appended to the permuted list4.

C. Whenever the game needs a new piece, pop the first number off the permuted list and return that.

5 2 3 4 6 1 7 -> 2 3 4 6 1 7 -> 3 4 6 1 7 -> 4 6 1 7 -> 6 1 7 -> 1 7 -> 7 ->

D. If the permuted list is empty, go back to A and the sequence starts anew with another permutation of the 7 pieces.

This algorithm is simple but it guarantees equal availability of all the pieces and a maximum drought of 12 pieces. This change in how Tetris worked was of course criticised as making the game easier for n00bs.

In early builds of my Tetris clone Quadrae I used a pure RNG for piece selection and I sometimes felt it was just unfair. “4 squares in a row? You've got to be kidding me!” What was wrong with this thing?

I quickly added the Henk-approved piece selection method and the game immediately felt more balanced and dare I say, reasonable. Though the pure random piece selection certainly elicited more extreme reactions from me, this was just plain better. Is it easier? Yes, but Tetris has enough stress-inducing qualities already, getting a few guarantees makes it less extreme, not any less of a game.

Numerical sequences in games need to be not overly obvious but also not too random. Obvious repetition or "duplicates" will distract the player and can make things boring. When the player's life directly depends on the next random number though, you should add grouping or result bias to your RNG so that players will not rage-quit and trash your app.

Leave numerical honesty to Hard Mode.

  1. The real version was in Pascal because we used Delphi, but bollocks to Pascal.
  2. RNG is Random Number Generator. It's pseudo because pure software RNGs are incapable of generating true random number sequences.
  3. The Tetris championships still use the NES version of Tetris, which has a simple RNG and therefore, droughts.
  4. When looking at the permuted list, did/do you feel a sense that I could have come up with a more random permutation? I rest my case.

Writing a Modern C++ JSON Reader (part 1)

July 15, 2013

(This is part 1 of N in a series about writing a modern C++ JSON engine, the next post will be available soon.)

Does the world need another JSON parser? Perhaps not, but I made one anyway. In this series of posts I'll summarize how I did it and what I've learned.

The nice thing about a JSON engine, especially one written in a non-scripting language, is that it presents a couple of nice programming problems on a small enough scale that you can experiment with the implementation quickly:

For my implementation, named krystal, I chose to focus on the points below. I will elaborate on each in the coming posts.


I was looking at writing a simple static blog generator and first made a version that is serverless, but which generates the site largely on the client in JavaScript. This is the site you're looking at now.

Moving things to a C++ generator would require me to work with JSON data that stores e.g. site generation settings and post metadata. I needed a C++ JSON DOM parser. It wasn't long until I found the json_benchmark project that compared a couple of the popular JSON engines on parsing speed.

rapidjson is by far the fastest of them all and to get that speed it pulls no punches: it creates objects in place inside the JSON text, all objects are custom and use a memory pool allocator, it has SSE3- and SSE4-optimized white-space skipping functions, etc.

Now, I like performance; I wrote a NES emulator in PowerPC assembly. I am down with low-level stuff. Right now though, I'm trying to learn me some C++.

I came back to C++ recently and for now I like to keep things as "pure" as possible so as not to let old coding styles and habits take hold in my new projects. There's a lot of "C+" code out there, C and C++ concepts mixed together with a ton of MACROS thrown in for good measure. That's not for me, so my goal became clear:

To make a JSON parser that gets as close as possible to rapidjson speed-wise while only using standard library storage and string classes. I required C++11 to use things like std::unordered_map, lambdas, move semantics and other new goodies.

The Result

I will start at the end and show how you might use krystal. This will set the stage and hopefully make things clearer later on.

auto file = std::ifstream{"game_data.json"};
auto doc = krystal::parse_stream(file);

if (! doc.is_null()) {
    auto delay = doc["levels"][0]["zombie spawn delay"].number();
    auto name = doc["levels"][0]["level name"].string();

    for (auto kv : doc["levels"]) {
        auto level_index = kv.first.number_as<int>();
        auto& level_data = kv.second;
        // ...
    }
}

parse_stream returns a document, which is the root of the tree of values of the JSON data. Each value is a variant type. krystal distinguishes 7 kinds of values: null, false, true, number, string, array and object. The first 5 are pure values and the last 2 are containers. Arrays map ints to values; objects map strings to values.

krystal allows direct indexing into arrays and objects. It also allows iteration over the contents of arrays and objects. Because krystal exposes begin and end methods for values, you can use C++11 ranged for loops. For consistency, each iteration returns a std::pair<value,value&>. The first value is the key (a number or a string) and the second is a reference to the actual value.

Each value has value kind1 tests and data extraction methods. Calling the wrong data method will throw an exception.

// variant kind tests
value_kind type() const;
bool is_a(const value_kind type) const;
bool is_null() const;
bool is_false() const;
bool is_true() const;
bool is_bool() const;
bool is_number() const;
bool is_string() const;
bool is_array() const;
bool is_object() const;
bool is_container() const { return is_object() || is_array(); }

// data extractors
bool boolean();
double number();
template <typename T> T number_as() const;
std::string string();

Finally, there are methods that deal with containers. Again, calling container or array/object specific methods on values of a different kind will yield little but exceptions.

size_t size() const; // number of items in container
bool contains(const std::string& key) const; // object key check

const value& operator[](const std::string& key) const; // objects
const value& operator[](const size_t index) const; // arrays
iterator begin() const; // all iterators are const_iterators
iterator end() const;

By only having a std::string key param and no const char* alternative I avoid the gotcha present in rapidjson where the zeroth array item has to be indexed like val[size_t{0}] to avoid ambiguities with NULL. Type Win!

And on that note I will end this first part of the article. Next time I will cover the how and why of the evolution of the class design. In the meantime, please have a look at the project on GitHub and, if you have Clang 3.2 or 3.3, try it out for yourself.

  1. I first named it value_type but changed this after I moved to templated code for reasons you may understand.