Down the NoSQL Rabbit Hole

Although I’m going back to work in a little over a week, I’ve been working on an iPhone game in the meantime. More on that some other time (soon!). This game has an online turn-based component, kind of like Scrabble (and Words With Friends), Carcassonne, Disc Drivin’ and other great iPhone games. So besides the iPhone front end, I’m writing a Django back end, since I’m already familiar with Django and it’s nice for getting things done cleanly and quickly.

Of course, the back end needs a back end too, so rather than just shoving everything in MySQL, I decided I’d try out newer, more scalable stuff just in case I happen to write the next Angry Birds. Am I likely to need more than a single-server MySQL instance? Nah. But it’s a good learning experience even if I never grow beyond one virtual machine serving everything. And if I do need to scale, I’ll be ready in a hurry.

At first, I was using Amazon’s SimpleDB and the boto package for Python. It was working great! Early on I’d made a brief attempt to verify that it could hold enough data per row to do what I was doing. But I failed to Google the right thing and so just put it off until later. I wrote everything such that I could replace it with a different storage engine pretty easily.

Then a few days ago I implemented chat. That worked great too! Until the first time I added a chat line that pushed the total chat for one game over 1K bytes. BOOM, that’s an end of the world event for SimpleDB. It turns out it has a limit of 1K bytes for any individual piece of data, and also a limit of 256 keys per row. I am also saving the complete game histories and moves for each player, three keys per turn, though it could be squished down to one key per turn as each turn is completed. I had a plan to work around the 256 keys limit, because I thought a limit like that might come up, but the 1K limit was harder. I could’ve still come up with something, either offloading the chat to Amazon S3 (that’s how they really want you to do things anyway), or breaking it up into chunks somehow, but none of that was very appealing.
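For illustration, the "breaking it up into chunks" workaround might look something like the sketch below. It is not code from the game: the function names and the `key.0`, `key.1`, ... naming scheme are made up here, and the 1024-byte limit is assumed to apply to the UTF-8 encoded value. Note that every chunk would also use up one of the 256 keys in the row.

```python
def split_into_chunks(text, limit=1024):
    """Split text into pieces whose UTF-8 encoding is at most `limit` bytes."""
    chunks, current, size = [], "", 0
    for ch in text:
        ch_size = len(ch.encode("utf-8"))
        # Start a new chunk if this character would push us over the limit.
        if current and size + ch_size > limit:
            chunks.append(current)
            current, size = "", 0
        current += ch
        size += ch_size
    if current:
        chunks.append(current)
    return chunks


def to_attributes(key, text, limit=1024):
    """Spread one logical value over numbered keys: key.0, key.1, ..."""
    return {"%s.%d" % (key, i): chunk
            for i, chunk in enumerate(split_into_chunks(text, limit))}


def from_attributes(key, attrs):
    """Reassemble the value by reading key.0, key.1, ... until one is missing."""
    parts, i = [], 0
    while "%s.%d" % (key, i) in attrs:
        parts.append(attrs["%s.%d" % (key, i)])
        i += 1
    return "".join(parts)
```

Every append would then have to read the last chunk, decide whether it still fits, and possibly add a new key, which is a fair amount of bookkeeping just to store a chat log.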

So I googled something like “distributed database software” and found Cassandra as the first hit. Oh yeah, Cassandra, I’d looked at it before but never used it for anything. It doesn’t have any meaningful limits on size for my purposes (there are limits, but they’re measured in billions). It has a Python binding. This will be fun! I downloaded it, pretty easily got it up and running, and started rewriting my storage class. At the point where I was saving games, I realized I had a race condition with both players trying to save at the same time. That would be quite common: the turns in this game are executed simultaneously, and if both players are online at the same time, they will try to submit a save within seconds of each other, depending on how I have the polling set. To be fair, I had the same race condition with SimpleDB, and was aware of it, but I hadn’t even gotten around to looking at how to fix it.
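To make that race concrete, here is a toy version of it with a plain dict standing in for the database. None of this is the game's real code; the `store` dict, the move strings and the function names are invented for the example. The point is only that two read-modify-write saves that overlap will silently drop one player's update.

```python
def save_move(store, seen, move):
    """Write back what this player read earlier, plus their new move."""
    store["moves"] = seen + move


def interleaved_saves():
    """Both players read before either one writes: one move is lost."""
    store = {"moves": ""}
    seen_by_a = store["moves"]   # player A reads the history
    seen_by_b = store["moves"]   # player B reads the same, stale history
    save_move(store, seen_by_a, "A1;")
    save_move(store, seen_by_b, "B1;")   # overwrites A's save
    return store["moves"]


def serialized_saves():
    """Each player reads and writes before the next one starts."""
    store = {"moves": ""}
    for move in ("A1;", "B1;"):
        seen = store["moves"]
        save_move(store, seen, move)
    return store["moves"]
```

Here `interleaved_saves()` ends up holding only player B's move, while `serialized_saves()` keeps both. A lock around the read and the write is one way to force the second ordering.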

Next, I Google “Cassandra row locking”. There isn’t any. Nice as it would be, it’s understandably beyond the Cassandra project’s scope. But I did find someone talking about using something called ZooKeeper to implement a distributed locking system on top of Cassandra. Look around, see there’s a Python binding for ZooKeeper, ok, I should be in business, let’s get it going.

A little while later, I’ve got a ZooKeeper instance up and running. Go back to that link, figure out he was actually talking about using something called Cages, which uses ZooKeeper itself, but isn’t part of ZooKeeper. Cages is a Java library. There is not a Python equivalent. The Python binding for ZooKeeper just wraps the C API, and while it’s possible to implement locking using ZooKeeper, it doesn’t provide the simple mutex/critical section type lock I’m looking for out of the box.

At this point I’m starting to question whether I really wanted to try out all this stuff or whether I should just throw in the towel and put everything in MySQL after all. But no, I’ve come this far, I’m not stopping now!

Poking around Cages a little bit, I find it’s quite large, very Java-y, and implements a whole bunch of stuff I don’t really need (at least not right now). After some more googling, I found some simpler ZooKeeper lock implementations, still all in Java, but managed to muddle through them enough to understand how it can be used to implement my simple little lock. This ClusterMutex was especially helpful.
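For anyone curious what such a lock boils down to, the usual ZooKeeper lock recipe goes roughly like this: each client creates an ephemeral, sequential node under the lock's path, the client whose node has the lowest sequence number holds the lock, and everyone else watches the node just ahead of theirs. The sketch below covers only that ordering logic in plain Python, with no ZooKeeper calls at all. The node names are hypothetical, and this is the generic recipe, not necessarily what ClusterMutex or zklock does line for line.

```python
def sequence_number(node_name):
    """Extract the counter ZooKeeper appends to a sequential node's name."""
    return int(node_name.rsplit("-", 1)[1])


def lock_holder(children):
    """The child with the lowest sequence number currently holds the lock."""
    return min(children, key=sequence_number)


def node_to_watch(children, my_node):
    """Return the node just ahead of mine, or None if I hold the lock.

    Watching only the predecessor means one release wakes one waiter,
    rather than every client hammering the server at once.
    """
    ordered = sorted(children, key=sequence_number)
    index = ordered.index(my_node)
    return None if index == 0 else ordered[index - 1]
```

Because the nodes are ephemeral, a client that disconnects loses its node, so the next waiter in line gets the lock instead of waiting forever.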

So finally, I wound up implementing my own ZooKeeper lock class for Python. Which also caused me to create a GitHub account to have somewhere public to put it, because if it’s useful to me then hopefully it will be useful to someone else too. And also a PyPI account so I could upload it as a Python pip (or easy_install) package. So now anyone can get the zklock package and start using it as easily as “pip install zklock”. (Well, sort of: it needs the zkpython package also, which in turn needs the ZooKeeper C library installed first. And of course, you need a ZooKeeper instance somewhere to use any of that.) Which is all great, I haven’t contributed any open source software anywhere for a long time. I’m happy to have an excuse to set up those accounts.

Now inserting chat messages is this simple:

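import pycassa
import zklock

def append_text(game_id, key, text):
    # Lock the game so both players can't read-modify-write at once.
    # game_fam (the column family) and item_name() (builds the row key)
    # are defined elsewhere. acquire()/release() are assumed here to be
    # the lock's method names.
    l = zklock.Lock(item_name(game_id))
    l.acquire()
    try:
        # See if the text key exists already
        cols = game_fam.get(item_name(game_id), [key])
        existing = cols[key]
    except pycassa.NotFoundException:
        # Nope, start with a blank one
        existing = ''
    # Create the entry
    new = {key: existing + text}
    game_fam.insert(item_name(game_id), new)
    # Release the game lock
    l.release()
    # Return the new column
    return new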
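One thing a function like this leaves open is what happens if the insert throws while the game is locked. A small context manager is one way to make sure the lock always gets released. This is only a sketch: it assumes the lock object exposes `acquire()` and `release()` methods, and `held` and `FakeLock` are names invented for the example (the fake lock is there so the snippet runs without ZooKeeper).

```python
from contextlib import contextmanager


@contextmanager
def held(lock):
    """Acquire the lock, run the with-block, and always release afterwards."""
    lock.acquire()
    try:
        yield lock
    finally:
        lock.release()


class FakeLock(object):
    """Stand-in lock that just records what was called on it."""

    def __init__(self):
        self.events = []

    def acquire(self):
        self.events.append("acquire")
        return True

    def release(self):
        self.events.append("release")


def demo():
    """Raise inside the locked section and show the lock is still released."""
    lock = FakeLock()
    try:
        with held(lock):
            raise RuntimeError("insert failed")
    except RuntimeError:
        pass
    return lock.events
```

With something like this, the body of the function shrinks to a `with held(...)` block around the get and the insert, and the release no longer depends on every code path remembering to call it.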

All of this is basically so I can have chat in my silly iPhone game. It’s completely overkill (though it would be nice if it turned out not to be!), and there are a thousand other ways I could’ve gone about implementing this, many of them much simpler, but I learned a bunch of new stuff along the way, and hopefully created something someone else will find useful. A worthwhile day and a half’s work for sure.

Thursday, September 8th, 2011 Uncategorized

1 Comment to Down the NoSQL Rabbit Hole

  • Juraj Sottnik says:

    Thank You! This is exactly what i need and it is working perfectly. I tried to do similar thing with python multiprocessing Manager used to share instances of Lock and listening on network interface. But i had problem with deadlock if client is disconnected before releasing lock. Since there are not simple way how to handle client disconnect in Manager i stopped and looked at zklock and learned new stuff.