why innodb?

David Smith dizzyd at basho.com
Fri Feb 12 08:56:44 EST 2010

On Fri, Feb 12, 2010 at 5:12 AM, Richard Bucker <richard at bucker.net> wrote:

> With the many fully open and free databases out there I was wondering why
> the team selected innodb? If memory serves there are some tools, like
> hotbackup/restore, that are not free or included. Also, while there is an
> innodb/erlang library it would seem that there is an impedance mismatch
> there.

Hi Richard,

The major drivers for choosing Embedded Inno over other storage libraries
available right now were 1. predictability and 2. stability.

For Riak's purposes, we need something that is going to have predictable
latency under significant loads. After evaluating Tokyo Cabinet (TC),
BerkeleyDB-C (BDB-C) and Embedded Inno, it was quite clear that Inno won
this aspect hands down. TC was quite fast until the dataset got large, at
which point write latency went through the roof. BDB-C had an excellent
average latency, but the 95th+ percentile latencies were highly variable (I
saw 95th percentile times > 15 seconds); there were also some pretty icky
threading bugs in the MVCC subsystem of BDB-C when using a lot of
parallelism. Embedded Inno had none of these problems and performed better
than the other two in all the test scenarios that I was able to come up
with.
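
To illustrate why an excellent average can hide poor tail latency, here is
a minimal Python sketch that compares the mean with the 95th percentile. The
function names and the sample numbers are hypothetical and are not taken from
the benchmarks described above; the percentile uses the simple nearest-rank
definition, which is only one of several common conventions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the mean plus a few percentiles for a list of latencies."""
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }


if __name__ == "__main__":
    # Hypothetical latencies in milliseconds, made up for illustration.
    steady = [12.0] * 100                # every request takes the same time
    spiky = [2.0] * 90 + [100.0] * 10    # mostly fast, occasional long stalls

    for name, data in (("steady", steady), ("spiky", spiky)):
        print(name, summarize(data))
```

In this made-up data the "spiky" store has the slightly lower mean, yet one
request in ten stalls for far longer than anything the "steady" store ever
does. Only the percentile figures make that visible.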

In terms of stability, Embedded Inno did require a few patches (which can be
found on the Embedded Inno forums). However, with those patches in place,
we've not had any problems with long running deployments and significant
parallelism (i.e. 512+ threads). More importantly, when bugs have been found
in Inno, the team on the support forums is quite responsive in answering
questions. Contrast this with the BDB forums, where after we found a bug in
the MVCC subsystem it took a few months for anyone to get back to us on what
the problem _might_ be.

To be honest, I originally thought that BDB-C would be the ticket. It does
have a better set of support tools and a great reputation. But when matched
up against the requirements for a distributed k/v store and when it was
MEASURED, it fell short. Embedded Inno has won me over pretty completely
with fast performance (even for an unusual use case of storing only BLOBs),
solid behaviour and good support forums.

Hope that helps,
