bitcask and innostore overheads

Jeremy Hinegardner jeremy at
Tue Jul 6 15:36:58 EDT 2010

Hi all,

I am doing some sizing estimates for a possible transition of our document
store to riak.  I've mentioned it before on this list and in #riak, and
this is a snippet of a conversation I had with @seancribbs:

I have also reviewed the bitcask-intro.pdf and

Quick and dirty info: I am looking to store billions of documents, starting
with 2 billion and growing linearly by around 10 million per day.

The key is a 64-bit number as a string (generally about 20 bytes) and the value
is a text/xml document with an average size of 1.5 KiB.  The size distribution
has a long tail out to maybe 5 MiB.
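
For a rough sense of the raw data volume these figures imply, here is a
back-of-the-envelope sketch in Python.  It assumes every document is exactly
the average size, and it ignores replication and any storage-engine overhead,
so treat the results as a lower bound rather than a real sizing.  The function
and constant names are just illustrative.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

INITIAL_DOCS = 2_000_000_000   # starting corpus
DOCS_PER_DAY = 10_000_000      # linear growth per day
AVG_VALUE_BYTES = 1.5 * KIB    # average document size (assumed uniform here)
KEY_BYTES = 20                 # 64-bit number rendered as a string


def raw_bytes(docs):
    """Raw key + value bytes for `docs` documents, with no overhead or replication."""
    return docs * (KEY_BYTES + AVG_VALUE_BYTES)


initial_tib = raw_bytes(INITIAL_DOCS) / TIB
daily_gib = raw_bytes(DOCS_PER_DAY) / GIB

print(f"initial corpus: {initial_tib:.2f} TiB")
print(f"daily growth:   {daily_gib:.2f} GiB/day")
```

That works out to a bit under 3 TiB to start and roughly 14.5 GiB of new raw
data per day, before any copies are made.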

Our system is write once.  A key/value pair should never be overwritten once it
is initially inserted.  It is accessed fairly often for about a day, after which
access drops off in a long tail.  The pair must be available for retrieval at
any time.

== Bitcask ==

I went into the source of bitcask to confirm the 32 bytes per key minimum
memory requirements mentioned in and turned
If my calculations are correct, the actual memory overhead per key using
bitcask is 72+N bytes on a 64-bit system:

    UT_hash_handle ->  50 bytes, (6 pointers and 2 chars)
    file_id        ->   4 bytes,
    total_sz       ->   4 bytes,
    offset         ->   8 bytes,
    tstamp         ->   4 bytes,
    key_sz         ->   2 bytes,
    key            ->   N bytes - how big is this?  is this the riak key, 
                                  or a hash of the riak key?

This adds up to 72 bytes + the size of the key, per key/value in bitcask.

If I assume that the key is 20 bytes, then we are talking 92 bytes of memory
overhead per document.  That means I can store ~11.7 million documents per GiB
of free memory (1024^3 / 92).  Or, if I have 32 GiB of free RAM on a machine
to dedicate to riak w/bitcask (the rest would be used for disk cache), I
can store ~373 million documents.
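
To make the arithmetic easy to check, here is the same estimate as a small
Python sketch.  The field sizes are copied straight from my list above and are
my own reading of the source, not confirmed numbers.  The sketch also assumes
no struct padding or allocator overhead, which may not hold in practice.  The
function names are made up for illustration.

```python
GIB = 1024 ** 3

# Per-key field sizes in bytes, as listed in the estimate above (unconfirmed).
FIELD_BYTES = {
    "UT_hash_handle": 50,
    "file_id": 4,
    "total_sz": 4,
    "offset": 8,
    "tstamp": 4,
    "key_sz": 2,
}


def overhead_per_key(key_len):
    """Fixed per-entry bytes plus the key itself (assumes no padding)."""
    return sum(FIELD_BYTES.values()) + key_len


def keys_that_fit(ram_bytes, key_len):
    """How many keys fit in `ram_bytes` of memory under this estimate."""
    return ram_bytes // overhead_per_key(key_len)


def gib_needed(num_keys, key_len):
    """Memory in GiB needed to hold `num_keys` keys under this estimate."""
    return num_keys * overhead_per_key(key_len) / GIB


per_key = overhead_per_key(20)
per_gib = keys_that_fit(GIB, 20)
per_32_gib = keys_that_fit(32 * GIB, 20)
initial_gib = gib_needed(2_000_000_000, 20)

print(f"{per_key} bytes per key")
print(f"{per_gib:,} keys per GiB")
print(f"{per_32_gib:,} keys in 32 GiB")
print(f"{initial_gib:.1f} GiB for the initial 2 billion keys")
```

If the 92 bytes per key holds, the initial 2 billion keys alone would need
roughly 171 GiB of memory across the cluster, before counting replicas.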

Are my calculations correct?

It also does not look like bitcask pads values on disk, so there is no wasted
disk space.  Is this correct?

== Innostore ==

For Innostore I'm less worried about memory overhead than about insertion
overhead and wasted disk space.

Since InnoDB stores data in key order and our keys are essentially random 64-bit
numbers as strings, are we going to have a significant overhead in our

Using innostore, will there be any key/value padding on the data which
would cause an overhead per row of disk usage?

Also, we currently compress the data on disk, and I would be interested in
hearing how the compression of disk pages with innostore works.



 Jeremy Hinegardner                              jeremy at 
