bitcask and innostore overheads
colin.surprenant at gmail.com
Mon Oct 4 15:20:49 EDT 2010
I just had an OMG moment after reading Sean's comment on Innostore's
buffer pool invalidation when using random patterns for the key space.
I am at the point where the relatively poor write performance of my
Riak setup has started to bite me. I am using Innostore rather than
Bitcask because of my huge and growing key volume.
Now, this is not a rigorous benchmark but, in my staging environment,
which uses a single node "large" EC2 instance (4 EC2 compute units,
7.5GB ram) I was using MD5 style hash keys and my Riak insertion rate
was about 40-60 items per second (using 5 writer threads over the REST
api) and the load average on my system was getting very high, around
After seeing this comment, I changed my key format to use a simple
increasing integer and, bingo, my insertion rate increased
approximately tenfold, with a negligible impact on the system load.
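To make the two key formats concrete, here is a minimal sketch in Python. The function names, the `doc-N` identifiers, and the zero-padding width are hypothetical and chosen only for illustration; the zero-padding assumes keys are compared as byte strings, which is not confirmed anywhere in this thread.

```python
import hashlib
from itertools import count


def md5_key(doc_id: str) -> str:
    """Hash-style key: consecutive inserts land at unrelated points in the key space."""
    return hashlib.md5(doc_id.encode("utf-8")).hexdigest()


def sequential_keys(start: int = 0, width: int = 20):
    """Yield increasing integers as zero-padded strings.

    The padding (an assumption, not from the original post) keeps string
    order identical to numeric order.
    """
    for n in count(start):
        yield f"{n:0{width}d}"


# Hypothetical document identifiers, for illustration only.
doc_ids = [f"doc-{i}" for i in range(5)]

hashed = [md5_key(d) for d in doc_ids]
key_source = sequential_keys()
sequential = [next(key_source) for _ in doc_ids]

print("md5 keys in insertion order:")
for k in hashed:
    print("  ", k)

print("sequential keys in insertion order:")
for k in sequential:
    print("  ", k)

# Sequential keys arrive already sorted, so each insert goes to the end of the key range.
print("sequential keys already sorted:", sequential == sorted(sequential))
```

With sequential keys each new insert lands next to the previous one in key order, which matches Sean's remark that Inno works best when keys are inserted in sequential order.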
I think it would be worth pointing this out in the docs somewhere. This
very simple fact does have a *huge* impact on Innostore's performance.
On Tue, Jul 6, 2010 at 4:18 PM, Sean Cribbs <sean at basho.com> wrote:
> I'm glad to see you're still looking at Riak.
> Regarding your bitcask question, that does seem to be in the correct range of sizes. Dave (@dizzyco) tells me the actual figure is 24 bytes + the hashtable overhead.
> Inno does pad things to fixed-size pages, so yes, you could end up with wasted disk. However, I would suspect the greater concern would be excessive invalidation of the buffer pool from the essentially random/uniform shape of your key-space, making it difficult to get good throughput. Inno works best when keys are inserted in sequential order.
> Sean Cribbs <sean at basho.com>
> Developer Advocate
> Basho Technologies, Inc.
> On Jul 6, 2010, at 3:36 PM, Jeremy Hinegardner wrote:
>> Hi all,
>> I am doing some sizing estimates for a possible transition to riak of
>> our document store. I've mentioned it on this list before and
>> in #riak and this is a snippet of a conversation I had with @seancribbs:
>> I have also reviewed the bitcask-intro.pdf and http://gist.github.com/438065
>> Quick and dirty info, I am looking to store billions of documents, starting
>> with 2 billion initially, and a linear growth of around 10 million per day.
>> The key is a 64-bit number as a string (generally about 20 bytes) and the value is
>> a text/xml document of an average size of 1.5KiB. This size long tails out to
>> maybe 5 MiB.
>> Our system is write once. A key/value pair should never be overwritten once it
>> is initially inserted, and it is accessed fairly often for about a day, and then
>> a long tail drop off. The pair must be available for retrieval at any time.
>> == Bitcask ==
>> I went into the source of bitcask to confirm the 32 bytes per key minimum
>> memory requirements mentioned in http://gist.github.com/438065 and turned
>> If my calculations are correct, the actual memory overhead per key using
>> bitcask is 72+N bytes on a 64-bit system:
>> UT_hash_handle -> 50 bytes, (6 pointers and 2 chars)
>> file_id -> 4 bytes,
>> total_sz -> 4 bytes,
>> offset -> 8 bytes,
>> tstamp -> 4 bytes,
>> key_sz -> 2 bytes,
>> key -> N bytes - how big is this? is this the riak key,
>> or a hash of the riak key?
>> This adds up to 72 bytes + the size of the key, per key/value in bitcask.
>> If I assume that the key is 20 bytes, then we are talking 92 bytes of memory
>> overhead per document. That means I can store ~11 million documents per GiB of
>> free memory (1024^3 / 92), or, if I have 32GiB of free RAM on a machine
>> to dedicate to riak w/bitcask (the rest would be used for diskcache), I
>> can store ~373 million documents.
>> Are my calculations correct?
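The arithmetic in the estimate above can be reproduced with a short Python sketch. The per-field sizes are taken as stated in the quoted message, not verified against the bitcask source, and the function names are hypothetical. Note that Sean's reply gives a different per-key figure (24 bytes plus the hashtable overhead), so this only checks the sums, not the underlying sizes.

```python
# Per-key field sizes in bytes, exactly as listed in the quoted estimate.
FIELD_BYTES = {
    "UT_hash_handle": 50,  # 6 pointers and 2 chars on a 64-bit system
    "file_id": 4,
    "total_sz": 4,
    "offset": 8,
    "tstamp": 4,
    "key_sz": 2,
}

GIB = 1024 ** 3


def per_key_overhead(key_len: int) -> int:
    """Fixed per-key overhead plus the key itself (N bytes)."""
    return sum(FIELD_BYTES.values()) + key_len


def keys_per_ram(ram_gib: int, key_len: int) -> int:
    """Whole number of keys that fit in ram_gib GiB at the given key length."""
    return (ram_gib * GIB) // per_key_overhead(key_len)


print(sum(FIELD_BYTES.values()))   # fixed overhead: 72 bytes
print(per_key_overhead(20))        # with a 20-byte key: 92 bytes
print(keys_per_ram(1, 20))         # about 11.67 million keys per GiB
print(keys_per_ram(32, 20))        # about 373 million keys in 32 GiB
```

Swapping in a different per-key figure only requires changing `FIELD_BYTES`; the rest of the estimate follows from it.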
>> It also does not look like bitcask pads values on disk, so there is no wasted
>> disk space. Is this correct?
>> == Innostore ==
>> For Innostore I'm not so worried about the memory overhead as insertion overhead
>> and wasted disk space.
>> Since InnoDB stores data in key order and our keys are essentially random 64-bit
>> numbers as strings, are we going to have a significant overhead in our
>> Using innostore, will there be any key/value padding on the data which
>> would cause an overhead per row of disk usage?
>> Also, we currently compress the data on disk, and I would be interested in
>> hearing how the compression of disk pages with innostore works.
>> Jeremy Hinegardner jeremy at hinegardner.org
>> riak-users mailing list
>> riak-users at lists.basho.com