Store whole database in memory
nico.meyer at adition.com
Sun May 29 05:20:31 EDT 2011
Greg's advice is probably the best, if you really always want to read
back or update predefined groups of 1000 keys at once. It will increase
the rate at which you can write and read by a factor of 1000 ;-).
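To make the batching idea concrete, here is a minimal sketch of packing one bulk insert into a single opaque value. The helper names (pack_batch, unpack_batch) and the zero-padded sample keys are made up for illustration. Only the 12-byte key size and the group size of 1000 come from this thread, and nothing here talks to Riak.

```python
KEY_SIZE = 12      # bytes per key (from the thread)
BATCH_SIZE = 1000  # keys per bulk insert (from the thread)


def pack_batch(keys):
    """Concatenate fixed-size keys into one opaque blob (hypothetical helper)."""
    if any(len(k) != KEY_SIZE for k in keys):
        raise ValueError("every key must be exactly %d bytes" % KEY_SIZE)
    return b"".join(keys)


def unpack_batch(blob):
    """Split a blob back into its fixed-size keys (hypothetical helper)."""
    if len(blob) % KEY_SIZE != 0:
        raise ValueError("blob length is not a multiple of the key size")
    return [blob[i:i + KEY_SIZE] for i in range(0, len(blob), KEY_SIZE)]


# Sample data: 1000 distinct 12-byte keys, invented for this example.
keys = [str(i).zfill(KEY_SIZE).encode("ascii") for i in range(BATCH_SIZE)]
blob = pack_batch(keys)
print(len(blob))  # size of the single value that would be stored under one key
```

The whole group then costs one write and one read, but an individual key can no longer be fetched or updated on its own.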
But if that's not what you want to do, and we really don't know what
your design goals are, I honestly think you are trying to put a screw in
with a hammer here. Maybe you should look for alternatives to Riak,
since you exploit all its weaknesses and don't care about most of its
strengths.
Namely, storing very small values is a weak spot, as Greg mentioned.
There is an overhead of at least around 400 bytes per entry at the
moment. Even if there are plans to reduce this overhead, I would
estimate it will never get below around 100 bytes if the thing is still
Riak afterwards. Also this overhead exists for all storage backends. So
with the ets backend you will only be able to store about 2-3 million
entries per GB of RAM right now.
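The arithmetic behind that estimate, as a small sketch. The 400 and 100 byte overheads are the rough figures from this mail, not measured values, and the function name is made up for the example. I also assume 1 GB = 1024^3 bytes and that all of it is available for data.

```python
VALUE_BYTES = 12           # size of one stored value (from the thread)
OVERHEAD_NOW = 400         # rough per-entry overhead today (estimate from this mail)
OVERHEAD_BEST_CASE = 100   # guessed lower bound after optimisation (from this mail)
GIB = 1024 ** 3            # assumption: 1 GB of RAM means 1024^3 bytes


def entries_per_gb(value_bytes, overhead_bytes, ram_bytes=GIB):
    """Number of entries that fit if each one costs value + overhead bytes."""
    return ram_bytes // (value_bytes + overhead_bytes)


now = entries_per_gb(VALUE_BYTES, OVERHEAD_NOW)
best_case = entries_per_gb(VALUE_BYTES, OVERHEAD_BEST_CASE)
print("entries per GB now:       %d" % now)
print("entries per GB best case: %d" % best_case)
```

With these inputs the first number lands between 2 and 3 million, and even the optimistic overhead stays below 10 million entries per GB.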
Which brings me to the part where you don't care/use most of Riak's
strengths. You don't seem to care about persistence of data, otherwise
you wouldn't use a memory only backend. (Btw, as Mike pointed out, with
enough RAM bitcask is essentially a memory store, especially where the
write performance is concerned.)
You also don't care about eventual consistency, evidenced by the fact
that you do bulk inserts (only?), and that 12 bytes wouldn't allow for
enough information to resolve conflicts. So you probably want a last
write wins behaviour (which can be set as a bucket property in Riak, but
kind of defeats the purpose in my opinion).
But let's assume for a moment that Riak was the right tool for your job.
The limiting factor for writing your data is almost certainly not the
disk. Writing 100,000 keys with a size of 12 bytes requires only about
1MB/s, so even the crappiest disk should have no problem with that. But
as I said there is quite a large overhead for storing values in Riak, so
in reality the required rate will be 50MB/s per node (3 nodes, n=3
presumably). Still not a big deal, and this only is a limiting factor
once the filesystem cache uses all available RAM.
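A quick back-of-the-envelope version of the disk numbers, as a sketch. It assumes that the roughly 400 bytes of per-entry overhead are actually written to disk along with the value, and that with 3 nodes and n=3 every node receives a copy of every key. Both are simplifications, and the function name is made up.

```python
KEYS_PER_SECOND = 100_000  # target write rate (from the thread)
VALUE_BYTES = 12           # size of one value (from the thread)
OVERHEAD_BYTES = 400       # rough per-entry overhead (estimate from this mail)
N_VAL = 3                  # replicas per key (presumed in this mail)
NODES = 3                  # cluster size (presumed in this mail)


def per_node_write_rate(keys_per_s, value_bytes, overhead_bytes, n_val, nodes):
    """Bytes per second each node has to write, with replicas spread evenly."""
    bytes_per_entry = value_bytes + overhead_bytes
    return keys_per_s * bytes_per_entry * n_val / nodes


raw = KEYS_PER_SECOND * VALUE_BYTES
with_overhead = per_node_write_rate(
    KEYS_PER_SECOND, VALUE_BYTES, OVERHEAD_BYTES, N_VAL, NODES
)
print("raw payload:    %.1f MB/s" % (raw / 1e6))
print("per node, real: %.1f MB/s" % (with_overhead / 1e6))
```

This gives roughly 1 MB/s of raw payload and roughly 40 MB/s per node with overhead, the same ballpark as the 50MB/s figure above.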
On the other hand, network latency is a problem at such high rates, even
in a LAN. As far as my experience and my short Google research tell me,
the lowest roundtrip time you can expect on standard Gigabit
ethernet is on the order of 0.1msec or 1/10000 second. For each
operation you need at least one roundtrip (one request packet, one
response packet), so that means with one connection you can never go
beyond 10,000 writes per second. This assumes no processing time
whatsoever, so a more realistic number is 2000-5000 ops/s. Therefore you
need at least 20-50 parallel connections or clients to achieve your
target write rate. If you use the REST API these numbers need to be
doubled, since one additional roundtrip is already needed to set up the
connection.
In general, without a lot of tuning and maybe specialized hardware
(multiple NICs or special low latency NICs), any server will have a hard
time handling 100,000 ops/s, regardless of the software that is used.
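The latency argument above boils down to a few divisions. Here is a minimal sketch, assuming strictly one outstanding request per connection (no pipelining) and the rough roundtrip and per-connection numbers quoted in this mail. The function names are made up for the example.

```python
import math

TARGET_OPS = 100_000        # desired writes per second (from the thread)
RTT_MICROSECONDS = 100      # ~0.1 msec LAN roundtrip (rough figure from this mail)
REALISTIC_OPS = (2000, 5000)  # realistic per-connection range (from this mail)


def max_ops_per_connection(rtt_us, roundtrips_per_op=1):
    """Upper bound on ops/s for one connection with zero processing time."""
    return 1_000_000 // (rtt_us * roundtrips_per_op)


def connections_needed(target_ops, ops_per_connection):
    """Parallel connections required to reach the target rate."""
    return math.ceil(target_ops / ops_per_connection)


ceiling = max_ops_per_connection(RTT_MICROSECONDS)
needed = [connections_needed(TARGET_OPS, ops) for ops in REALISTIC_OPS]
# Doubling for an extra connection-setup roundtrip per request, as argued above.
needed_rest = [2 * n for n in needed]

print("ceiling per connection: %d ops/s" % ceiling)
print("connections needed:     %d to %d" % (min(needed), max(needed)))
print("with doubled roundtrip: %d to %d" % (min(needed_rest), max(needed_rest)))
```

Reusing connections (keep-alive) would avoid the doubling, but the basic per-connection ceiling stays the same.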
On 28.05.2011 20:36, Greg Nelson wrote:
> Depending on the n_val you have set for that bucket, Riak will store the
> objects n times on n different nodes. There are two other parameters you
> should know about, r and w. When writing, Riak will wait for w of the n
> nodes to finish the write before returning. When reading, Riak will wait
> for r of the n nodes to respond before returning. This is the basics of
> how Riak does fault and partition tolerance, i.e. if one node is down
> your cluster still functions, and the r and w vals define a sort of
> "majority vote" threshold to handle a split-brain problem.
> Anyway, for your purposes you could set w=1 and r=3 for faster writes at
> the expense of potentially slower reads. I've never tried this (or any
> of the backends besides bitcask) so I don't know what you should expect.
> As for bulk insert and preserving locality, I don't know of a way to do
> that with Riak except to batch your 1000 keys into a single object,
> identified by one key. As far as Riak is concerned, it's just a 12KB
> opaque object, which your application would need to always write and
> read all at once.
> If you don't batch like that, you should look for a discussion on this
> mailing list from last week regarding capacity planning and very small
> objects. There's a bit of overhead associated with each object that will
> be significant for objects as small as 12 bytes. You could skip over the
> parts about Bitcask overhead...
> On Saturday, May 28, 2011 at 9:59 AM, Michael McClain wrote:
>> Thank you, Mike and Greg, for the response.
>> I've just replied to the list.
>> In my use case, I need to be able to write 100,000 keys per second.
>> Where the key is very small (12 bytes). And I always insert 1000 keys
>> at once, in a bulk insert. I would also like to preserve the locality
>> of the keys inserted at once (so that they stay always in the same
>> node). Do you know if that is possible?
>> Thank you
>> 2011/5/28 Mike Oxford <moxford at gmail.com>
>>> With enough RAM you could just have it keep the whole thing in
>>> On Fri, May 27, 2011 at 11:11 PM, Greg Nelson <grourk at dropcam.com> wrote:
>>>> You might want to check out riak_kv_ets_backend,
>>>> riak_kv_gb_trees_backend, and riak_kv_cache_backend.
>>>> On Friday, May 27, 2011 at 10:35 PM, Michael McClain wrote:
>>>>> Is it possible to store the whole database in memory?
>>>>> In a similar way as Redis does.
>>>>> I'm really interested in the distributed map reduce done by riak
>>>>> ("bring processing to the data, instead of data to processors), but
>>>>> I need faster writes/reads that a memory-only database could provide.
>>>>> In case you don't support memory-only storage (no disk touched /
>>>>> all keys and data fitting the memory in all nodes) yet, do you plan
>>>>> on implementing it?
>>>>> Thank you,
>>>>> riak-users mailing list
>>>>> riak-users at lists.basho.com