sean at basho.com
Fri Oct 8 11:16:54 EDT 2010
Your assumption is correct - each node holds only the keys for which it stores replicas.
Keep in mind that memory is used for more than just keys; the estimate given in that capacity planning spreadsheet is a baseline. In general, you'll want a decent amount of memory above Riak's resident size for filesystem cache (which will boost performance for hot keys). For reference, we've found that a 5 node cluster with 256 partitions and 30 million keys stored (and N=3) will consume around 1.6GB per node, and then easily fill up the rest of RAM with filesystem cache when under load.
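As a rough back-of-the-envelope check on those figures, here is a minimal Python sketch. It assumes replicas are spread evenly across nodes, takes 1.6GB as 1.6e9 bytes, and attributes all resident memory to keys, so the per-replica number is an upper bound rather than Bitcask's actual per-key overhead. The function names are illustrative and not part of Riak.

```python
def replicas_per_node(total_keys, n_val, nodes):
    """Key replicas each node holds, assuming an even spread across the cluster."""
    return total_keys * n_val / nodes


def resident_bytes_per_replica(resident_bytes, total_keys, n_val, nodes):
    """Upper bound per replica: attributes ALL resident memory to keys (an assumption)."""
    return resident_bytes / replicas_per_node(total_keys, n_val, nodes)


# Figures from the example cluster: 5 nodes, 30 million keys, N=3,
# about 1.6GB resident per node (taken here as 1.6e9 bytes).
per_node = replicas_per_node(30_000_000, 3, 5)
per_replica = resident_bytes_per_replica(1.6e9, 30_000_000, 3, 5)
print(f"{per_node:,.0f} replicas per node, ~{per_replica:.0f} bytes each")
```

Under these assumptions each node holds 18 million replicas, which works out to roughly 89 bytes of resident memory per replica.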
With respect to your capacity planning, you'll generally get the best performance if the number of nodes exceeds the N value, since this decreases the percentage of the cluster involved in each request. For example, with N=3 at cluster size 3, every node will be involved in almost all requests. At 4 nodes, this decreases to roughly 75%. At 6, it becomes about 50%.
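The percentages above follow from dividing N by the cluster size. A minimal sketch, assuming an idealized cluster where each request touches exactly N distinct nodes (the function name is hypothetical, and real partition placement may deviate slightly):

```python
def fraction_of_cluster_per_request(nodes, n_val=3):
    """Share of nodes touched by one request in an idealized cluster.

    Assumes each request involves exactly n_val distinct nodes, capped at
    the cluster size when there are fewer nodes than replicas.
    """
    return min(n_val, nodes) / nodes


for nodes in (3, 4, 6):
    share = fraction_of_cluster_per_request(nodes)
    print(f"{nodes} nodes: {share:.0%} of the cluster per request")
```

For 3, 4, and 6 nodes this gives 100%, 75%, and 50%, matching the figures quoted above.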
Sean Cribbs <sean at basho.com>
Basho Technologies, Inc.
On Oct 8, 2010, at 10:11 AM, Tony Novak wrote:
> Quick question about Riak's memory footprint: I know that Bitcask
> requires all keys to fit in memory. Does this mean only the keys that
> reside on a given node, or does each node hold the keys of the entire
> system? I assume it's the former, but I just want to be sure.
> Basically what I'm trying to figure out is whether we should expect
> better performance from a cluster of N nodes with 2*M GB of memory
> each, or 2*N nodes with M GB each.
> Tony Novak