A script to check bitcask keydir sizes

Anthony Molinaro anthonym at alumni.caltech.edu
Wed Mar 23 18:24:21 EDT 2011

So a question about when to add new nodes.  I'm looking at the output of
this script and the output of riak-admin status to attempt to figure out
if it's time to grow a cluster.

I have 4 nodes, 1024 partitions, and a replication factor of 3, currently
with a single bitcask backend and a single bucket, where both the key and
the value are 36 bytes.

According to the bitcask spreadsheet, the overhead per key is 40 bytes.

The current key counts and memory figures are:

         key_counts    mem_total  mem_allocated  (key_count*76)
node1      22381785  25269010432    21015953408      1701015660
node2      22378092  25269010432    14076137472      1700734992
node3      22373770  25269010432    21565509632      1700406520
node4      22382394  25269010432    21493731328      1701061944

node2 failed at some point and was replaced with a new node.

So there is some oddness here I don't understand.  According to the
calculated value I should see about 1.7GB used per box; instead I see
21GB on most machines, except for the one which was restarted, which has
14GB.  From looking at memory it seems like I should be adding some nodes
real soon, or the amount allocated will hit the total amount.  Or maybe
there's a memory leak, and a restart would reduce the amount of memory
used (as with node2)?
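
To make the mismatch concrete, here is a minimal Python sketch of the
arithmetic.  It assumes the 76 in the table header is the 40 bytes of
per-key overhead plus the 36-byte key; the function and variable names are
mine, not anything from Riak or the spreadsheet.

```python
# Assumed per-key keydir cost: 40 bytes of overhead plus the 36-byte key.
OVERHEAD_BYTES = 40
KEY_BYTES = 36
PER_KEY_BYTES = OVERHEAD_BYTES + KEY_BYTES  # 76

# (key_count, mem_allocated) per node, copied from the table above.
nodes = {
    "node1": (22381785, 21015953408),
    "node2": (22378092, 14076137472),
    "node3": (22373770, 21565509632),
    "node4": (22382394, 21493731328),
}


def estimate_keydir_bytes(key_count, per_key_bytes=PER_KEY_BYTES):
    """Expected keydir size in bytes for a given number of keys."""
    return key_count * per_key_bytes


for name, (key_count, mem_allocated) in nodes.items():
    expected = estimate_keydir_bytes(key_count)
    ratio = mem_allocated / expected
    print(f"{name}: expected {expected} bytes, "
          f"allocated {mem_allocated} bytes, ratio {ratio:.1f}x")
```

On these numbers the allocated memory is roughly 12 to 13 times the
estimate on node1, node3 and node4, and a bit over 8 times on node2.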

I'm just trying to figure out why I seem to almost be out of memory with
23 million documents when the Bitcask capacity planning spreadsheet seems
to suggest I should be able to have 282 million with 20 GiB of free RAM.
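
For reference, the 282 million figure is consistent with simply dividing
20 GiB by 76 bytes per key.  A small Python sketch, again assuming 40 bytes
of overhead plus a 36-byte key (max_keys is a name I made up, and I have
not checked what else the spreadsheet takes into account):

```python
GIB = 2**30

# Assumed per-key keydir cost: 40 bytes of overhead plus the 36-byte key.
PER_KEY_BYTES = 40 + 36


def max_keys(free_ram_bytes, per_key_bytes=PER_KEY_BYTES):
    """Number of whole keys whose keydir entries fit in free_ram_bytes."""
    return free_ram_bytes // per_key_bytes


print(max_keys(20 * GIB))  # 282563637, i.e. about 282 million
```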



On Wed, Mar 16, 2011 at 12:04:48PM -0700, Aphyr wrote:
> I'm trying to track some basic metrics so we can plan for cluster
> capacity, monitor transfers, etc. Figured this might be of interest
> to other riak admins. Apologies if my erlang is nonidiomatic, I'm
> still learning. :)
>
> #!/usr/bin/env escript
> %%! -name riakstatuscheck -setcookie riak
> main([]) -> main(["riak at"]);
> main([Node]) ->
>   io:format("~w\n", [
>     lists:foldl(
>       fun({_VNode, Count}, Sum) -> Sum + Count end,
>       0,
>       rpc:call(list_to_atom(Node), riak_kv_bitcask_backend, key_counts, [])
>     )
>   ]).
>
> $ ./riakstatus riak at
> 18729

Anthony Molinaro                           <anthonym at alumni.caltech.edu>
