Advice for optimal cluster configuration

Thibault Dory dory.thibault at gmail.com
Tue Mar 8 11:34:48 EST 2011


Hello,

I'm benchmarking various NoSQL databases (see www.nosqlbenchmarking.com for
current results and the configurations used) for my master's thesis, and I'm
going to run this benchmark on larger clusters. So far I have only used a
small cluster of 8 servers with a very small data set (20000 articles from
Wikipedia) for these tests.

I will use up to 100 servers (2 GB RAM, 4 CPUs, 80 GB HDD each) from the
Rackspace cloud, and the new data set is the entire English version of
Wikipedia. Each article is stored as a single document with a unique ID
based on an integer. You can see the implementation here:
https://github.com/toflames/Wikipedia-noSQL-Benchmark/blob/master/src/implementations/riakDB.java
and the benchmark methodology here:
http://www.slideshare.net/ThibaultDory/a-new-methodology-for-large
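
To make the storage scheme concrete, here is a minimal sketch of how an
integer article ID could map to a Riak key and URL. It is only an
illustration, not the code from the repository linked above: the bucket name
"articles", the class name and the helper names are hypothetical, and it
assumes Riak's HTTP interface at /riak/<bucket>/<key> on the default port
8098.

```java
import java.net.URI;

public class ArticleKeySketch {
    // Hypothetical bucket name, not taken from the benchmark code.
    static final String BUCKET = "articles";

    // Turn the integer article ID into the string key of the document.
    static String keyFor(int articleId) {
        if (articleId < 0) {
            throw new IllegalArgumentException("article ID must be non-negative");
        }
        return Integer.toString(articleId);
    }

    // Build the URL of one article, assuming the /riak/<bucket>/<key> layout.
    static URI urlFor(String host, int port, int articleId) {
        return URI.create("http://" + host + ":" + port
                + "/riak/" + BUCKET + "/" + keyFor(articleId));
    }

    public static void main(String[] args) {
        // Prints http://127.0.0.1:8098/riak/articles/42
        System.out.println(urlFor("127.0.0.1", 8098, 42));
    }
}
```

The sketch only builds keys and URLs; it does not contact a cluster.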

I would like to know if any of you have advice on how to get the most out
of Riak for this specific use case and on this kind of server. For example,
are there memory or cache tunings that would suit servers of this size?

Any other input or criticism is welcome.

Thank you,


Thibault Dory