Advice for optimal cluster configuration

Jeremiah Peschka jeremiah.peschka at gmail.com
Tue Mar 8 12:36:05 EST 2011


There are a few things you can do to speed up your load. When you're writing your data, you can set both W and DW to 0, as long as you have some other way to check for errors. This shaves a bit of time off each write because you're throwing writes at the database without waiting for acknowledgement and hoping that they stick. You can also set returnbody to false. Returnbody defaults to true, IIRC. When returnbody is enabled, Riak returns the object you wrote along with the Riak-specific info (vclock, etc.). I don't care about those things when I'm doing a bulk load, so I turn that sort of thing off.
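
As a rough illustration of those write options, here is a minimal Python sketch that prepares a store request against Riak's HTTP interface with w, dw and returnbody passed as query parameters. The host, port, bucket name ("articles") and helper names are assumptions made up for the example, not taken from the benchmark code, and the request is only built, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def riak_store_url(base_url, bucket, key, w=0, dw=0, returnbody=False):
    """Build a store URL that asks Riak not to wait for write acks.

    base_url, bucket and key are illustrative; adjust to your cluster.
    """
    params = urlencode({
        "w": w,
        "dw": dw,
        "returnbody": "true" if returnbody else "false",
    })
    return f"{base_url}/riak/{quote(bucket, safe='')}/{quote(key, safe='')}?{params}"


def build_store_request(base_url, bucket, key, body):
    """Prepare (but do not send) a PUT request for one article."""
    return Request(
        riak_store_url(base_url, bucket, key),
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )
```

Passing the prepared request to urllib.request.urlopen would perform the write. With W and DW at 0 the response says little about whether the write stuck, so a separate verification pass (for example, reading back a sample of keys) is still needed.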

Depending on the type of querying you're doing, you can also adjust the JavaScript VM settings. For example, if your queries don't have any reduce phases, you can set the number of reduce VMs to 0. Since you're probably only doing key lookups, you can likely shut down all of the JavaScript VMs.
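
For reference, a sketch of what that might look like in the riak_kv section of app.config. The setting names below are written from memory of Riak releases of this era and should be checked against the app.config shipped with your version; whether a count of 0 is accepted for every pool is also an assumption worth testing on one node first.

```erlang
%% app.config, riak_kv section only (all other settings omitted)
%% Setting names are assumptions to verify against your Riak version.
{riak_kv, [
    {map_js_vm_count, 0},     %% VMs for JavaScript map phases
    {reduce_js_vm_count, 0},  %% VMs for JavaScript reduce phases
    {hook_js_vm_count, 0}     %% VMs for JavaScript pre-commit hooks
]}
```

Each node needs a restart to pick up the change.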

I suspect somebody smarter will have better input and will correct me, but that's my two cents' worth.
-- 
Jeremiah Peschka
Microsoft SQL Server MVP
MCITP: Database Developer, DBA
On Tuesday, March 8, 2011 at 8:34 AM, Thibault Dory wrote: 
> Hello,
> 
> I'm benchmarking various NoSQL databases (see www.nosqlbenchmarking.com for current results and the configurations used) for my master's thesis, and I'm going to run this benchmark on bigger clusters. So far I have only used a small cluster of 8 servers with a very small data set (20,000 articles from Wikipedia) to conduct those tests.
> 
> I will use up to 100 servers (2 GB, 4 CPUs, 80 GB HDD) from the Rackspace cloud, and the new data set is the entire English version of Wikipedia. Each article is stored as a single document with a unique ID based on an integer. You can see the implementation here: https://github.com/toflames/Wikipedia-noSQL-Benchmark/blob/master/src/implementations/riakDB.java and the benchmark methodology here: http://www.slideshare.net/ThibaultDory/a-new-methodology-for-large
> 
> I would like to know if some of you have advice on how I could get the best out of Riak for this specific use case and on this kind of server. For example, I would like to know if there are memory/cache tunings that could be useful to match this server size.
> 
> Any other input or criticism is welcome.
> 
> Thank you,
> 
> 
> Thibault Dory 