Performance of write requests?

Ryan Tilder rtilder at
Mon May 10 17:46:00 EDT 2010

A couple of quick questions for you, Karsten, that should help us get an idea
of what kind of issues you might be having.

How many physical hosts are you running the four OpenSolaris virtuals on?
If they're all running on the same host and you don't have a pretty
substantial RAID array backing their local storage, you're just going to get
I/O contention between the virtuals, slowing down writes.

There are some ZFS tuning parameters we've found that can improve write
throughput.  Since you're using dets there's one in particular that will be
helpful.  You can run this command as root on each OpenSolaris virtual:

zfs set atime=off <pool>

The fact that you can essentially double your performance by running another
client in parallel does make me wonder whether or not it might be a mild
performance issue with your invocation of the Ripple client.  Do you see a
linear increase in write performance as you increase the number of parallel
clients?
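
One rough way to check the scaling question is a sketch along these lines.
Everything in it is an assumption for illustration: measure_throughput is a
hypothetical helper, not part of Ripple or riak-client, and the block passed
to it stands in for whatever single-record write your script already does.
It runs a fixed number of writes across N threads and reports writes per
second, so you can compare 1, 2, 4, ... clients.

```ruby
# Hypothetical helper: run `total` writes across `workers` threads and
# return the observed writes per second. The block receives the record
# index and performs exactly one write.
def measure_throughput(workers, total, &write)
  queue = Queue.new
  total.times { |i| queue << i }
  queue.close # once drained, pop returns nil instead of blocking

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(workers) do
    Thread.new do
      while (i = queue.pop)
        write.call(i)
      end
    end
  end
  threads.each(&:join)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  total / elapsed
end

# Stand-in write that only sleeps for 10 ms; replace the block body with
# the real store call from the data-mover script.
[1, 2, 4].each do |clients|
  rate = measure_throughput(clients, 40) { |_i| sleep 0.01 }
  puts "#{clients} client(s): #{rate.round} writes/s"
end
```

If the rate roughly doubles each time the client count doubles, the
bottleneck is probably per-request latency on the client side rather than
the cluster itself.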


On Mon, May 10, 2010 at 8:36 AM, Karsten Thygesen <karthy at> wrote:

> Hi
> I'm doing a small proof-of-concept and the goal is to store about
> 250.000.000 records in a Riak cluster. Today, we have the data in MySQL, but
> we strive for better performance and we might even expect up to 5 times as
> much data during the next couple of years. The data is denormalized and
> "document"-like, so it is an easy match for the NoSQL paradigm.
> For the small POC, I have built a 4 node cluster with 4 dedicated virtual
> servers running OpenSolaris on top of VMware but with quite fast storage
> below. In front of the cluster I have a load balancer which will distribute
> requests evenly among the nodes.
> Each node is running riak-0.10 with almost default configuration. I have
> added "-smp enabled" to vm.args and each node is otherwise using default
> configuration (except for name, of course). This also implies N=2 and dets
> for storage backend.
> I have written a small Ruby script which uses riak-client from Ripple
> (latest version) as well as curb for HTTP connections, and it quite simply
> takes each record from the database and stores it in Riak. Each record is
> around 500-1000 bytes large and entirely structured text/data. I store them
> as JSON objects.
> The script can easily read more than 15.000 records/second, process them
> and print them to the screen, so I doubt the script is the bottleneck.
> When I try to write them to the Riak cluster via the load balancer, I can
> only write around 50-60 records/second and while writing, the beam process
> is only using around 10% CPU and no major I/O activity is going on.
> I have tried to move the data directory to /tmp (memory filesystem) and
> with this setup, I can get around 90 writes/sec (yes - only for testing - I
> cannot live with a memory filesystem in production with this dataset).
> I have also noticed that the performance I get is almost equivalent
> no matter if I write through the load balancer or I just select a node and
> send all my writes to that one.
> I have also tried a "multithreaded" approach where I simply run two of my
> datamover scripts in parallel, and that way, I can get around 110
> writes/second.
> With the current performance, it will take me more than a month to move my
> data from MySQL to Riak, so I need many times better performance.
> Do you have any suggestions for how to get better performance? I was hoping
> for something close to 1000 writes/second, so feel free to speculate -
> perhaps I should just add quite a few more servers?
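
To put rough numbers on the figures quoted above, here is a small sketch of
the arithmetic. The 55 writes/second value is an assumed midpoint of the
quoted 50-60 range, and migration_days is a hypothetical helper name. It
simply divides the record count by a sustained write rate.

```ruby
SECONDS_PER_DAY = 86_400.0

# Hypothetical helper: days needed to write `records` at a sustained rate.
def migration_days(records, writes_per_second)
  records.to_f / writes_per_second / SECONDS_PER_DAY
end

records = 250_000_000 # record count quoted in the message

# 55 = assumed midpoint of 50-60 writes/s; 110 = two scripts in parallel;
# 1000 = the rate hoped for.
[55, 110, 1000].each do |rate|
  puts format("%5d writes/s -> %.1f days", rate, migration_days(records, rate))
end
```

At the assumed single-script rate this comes to roughly 53 days, about 26
days with two scripts in parallel, and just under 3 days at 1000
writes/second.
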
> Best regards,
> *Karsten*