Performance of write requests?

wde wde at
Mon May 10 19:23:50 EDT 2010

Do you have reference values from an "in memory" storage backend, for example, to confirm that the performance limit is related to the disk backend?


>A couple of quick questions for you, Karsten, that should help us get an idea
>of what kind of issues you might be having.
>How many physical hosts are you running the four OpenSolaris virtuals on?
>If they're all running on the same host and you don't have a pretty
>substantial RAID array backing their local storage, you're just going to get
>I/O contention between the virtuals, slowing down writes.
>There are some ZFS tuning parameters we've found that can improve write
>throughput.  Since you're using dets there's one in particular that will be
>helpful.  You can run this command as root on each OpenSolaris virtual:
>zfs set atime=off <pool>
>The fact that you can essentially double your performance by running another
>client in parallel does make me wonder whether or not it might be a mild
>performance issue with your invocation of the ripple client.  Do you see a
>linear increase in write performance as you increase the number of parallel
>On Mon, May 10, 2010 at 8:36 AM, Karsten Thygesen <karthy at> wrote:
>> Hi
>> I'm doing a small proof-of-concept and the goal is to store about
>> 250.000.000 records in a Riak cluster. Today, we have the data in MySQL, but
>> we strive for better performance and we might even expect up to 5 times as
>> much data during the next couple of years. The data is denormalized and
>> "document"-like, so it is an easy match for the NoSQL paradigm.
>> For the small POC, I have built a 4 node cluster with 4 dedicated virtual
>> servers running OpenSolaris on top of VMware but with quite fast storage
>> below. In front of the cluster I have a loadbalancer which will distribute
>> requests evenly among the nodes.
>> Each node is running riak-0.10 with almost default configuration. I have
>> added "-smp enabled" to vm.args and each node is otherwise using default
>> configuration (except for name, of course). This also implies N=2 and dets for
>> storage backend.
>> I have written a small ruby script which uses riak-client from Ripple
>> (latest version) as well as curb for http connections and it quite simply
>> takes each record from the database and stores it in Riak. Each record is
>> around 500-1000 bytes large and entirely structured text/data. I store them
>> as JSON objects.
>> The script can easily read more than 15.000 records/second, process them
>> and print them to the screen, so I doubt the script is the bottleneck.
>> When I try to write them to the riak cluster via the loadbalancer, I can
>> only write around 50-60 records/second and while writing, the beam process
>> is only using around 10% cpu and no major IO activity is going on.
>> I have tried to move the data directory to /tmp (memory filesystem) and
>> with this setup, I can get around 90 writes/sec (yes - only for testing - I
>> can not live with a memory filesystem in production with this dataset).
>> I have also noticed that the performance I get is almost equivalent
>> no matter if I write through the loadbalancer or I just select a node and
>> send all my writes to that one.
>> I have also tried a "multithreaded" approach where I simply run two of my
>> datamover scripts in parallel, and that way, I can get around 110
>> writes/second.
>> With the current performance, it will take me more than a month to move my
>> data from MySQL to Riak, so I need performance that is many times better.
>> Do you have any suggestions for how to get better performance? I was hoping
>> for something close to 1000 writes/second so feel free to speculate - perhaps
>> I should just add quite a few more servers?
>> Best regards,
>> *Karsten*
>> _______________________________________________
>> riak-users mailing list
>> riak-users at
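
The figures quoted in the thread can be checked with some quick arithmetic. The sketch below (in Ruby, since the loader script is Ruby) estimates how long 250.000.000 records take to load at each reported rate. It also estimates how many concurrent writers the 1000 writes/second target would need, assuming each client issues one blocking write at a time and the cluster keeps scaling linearly. Neither assumption is confirmed in the thread. The helper names are hypothetical and 55 writes/second is simply the midpoint of the reported 50-60.

```ruby
RECORDS = 250_000_000
SECONDS_PER_DAY = 86_400.0

# Days needed to load all records at a sustained rate (writes/second).
def migration_days(rate_per_sec, records = RECORDS)
  records / rate_per_sec.to_f / SECONDS_PER_DAY
end

# If one blocking client reaches `single_client_rate`, each write takes about
# 1/rate seconds, so `target_rate` needs roughly this many writers in flight.
# Assumption: throughput scales linearly with the number of writers.
def writers_needed(single_client_rate, target_rate)
  (target_rate.to_f / single_client_rate).ceil
end

{ "one client" => 55, "two clients" => 110, "target" => 1000 }.each do |label, rate|
  puts format("%-12s %5d writes/s  %6.1f days", label, rate, migration_days(rate))
end
puts "writers needed for 1000/s: #{writers_needed(55, 1000)}"
```

At 55 writes/second the load takes about 53 days, which matches the "more than a month" estimate. At 1000 writes/second it would take just under 3 days.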

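The report that two parallel scripts roughly double throughput suggests the loader may be bound by per-request latency rather than by the cluster. A minimal sketch of running several writers inside one Ruby process follows. The block passed to `parallel_store` is a hypothetical stand-in for whatever call actually writes a record to Riak. No riak-client API is assumed, and the worker count is illustrative.

```ruby
# Feed `records` to `workers` threads through a bounded queue and call the
# given block once per record. Returns the number of records processed.
# Note: records must not be nil or false, since nil marks the end of the queue.
def parallel_store(records, workers: 8, &store)
  queue = SizedQueue.new(workers * 4) # bounded, so the reader cannot run far ahead

  threads = Array.new(workers) do
    Thread.new do
      stored = 0
      while (record = queue.pop) # pop returns nil once the queue is closed and empty
        store.call(record)
        stored += 1
      end
      stored
    end
  end

  records.each { |record| queue << record }
  queue.close
  threads.sum(&:value) # joins each thread and adds up its count
end

# Stand-in for a real write: sleep about 18 ms to mimic one round trip.
written = Queue.new
count = parallel_store((1..40).map { |i| { "id" => i } }, workers: 4) do |record|
  sleep 0.018
  written << record["id"]
end
puts "stored #{count} records"
```

Whether a real HTTP client overlaps requests across Ruby threads depends on the client library, so the gain should be measured by increasing `workers` step by step. That also answers the question of whether write throughput grows linearly with the number of parallel writers.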