Performance of write requests?

Karsten Thygesen karthy at netic.dk
Mon May 10 11:36:42 EDT 2010


Hi

I'm doing a small proof-of-concept, and the goal is to store about 250,000,000 records in a Riak cluster. Today we have the data in MySQL, but we are striving for better performance, and we might even expect up to 5 times as much data during the next couple of years. The data is denormalized and "document"-like, so it is an easy match for the NoSQL paradigm.

For the small POC, I have built a 4-node cluster with 4 dedicated virtual servers running OpenSolaris on top of VMware, with quite fast storage underneath. In front of the cluster I have a load balancer which distributes requests evenly among the nodes.

Each node is running riak-0.10 with an almost default configuration. I have added "-smp enabled" to vm.args, and each node is otherwise using the default configuration (except for the node name, of course). This also implies N=2 and dets as the storage backend.

I have written a small Ruby script which uses riak-client from Ripple (latest version) as well as curb for HTTP connections. It quite simply takes each record from the database and stores it in Riak. Each record is around 500-1000 bytes and consists entirely of structured text/data. I store them as JSON objects.
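For illustration, one such write can be sketched with nothing but Ruby's standard library. This is not the actual script (which uses riak-client and curb); the bucket name "records", the sample record, and the /riak/<bucket>/<key> path layout are assumptions made for the example, and the request is only built, not sent.

```ruby
require 'net/http'
require 'json'

# Build (but do not send) an HTTP PUT that stores one record as a JSON object.
# Assumption: objects live under /riak/<bucket>/<key> on the HTTP interface.
def build_put(bucket, key, record)
  req = Net::HTTP::Put.new("/riak/#{bucket}/#{key}")
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(record)
  req
end

# Hypothetical record; the real ones are 500-1000 bytes of structured data.
record = { 'id' => 42, 'name' => 'example', 'tags' => ['a', 'b'] }
request = build_put('records', record['id'], record)

# Sending it would look roughly like this (host and port are placeholders):
#   Net::HTTP.start('riak-lb.example.com', 8098) { |http| http.request(request) }
```

Opening one connection with Net::HTTP.start and reusing it for many requests avoids paying the connection setup cost on every write.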

The script can easily read more than 15,000 records/second, process them and print them to the screen, so I doubt the script is the bottleneck.

When I try to write them to the Riak cluster via the load balancer, I can only write around 50-60 records/second, and while writing, the beam process is only using around 10% CPU and no major IO activity is going on.

I have tried moving the data directory to /tmp (a memory filesystem), and with this setup I can get around 90 writes/sec (yes - only for testing - I cannot live with a memory filesystem in production with this dataset).

I have also noticed that the performance I get is almost the same no matter whether I write through the load balancer or just select one node and send all my writes to it.

I have also tried a "multithreaded" approach where I simply run two of my datamover scripts in parallel, and that way, I can get around 110 writes/second.
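The same idea can be sketched inside a single process with a small pool of worker threads. This is only an illustration: the helper name store_in_parallel is hypothetical, and the block stands in for whatever call actually writes a record to Riak. How much it helps depends on the Ruby implementation's threading and on how much of each write is spent waiting on the network.

```ruby
require 'thread'

# Run the given block once per record, using a fixed number of worker threads.
# Returns the number of records processed.
def store_in_parallel(records, workers, &store)
  queue = Queue.new
  records.each { |r| queue << r }
  workers.times { queue << :done }   # one stop marker per worker

  count = 0
  lock = Mutex.new
  threads = Array.new(workers) do
    Thread.new do
      while (record = queue.pop) != :done
        store.call(record)
        lock.synchronize { count += 1 }
      end
    end
  end
  threads.each(&:join)
  count
end

# Stand-in for the real write: just collect the records.
stored = []
guard = Mutex.new
processed = store_in_parallel((1..100).to_a, 4) do |r|
  guard.synchronize { stored << r }
end
```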

With the current performance, it will take me more than a month to move my data from MySQL to Riak, so I need many times better performance.
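The arithmetic behind that estimate, as a quick sketch using the rates mentioned above (55 is taken as the midpoint of 50-60; the helper name days_to_load is made up for the example):

```ruby
# Days needed to load a given number of records at a sustained write rate.
def days_to_load(records, writes_per_second)
  seconds_per_day = 86_400.0
  records / writes_per_second.to_f / seconds_per_day
end

total_records = 250_000_000

# Rates observed so far, plus the 1000 writes/second target.
[55, 90, 110, 1000].each do |rate|
  puts format('%4d writes/s -> %.1f days', rate, days_to_load(total_records, rate))
end
```

At 55 writes/second the load takes roughly 53 days; at 1000 writes/second it drops to about 3 days.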

Do you have any suggestions for how to get better performance? I was hoping for something close to 1000 writes/second, so feel free to speculate - perhaps I should just add a good many more servers?

Best regards,
Karsten