Throughput issue contd. On Joyent Riak Smartmachine

Russell Brown russell.brown at mac.com
Wed Jun 27 06:54:39 EDT 2012


On 27 Jun 2012, at 11:50, Yousuf Fauzan wrote:

> It's not about the difference in throughput between the two approaches I took. Rather, the issue is that even 200 writes/sec is on the low side.
> I could be doing something wrong with the configuration, because people are reporting throughputs of 2-3k ops/sec.
> 
> Could anyone here guide me in setting up a cluster that would give that kind of throughput?

To get that kind of throughput I use multiple threads / workers. Have you looked at basho_bench [1]? It is a simple, reliable tool for benchmarking Riak clusters.
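
As a rough illustration of the multiple-workers idea, here is a minimal Python 3 sketch. It is not basho_bench and not a tuned benchmark. The names run_workers, make_writer and make_riak_writer are hypothetical helpers made up for this example. The Riak calls simply mirror the ones in the script quoted below and may need adjusting for your client version.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_workers(make_writer, total_ops, num_workers):
    """Spread total_ops writes over num_workers threads.

    make_writer is called once per worker and must return a
    write(key, value) callable, so each worker reuses its own client.
    Returns (ops_done, ops_per_second).
    """

    def worker(worker_id):
        write = make_writer()  # created once per worker, then reused
        done = 0
        # Worker k writes keys k, k + num_workers, k + 2 * num_workers, ...
        for i in range(worker_id, total_ops, num_workers):
            write(str(i), str(i))
            done += 1
        return done

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        ops_done = sum(pool.map(worker, range(num_workers)))
    elapsed = time.perf_counter() - start
    rate = ops_done / elapsed if elapsed > 0 else float("inf")
    return ops_done, rate


def make_riak_writer():
    # Assumption: same host, bucket and client calls as the quoted script.
    import riak
    bucket = riak.RiakClient("10.112.2.185").bucket("test")
    return lambda key, value: bucket.new(key, value).store()
```

Calling run_workers(make_riak_writer, 10000, 16) would then issue the same 10000 writes from 16 threads. The worker count is only a starting guess, and it is worth trying several values to see where throughput stops improving.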

Cheers

Russell

[1] Basho Bench - https://github.com/basho/basho_bench and http://wiki.basho.com/Benchmarking.html

> 
> Thanks,
> Yousuf
> 
> On Wed, Jun 27, 2012 at 4:02 PM, Eric Anderson <anderson at copperegg.com> wrote:
> On Jun 27, 2012, at 5:13 AM, Yousuf Fauzan <yousuffauzan at gmail.com> wrote:
> 
>> Hi,
>> 
>> I set up a 3-machine Riak SmartMachine cluster. Each machine has 4 GB of RAM and uses the Riak open source SmartMachine image.
>> 
>> Afterwards I tried loading data using the following two methods:
>> 1. Bash script
>> #!/bin/bash
>> echo $(date)
>> for (( c=1; c<=1000; c++ ))
>> do
>> 	curl -s -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/buckets/test/keys
>> done
>> echo $(date)
>> 
>> 2. Python Riak Client
>> import riak
>> # Create the client once and reuse it for every write
>> c = riak.RiakClient("10.112.2.185")
>> b = c.bucket("test")
>> for i in xrange(10000):
>>     b.new(str(i), str(i)).store()
>> 
>> For case 1, throughput was 25 writes/sec
>> For case 2, throughput was 200 writes/sec
>> 
>> Maybe I am making a fundamental mistake somewhere. I tried the above two scripts on EC2 clusters too and still got the same performance.
>> 
>> Please, someone help
> 
> 
> The major difference between these two is that the first executes a binary, which has to create everything (process, connection, payload, etc.) on every pass through the loop. The second does not: it creates the client once, then iterates with the same client and presumably the same connection as well. That makes a huge difference.
> 
> I would not use curl for performance testing. What you probably want is something like your Python script, running on many threads/processes at once (or started many times in parallel).
> 
> 
> Eric Anderson
> Co-Founder
> CopperEgg
> 
