Throughput issue contd. On Joyent Riak Smartmachine

Yousuf Fauzan yousuffauzan at gmail.com
Wed Jun 27 07:09:22 EDT 2012


I used examples/riakc_pb.config

{mode, max}.
{duration, 10}.
{concurrent, 1}.
{driver, basho_bench_driver_riakc_pb}.
{key_generator, {int_to_bin, {uniform_int, 10000}}}.
{value_generator, {fixed_bin, 10000}}.
{riakc_pb_ips, [{<IP of one of the nodes>}]}.
{riakc_pb_replies, 1}.
{operations, [{get, 1}, {update, 1}]}.
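
As a rough sanity check on a single-worker run like the config above: a worker that issues requests strictly back to back can complete at most 1 / (mean latency) operations per second. The sketch below (Python 3; the helper names are illustrative and not part of basho_bench) works through that arithmetic for the figure of around 150 reported in this thread, taken here to mean ops/sec, and for the 2-3k ops/sec that others report. It assumes, optimistically, that throughput scales linearly as workers are added.

```python
import math


def latency_ms(ops_per_sec, workers=1):
    """Mean per-operation latency implied by a measured throughput,
    assuming every worker issues requests back to back with no idle time."""
    return 1000.0 * workers / ops_per_sec


def workers_needed(target_ops_per_sec, ops_per_sec_per_worker):
    """Workers needed to reach a target rate, assuming each extra worker
    adds the same throughput (an optimistic, linear-scaling assumption)."""
    return math.ceil(target_ops_per_sec / ops_per_sec_per_worker)


if __name__ == "__main__":
    single = 150  # assumed ops/sec for one worker
    print(round(latency_ms(single), 2), "ms per operation")
    print(workers_needed(2000, single), "to",
          workers_needed(3000, single), "workers for 2-3k ops/sec")
```

Under these assumptions a single worker at 150 ops/sec implies roughly 6.67 ms per operation, and reaching 2-3k ops/sec would take on the order of 14 to 20 concurrent workers. Real clusters rarely scale perfectly linearly, so treat this as a lower bound on the concurrency to try.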


On Wed, Jun 27, 2012 at 4:37 PM, Russell Brown <russell.brown at mac.com> wrote:

>
> On 27 Jun 2012, at 12:05, Yousuf Fauzan wrote:
>
> I did use basho bench on my clusters. It showed throughput of around 150
>
>
> Could you share the config you used, please?
>
>
> On Wed, Jun 27, 2012 at 4:24 PM, Russell Brown <russell.brown at mac.com> wrote:
>
>>
>> On 27 Jun 2012, at 11:50, Yousuf Fauzan wrote:
>>
>> It's not about the difference in throughput between the two approaches I
>> took. Rather, the issue is that even 200 writes/sec is on the low side.
>> I could be doing something wrong with the configuration, because people
>> are reporting throughputs of 2-3k ops/sec.
>>
>> I would appreciate it if anyone here could guide me in setting up a
>> cluster that gives that kind of throughput.
>>
>>
>> To get that kind of throughput I use multiple threads / workers. Have you
>> looked at basho_bench [1]? It is a simple, reliable tool for benchmarking
>> Riak clusters.
>>
>> Cheers
>>
>> Russell
>>
>> [1] Basho Bench - https://github.com/basho/basho_bench and
>> http://wiki.basho.com/Benchmarking.html
>>
>>
>> Thanks,
>> Yousuf
>>
>> On Wed, Jun 27, 2012 at 4:02 PM, Eric Anderson <anderson at copperegg.com> wrote:
>>
>>> On Jun 27, 2012, at 5:13 AM, Yousuf Fauzan <yousuffauzan at gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> I set up a 3-machine Riak SM cluster. Each machine used 4 GB RAM and the
>>> Riak OpenSource SmartMachine image.
>>>
>>> Afterwards I tried loading data using the following two methods:
>>> 1. Bash script
>>> #!/bin/bash
>>> date
>>> for (( c=1; c<=1000; c++ ))
>>> do
>>>   # POST without a key so Riak generates one; a new curl process per write
>>>   curl -s -d 'this is a test' -H "Content-Type: text/plain" \
>>>     http://127.0.0.1:8098/buckets/test/keys
>>> done
>>> date
>>>
>>> 2. Python Riak Client
>>> import riak
>>> # Store 10,000 small objects through a single client
>>> c = riak.RiakClient("10.112.2.185")
>>> b = c.bucket("test")
>>> for i in range(10000):
>>>     b.new(str(i), str(i)).store()
>>>
>>> For case 1, throughput was 25 writes/sec
>>> For case 2, throughput was 200 writes/sec
>>>
>>> Maybe I am making a fundamental mistake somewhere. I tried the above two
>>> scripts on EC2 clusters too and still got the same performance.
>>>
>>> Please, someone help
>>>
>>>
>>>
>>> The major difference between the two is that the first launches a new
>>> curl process for every request, which has to set up everything
>>> (connection, payload, etc.) each time through the loop. The second does
>>> not: it creates the client once and then reuses it, presumably over the
>>> same connection as well. That makes a huge difference.
>>>
>>> I would not use curl for performance testing. What you probably want
>>> is something like your Python script, but running on many
>>> threads/processes at once (or started many times in parallel).
>>>
>>>
>>> Eric Anderson
>>> Co-Founder
>>> CopperEgg
>>>
>>>
>>>
>>>
>>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>>
>
>
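
The advice quoted above (create the client once, reuse it, and run many workers at once) can be sketched roughly as follows. This is a minimal Python 3 illustration, not anyone's actual benchmark: `run_load`, `fake_client` and `fake_store` are hypothetical names, and the stand-in "write" simply sleeps for about 5 ms so the sketch runs without a cluster.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(make_client, store, total_ops, workers):
    """Spread total_ops writes over `workers` threads; return ops/sec."""
    local = threading.local()

    def write(i):
        # Create one client per thread on first use, then keep reusing it.
        if not hasattr(local, "client"):
            local.client = make_client()
        store(local.client, str(i), str(i))

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every write and re-raises any exception.
        list(pool.map(write, range(total_ops)))
    elapsed = time.perf_counter() - start
    return total_ops / elapsed


# Stand-ins (assumptions) so the sketch runs without a Riak cluster.
def fake_client():
    return object()


def fake_store(client, key, value):
    time.sleep(0.005)  # pretend one write takes about 5 ms


if __name__ == "__main__":
    for n in (1, 8):
        rate = run_load(fake_client, fake_store, 200, n)
        print(n, "workers:", round(rate), "ops/sec")
```

To point this at a real cluster, `make_client` and `store` would wrap the same calls used in the Python script earlier in the thread, for example `lambda: riak.RiakClient("10.112.2.185")` and `lambda c, k, v: c.bucket("test").new(k, v).store()`. Whether a given client version is safe to share across threads is not something this sketch assumes, which is why each thread gets its own client.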