Speed of linkwalking

Mark Steele msteele at beringmedia.com
Tue Aug 30 15:37:44 EDT 2011


A bit more digging has shed some light on the issue I'm seeing. I am using a
custom-built protocol buffer client, and it appears that Riak is sending back
more than one response for the map reduce job, which seems odd to me.

The first response contains

   - The expected content payload
   - Has the "has_done" property set to false
   - Has the "done" property set to false
   - Has the "has_response" property set to true

The second response contains

   - No data payload
   - "has_done" property set to true
   - "done" property set to true
   - "has_response" property set to false

I would have expected only one response, with all three properties set to
true. At any rate, I'll look a bit more at my API to figure out why that
second response is slowing things down.
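
For reference, here is a minimal sketch of how a client could handle the
two-message pattern described above, assuming responses are streamed and only
the last one has "done" set. The function name, the $readNext callable, and the
array keys are hypothetical stand-ins for whatever a protocol buffer client
actually exposes; the stream is simulated, not read from Riak.

```php
<?php
// Hypothetical helper: $readNext is any callable that returns the next decoded
// response as an array, e.g. ['response' => '...'] or ['done' => true].
function collectMapReduceResponses(callable $readNext): array
{
    $payloads = [];
    do {
        $msg = $readNext();
        if ($msg === null) {
            throw new RuntimeException('stream ended before "done" was set');
        }
        // A message may carry a payload, the "done" flag, or both.
        if (isset($msg['response'])) {
            $payloads[] = $msg['response'];
        }
    } while (empty($msg['done']));
    return $payloads;
}

// Simulated stream matching the two responses listed above.
$stream = [
    ['response' => '["dataforlinkkey2"]'], // payload, "done" not set
    ['done' => true],                      // no payload, "done" set
];
$readNext = function () use (&$stream) {
    return array_shift($stream);
};

$payloads = collectMapReduceResponses($readNext);
```

The loop treats the payload and the "done" flag as independent, so it would
also cope with a single message that carries both.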

If I manually short-circuit out of my response fetching, the time to receive
the first response packet is much closer to the double-fetch scenario (it's
0.002245 seconds, twice as slow as a single fetch, but much faster than
waiting for both responses). I haven't yet poked around to see whether it's
my code that's slow or Riak that's slow in sending the second response. If I
had to hazard a guess, I'd say it's Riak (but that's just a guess). I would
further suppose that Riak has to coordinate between all nodes (even when
there's only one) for mapreduce, and that the delay in the second packet is
related to that coordination. Again, just a guess.
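
One way to narrow down where the time goes is to record when each response
arrives instead of timing the whole call. The sketch below is only an
illustration: timeResponses and $readNext are hypothetical names, and the 2 ms
delay before the "done" marker is an artificial value in a simulated stream,
not a measurement from Riak.

```php
<?php
// Hypothetical helper: records the elapsed time (in seconds) at which each
// response arrives, stopping once a response has "done" set.
function timeResponses(callable $readNext): array
{
    $start = microtime(true);
    $arrivals = [];
    do {
        $msg = $readNext();
        if ($msg === null) {
            throw new RuntimeException('stream ended before "done" was set');
        }
        $arrivals[] = microtime(true) - $start;
    } while (empty($msg['done']));
    return $arrivals;
}

// Simulated stream: the payload is available immediately, while the "done"
// marker is delayed by an artificial 2 ms (an assumption for illustration).
$stream = [
    ['response' => '["dataforlinkkey2"]'],
    ['done' => true],
];
$readNext = function () use (&$stream) {
    if (count($stream) === 1) {
        usleep(2000);
    }
    return array_shift($stream);
};

$arrivals = timeResponses($readNext);
printf("First response: %.6f s\n", $arrivals[0]);
printf("Done marker   : %.6f s\n", $arrivals[1]);
```

If the gap between the two timestamps accounts for most of the total, the delay
sits in the wait for the final message rather than in decoding the payload.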

Cheers,

Mark Steele
Bering Media Inc.

On Tue, Aug 30, 2011 at 3:06 PM, Mark Steele <msteele at beringmedia.com> wrote:

> I'm actually testing with an object that has only one link as that's my
> use-case.
>
> In testing, two separate gets are way faster than using a map-reduce
> link-walk, which is disappointing :(
>
> I'm also testing on a one-node cluster, and my bucket has N/R/W=1. I was
> thinking that the reduction in network hops would outweigh the map reduce
> overhead, but apparently that's not the case.
>
> Simple get (one value): 0.000741 seconds
> Multiget linkwalk (get one object, then link walk in PHP): 0.001567 seconds
> Linkwalk (map reduce): 0.041994 seconds
>
> Removing the map phase does not significantly change the performance.
>
> In case it matters, I'm testing against the latest riak code on github.
>
> Cheers,
>
> Mark Steele
> Bering Media Inc.
>
>  On Tue, Aug 30, 2011 at 3:00 PM, Jonathan Langevin <
> jlangevin at loomlearning.com> wrote:
>
>> Good catch Kev.
>>
>> Mark, if you run the same operation with map removed, what is the
>> performance at that point?
>>
>>  <http://www.loomlearning.com/>
>>  Jonathan Langevin
>> Systems Administrator
>> Loom Inc.
>> Wilmington, NC: (910) 241-0433 - jlangevin at loomlearning.com -
>> www.loomlearning.com - Skype: intel352
>>
>>
>>
>> On Tue, Aug 30, 2011 at 2:45 PM, Kev Burns <kevburnsjr at gmail.com> wrote:
>>
>>> Mark,
>>>
>>> That's not just a link walk, you're also performing a map operation
>>> there.
>>>
>>> $client->add($bucketname, 'linkkey1')->
>>>     link()->
>>>     map(array("riak_kv_mapreduce", "map_object_value")) ->
>>>     run();
>>>
>>>
>>> If the expected number of returned objects is small, performing the map
>>> phase in PHP may be faster.
>>>
>>> - Kev
>>> c: +001 (650) 521-7791
>>>
>>>
>>> On Tue, Aug 30, 2011 at 7:31 AM, Mark Steele <msteele at beringmedia.com> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> Just want to know if I'm doing something obviously dumb here.
>>>>
>>>> First the (PHP) code (sorry if the API is different from the official
>>>> API, we're using a heavily modified version):
>>>>
>>>> <snip>
>>>> $obj1 = $bucket->newObject('linkkey1', array('link1'));
>>>> $obj2 = $bucket->newObject('linkkey2', 'dataforlinkkey2');
>>>> $obj1->addLink($obj2);
>>>> $obj2->addLink($obj1);
>>>> $obj1->store();
>>>> $obj2->store();
>>>>
>>>> $start = microtime(true);
>>>> $blargh = $bucket->get('linkkey2');
>>>> $end = microtime(true);
>>>> printf("Took : %04f\n",$end - $start);
>>>>
>>>> $start = microtime(true);
>>>> $result =
>>>>     $client->add($bucketname, 'linkkey1')->
>>>>     link()->
>>>>     map(array("riak_kv_mapreduce", "map_object_value"))->
>>>>     run();
>>>> foreach ($result as $data) {
>>>>     //var_dump($data);
>>>> }
>>>> $end = microtime(true);
>>>> printf("Took : %04f\n",$end - $start);
>>>> <snip>
>>>>
>>>> So here's what I'm seeing:
>>>>
>>>> The simple key fetch takes 0.000661 seconds to execute, whereas the
>>>> link-walk takes 0.042043. Ouch. Quite a bit slower. Any ways to speed
>>>> this up?
>>>>
>>>> Cheers,
>>>>
>>>> Mark Steele
>>>> Bering Media Inc.
>>>>
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>
>>>>
>>>
>>
>