slow mapred_search key lookups for single terms

Michael Radford mrad at blorf.com
Wed Apr 4 11:28:07 EDT 2012


Aha, I just noticed that the native erlang client is still using
luke_flow to implement its map-reduce, rather than riak_pipe.  On some
level, this must be the reason for the differing behavior...either a
bug in riak_pipe, or a bug in the usage of riak_pipe somewhere in the
chain?

Like, maybe someone is forgetting to send end-of-input and causing a
5-second timeout to occur, something like that?  (Just wild
speculation.)

Mike

On Mon, Apr 2, 2012 at 2:39 PM, Ryan Zezeski <rzezeski at basho.com> wrote:
> Michael,
>
> These are some interesting observations.  Thank you for digging into this
> more.  At first glance it appears that it has something to do with the PB
> client/server.  If you are feeling adventurous might I suggest using fprof
> to analyze where the time is being spent.  I wrote up a gist on how you can
> do that here:
>
> https://gist.github.com/2287383
>
> I ran this analysis myself on a much smaller dataset.  A quick glance at the
> results shows most of the time being spent in suspend, encode/decode (which
> I suspected), and queuing data in riak_pipe.  I'm really curious to see what
> your analysis looks like.  If you can produce one I'm happy to help analyze.
>  Thanks again for your detailed analysis thus far.
>
> -Ryan
>
>
>
> On Mon, Apr 2, 2012 at 2:33 PM, Michael Radford <mrad at blorf.com> wrote:
>>
>> I modified the pb client code to print the elapsed time when it
>> receives the first map-reduce result (in the function
>> wait_for_mapred), and tried to run a more apples-to-apples comparison
>> of the same map-reduce job via the local client and the pb client
>> connecting to 127.0.0.1.
>>
>> In a nutshell, the majority of the extra delay from the pb client
>> occurs before it receives the first result (e.g., 6 out of 7 seconds).
>> (It's still troubling that it takes another second to receive the rest
>> of the results when reading from localhost, but maybe I can avoid that
>> by calling a different reduce phase that somehow packages my results
>> into a single message?)
>>
>> Next, it seems like I should try instrumenting the server side of the
>> protobufs interface with timing...any pointers on the best place(s) to
>> do that would be appreciated.
>>
>> Here is the function I used with the pb client to localhost, and the
>> results:
>> TP =
>>  fun (Bucket, Query) ->
>>    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
>>    Inputs = {modfun, riak_search, mapred_search, [Bucket, Query]},
>>    Phases =
>>      [ {reduce,
>>        {modfun, riak_kv_mapreduce, reduce_identity},
>>        [{reduce_phase_only_1, true}],
>>        true} ],
>>    {MuSec, {ok, [{0, Results}]}} =
>>      timer:tc (fun () -> riakc_pb_socket:mapred(Pid, Inputs, Phases) end),
>>    riakc_pb_socket:stop (Pid),
>>    io:format(user, "~b results, ~f sec~n", [length(Results),
>> MuSec/1000000])
>>  end.
>>
>> 3> TP(Bucket,<<"full_text:flower">>).
>> first M-R result in 5848490 musec
>> 14378 results, 7.235226 sec
>> ok
>> 4> TP(Bucket,<<"full_text:flower">>).
>> first M-R result in 5978044 musec
>> 14378 results, 7.423933 sec
>> ok
>> 5> TP(Bucket,<<"full_text:flower">>).
>> first M-R result in 1219112 musec
>> 14378 results, 2.563079 sec
>> ok
>> 6> TP(Bucket,<<"full_text:flower">>).
>> first M-R result in 6017087 musec
>> 14378 results, 7.326352 sec
>> ok
>> 7> TP(Bucket,<<"full_text:green">>).
>> first M-R result in 9546654 musec
>> 24046 results, 13.561474 sec
>> ok
>> 8> TP(Bucket,<<"full_text:green">>).
>> first M-R result in 1878109 musec
>> 24047 results, 5.188646 sec
>> ok
>> 9>
>> 9> TP(Bucket,<<"full_text:green">>).
>> first M-R result in 9724651 musec
>> 24047 results, 12.692941 sec
>> ok
>> 10> TP(Bucket,<<"full_text:green">>).
>> first M-R result in 9627592 musec
>> 24047 results, 12.769256 sec
>> ok
>>
>> And here is the function I used for the local client, and the results:
>> TL =
>>  fun (Bucket, Query) ->
>>    {ok, Client} = riak:local_client(),
>>    Inputs = {modfun, riak_search, mapred_search, [Bucket, Query]},
>>    Phases =
>>      [ {reduce,
>>        {modfun, riak_kv_mapreduce, reduce_identity},
>>        [{reduce_phase_only_1, true}],
>>        true} ],
>>    {MuSec, {ok, Results}} =
>>      timer:tc(fun () -> Client:mapred(Inputs, Phases) end),
>>    io:format(user, "~b results, ~f sec~n", [length(Results),
>> MuSec/1000000])
>>  end.
>>
>> (riak at 10.174.223.37)27> TL(Bucket,<<"full_text:flower">>).
>> 14340 results, 0.106422 sec
>> ok
>> (riak at 10.174.223.37)28> TL(Bucket,<<"full_text:flower">>).
>> 14340 results, 0.102255 sec
>> ok
>> (riak at 10.174.223.37)29> TL(Bucket,<<"full_text:flower">>).
>> 14340 results, 0.099877 sec
>> ok
>> (riak at 10.174.223.37)30> TL(Bucket,<<"full_text:blue">>).
>> 25017 results, 0.210381 sec
>> ok
>> (riak at 10.174.223.37)31> TL(Bucket,<<"full_text:blue">>).
>> 25017 results, 0.200112 sec
>> ok
>> (riak at 10.174.223.37)32> TL(Bucket,<<"full_text:blue">>).
>> 25018 results, 0.207553 sec
>> ok
>> (riak at 10.174.223.37)33> TL(Bucket,<<"full_text:green">>).
>> 23945 results, 0.173061 sec
>> ok
>> (riak at 10.174.223.37)34> TL(Bucket,<<"full_text:green">>).
>> 23945 results, 0.191595 sec
>> ok
>> (riak at 10.174.223.37)35> TL(Bucket,<<"full_text:green">>).
>> 23945 results, 0.177821 sec
>> ok
>>
>> Mike
>>
>> On Mon, Apr 2, 2012 at 8:46 AM, Michael Radford <mrad at blorf.com> wrote:
>> > Hi Ryan,
>> >
>> > This is getting interesting: the same queries when executed using
>> > local clients from 'riak attach' are taking only 100-250 ms.
>> >
>> > However, I just tried the same test I was running remotely on Saturday
>> > on one of the machines in the Riak cluster, using the protobufs client
>> > to connect to 127.0.0.1, and it's still taking 6-7 seconds per query.
>> > (Still with an occasional dip down to 2-3 seconds).
>> >
>> > These machines are on Amazon EC2, so I have little control over the
>> > network layout. But that includes the 4 Riak boxes, so if their
>> > inter-node communication was suffering from similar issues I would
>> > have expected to see it affecting other map-reduce queries. (We're
>> > running lots of multi-key lookups via map-reduce that return thousands
>> > of objects in a few ms, much larger than the keys returned from these
>> > searches.)
>> >
>> > And again, there is a huge difference between the same query using the
>> > local client (200 ms), and using the protobufs client from the exact
>> > same Riak machine connecting to localhost (6-7 sec but occasionally
>> > 2-3 sec).
>> >
>> > Mike
>> >
>> > On Mon, Apr 2, 2012 at 7:44 AM, Ryan Zezeski <rzezeski at basho.com> wrote:
>> >> Hi Michael, you'll find my responses inline...
>> >>
>> >> On Sat, Mar 31, 2012 at 5:04 PM, Michael Radford <mrad at blorf.com>
>> >> wrote:
>> >>>
>> >>> I'm seeing very slow performance from Riak search even when querying
>> >>> single terms, and I'd appreciate any advice on how to get insight into
>> >>> where the time is going.
>> >>>
>> >>> Right now, I'm using this function to time queries with the Erlang pb
>> >>> client:
>> >>>
>> >>> TS =
>> >>>  fun (Pid, Bucket, Query) ->
>> >>>    T0 = now(),
>> >>>    {ok, Results} = riakc_pb_socket:search(Pid, Bucket, Query),
>> >>>    MuSec = timer:now_diff(now(), T0),
>> >>>    io:format(user, "~b results, ~f sec~n", [length(Results),
>> >>> MuSec/1000000])
>> >>>  end.
>> >>
>> >>
>> >> Just an FYI, you should checkout `timer:tc`.
>> >>>
>> >>>
>> >>> The bucket I'm querying currently has ~300k keys total (each 16
>> >>> bytes). (The whole cluster has maybe 1.5M entries in about a dozen
>> >>> buckets. It's running 1.0.2, 4 nodes on 4 8-core c1.xlarge EC2
>> >>> instances.)
>> >>>
>> >>> For single-term queries that return 10k+ keys, I'm routinely seeing
>> >>> times above 6 seconds to run the above function. Intermittently,
>> >>> however, I'll see the same queries take only 2 seconds:
>> >>>
>> >>> > TS(Pid,Bucket,<<"full_text:flower">>).
>> >>> 12574 results, 6.094149 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:flower">>).
>> >>> 12574 results, 1.938894 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:flower">>).
>> >>> 12574 results, 1.981492 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:flower">>).
>> >>> 12574 results, 6.120589 sec
>> >>> ok
>> >>>
>> >>> > TS(Pid,Bucket,<<"full_text:red">>).
>> >>> 13461 results, 6.572473 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:red">>).
>> >>> 13461 results, 6.626049 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:red">>).
>> >>> 13461 results, 2.155847 sec
>> >>> ok
>> >>>
>> >>> Queries with fewer results are still slow, but not quite as slow as 6
>> >>> seconds:
>> >>>
>> >>> > TS(Pid,Bucket,<<"full_text:ring">>).
>> >>> 6446 results, 2.831806 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:ring">>).
>> >>> 6446 results, 3.037162 sec
>> >>> ok
>> >>> > TS(Pid,Bucket,<<"full_text:ring">>).
>> >>> 6447 results, 0.780944 sec
>> >>> ok
>> >>>
>> >>> And queries with no matches only take a few milliseconds:
>> >>>
>> >>> > TS(Pid,Bucket,<<"full_text:blorf">>).
>> >>> 0 results, 0.003269 sec
>> >>> ok
>> >>>
>> >>> During the slow queries, none of the 4 machines seems to be fully
>> >>> taxing even one cpu, or doing almost any disk i/o.
>> >>
>> >>
>> >> What does intra/inter network look like?
>> >>
>> >>>
>> >>>
>> >>> As far as I can tell from looking at the riak_kv/riak_search source,
>> >>> my query should only be hitting the index and streaming back the keys,
>> >>> not trying to read every document from disk or sort by score. Is that
>> >>> correct?
>> >>
>> >>
>> >> It will not read the documents at all but it will sort on score.
>> >>  Currently
>> >> there is no way to disable sorting.
>> >>
>> >>>
>> >>>
>> >>> Assuming that's the case, I can't imagine why it takes so long to look
>> >>> up 10k keys from the index for a single term, or why the times seem to
>> >>> be bimodal (which seems like a big clue). Any pointers welcome!
>> >>
>> >>
>> >> Where is your client sitting in regards to your cluster?  Is it in the
>> >> local
>> >> network?  Could you try attaching to one of your riak nodes, running
>> >> the
>> >> query there and compare results?
>> >>
>> >> e.g.
>> >>
>> >> riak attach
>> >>
>> >>> search:search(Bucket, Query).
>> >>
>> >> -Ryan
>> >>
>
>




More information about the riak-users mailing list