slow mapred_search key lookups for single terms

Michael Radford mrad at blorf.com
Fri Apr 6 11:00:29 EDT 2012


Calling riak_kv_mrc_pipe directly from the console is just as slow as
using the pb interface.

As for the PB interface itself: looking at the fprof analysis, I
noticed that it appears to encode one PB message per search result and
to spend quite a bit of time doing so, e.g., this stanza:

{[{{riakc_pb,encode,1},                        15188, 7779.607, 1063.390}],
 { {riakclient_pb,iolist,2},                   15188, 7779.607, 1063.390},     %
 [{{riakclient_pb,pack,5},                     45564, 6304.788,  482.190},
  {{riakclient_pb,with_default,2},             45564,  406.986,  406.986},
  {garbage_collect,                               2,    2.792,    2.792},
  {suspend,                                     149,    1.651,    0.000}]}.

and similar numbers on the client side. So if there's an opportunity
for batching the search results, that seems like it would be a big
improvement as well.

(It looks like batching is currently up to whatever process is sending
the results in the first place. But maybe it would be a good idea for
the pb socket to accumulate M/R results until it reaches some threshold
of bytes to send...calling term_to_binary to estimate the size is
pretty fast.)
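
To make the accumulate-until-threshold idea concrete, here is a rough
sketch. This is not the existing pb socket code: the module name, the
record, and the SendFun callback are all made up for illustration, and
it assumes the wire format can carry several results in one message.
It only shows estimating size with term_to_binary and flushing once a
byte threshold is crossed.

```erlang
-module(mr_batch_sketch).
-export([new/2, add/2, flush/1, demo/0]).

%% Hypothetical accumulator (not part of Riak): buffers results until
%% their estimated size reaches `threshold` bytes, then passes the
%% whole batch to `send_fun`.
-record(acc, {threshold, send_fun, results = [], bytes = 0}).

new(Threshold, SendFun) when is_integer(Threshold), Threshold > 0,
                             is_function(SendFun, 1) ->
    #acc{threshold = Threshold, send_fun = SendFun}.

%% Add one result; flush if the estimated size reaches the threshold.
add(Result, #acc{results = Rs, bytes = B, threshold = T} = Acc) ->
    %% term_to_binary is only a size estimate here, not the PB encoding.
    Size = byte_size(term_to_binary(Result)),
    Acc1 = Acc#acc{results = [Result | Rs], bytes = B + Size},
    case B + Size >= T of
        true  -> flush(Acc1);
        false -> Acc1
    end.

%% Send whatever is buffered (in arrival order) and reset the buffer.
flush(#acc{results = []} = Acc) ->
    Acc;
flush(#acc{results = Rs, send_fun = Send} = Acc) ->
    Send(lists:reverse(Rs)),
    Acc#acc{results = [], bytes = 0}.

%% Demo: the "socket" is just a message to self(), so the batches can
%% be collected and inspected afterwards.
demo() ->
    Self = self(),
    Acc0 = new(64, fun(Batch) -> Self ! {batch, Batch} end),
    Results = [{<<"bucket">>, <<"key", (integer_to_binary(N))/binary>>}
               || N <- lists:seq(1, 10)],
    Acc1 = lists:foldl(fun add/2, Acc0, Results),
    _ = flush(Acc1),   %% final flush for the partial last batch
    collect([]).

collect(Batches) ->
    receive
        {batch, B} -> collect([B | Batches])
    after 0 ->
        lists:reverse(Batches)
    end.
```

Calling mr_batch_sketch:demo() from a shell returns the list of
batches that would have been sent; appending them gives back the
original results in order. A real version would also need a final
flush when the M/R job completes, as demo/0 does by hand.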

Mike

On Fri, Apr 6, 2012 at 5:56 AM, Bryan Fink <bryan at basho.com> wrote:
> On Thu, Apr 5, 2012 at 3:51 PM, Bryan Fink <bryan at basho.com> wrote:
>> This *might* be the wrong intuition for Search, since there is
>> funneling happening to process the query anyway, but it's likely a
>> good place to start.
>
> Whoops.  I *really* should have included a suggestion for how to
> figure out whether this is the case, or if it is PB related.  The most
> straightforward test will be to run the same query one more time, but
> this time use the `riak_kv_mrc_pipe` module.  Attach to the Riak
> console, just as you did for the internal-cluster client test, but
> instead of
>
>    {ok, Client} = riak:local_client(),
>    Client:mapred(Inputs, Phases).
>
> use
>
>    riak_kv_mrc_pipe:mapred(Inputs, Phases).
>
> If that is also much slower than the native client, then the problem
> is not with the PB interface.  If it's nearly on-par, then the PB
> interface needs inspection.
>
> -Bryan



