2i performance questions

Ryan Zezeski rzezeski at basho.com
Mon Jun 18 10:26:59 EDT 2012

Hi Tom, response inline.

On Thu, Jun 14, 2012 at 3:29 PM, Tom Burdick <thomas.burdick at gmail.com> wrote:

> retrieve_by_client(Db, Key) when is_binary(Key) ->
>    {ok, BKeys} = riakc_pb_socket:get_index(Db, ?bucket,
>                                            <<"client_id_bin">>, Key),
>    lists:map(fun([_Bucket, Key0]) ->
>        Key0
>    end, BKeys).
>
> What I've noticed while benchmarking the put is that it is actually
> quite fast, and I can do it at a pretty high rate even on my devrel
> setup, around 1000 puts/sec.
> However, when doing retrieve_by_client, at most I've seen 20
> get_index calls/sec, with latencies all over the place, from a few
> milliseconds up to a few seconds.

I notice you're using the Erlang PB client.  This is the client we
recommend, but a current shortfall of it with respect to 2i is that it
unnecessarily uses map/reduce to collate the results.

Bryan Fink recently made some changes to Pipe, the system behind
map/reduce, that should improve latencies for queries that match
"larger" result sets.  I'm not sure how large, but I'm told that the
larger the result set, the more of an improvement you'll see.

Sean Cribbs recently made some changes to our protobuffs server that
will allow direct querying of 2i without the use of map/reduce.  IMO
this should greatly reduce your latencies.  However, the various
clients still need to be modified to take advantage of this.  I defer
to Sean on when that might be done.

In the meantime you could try changing your query code to use the HTTP
interface and see if that brings you better latency.
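
As a rough sketch of what that could look like from Erlang, the module
below issues an exact-match 2i query against the HTTP endpoint
/buckets/<bucket>/index/<index>/<value> using OTP's httpc.  The module
name twoi_http, the host, the port and the bucket name are assumptions
for illustration (adjust them to your devrel node), and the body comes
back as raw JSON of the form {"keys":[...]} that you would still need
to decode with whatever JSON library you use.

```erlang
-module(twoi_http).
-export([index_url/5, retrieve_by_client/4]).

%% Build the URL for an exact-match 2i query over HTTP.
%% All arguments except Port are strings. No URL-encoding is done here,
%% so this sketch assumes bucket, index and value are URL-safe.
index_url(Host, Port, Bucket, Index, Value) ->
    lists:flatten(io_lib:format("http://~s:~B/buckets/~s/index/~s/~s",
                                [Host, Port, Bucket, Index, Value])).

%% Query the client_id_bin index and return the raw JSON body as a binary.
retrieve_by_client(Host, Port, Bucket, ClientId) ->
    %% httpc lives in the inets application, which must be running.
    {ok, _} = application:ensure_all_started(inets),
    Url = index_url(Host, Port, Bucket, "client_id_bin", ClientId),
    case httpc:request(get, {Url, []}, [], [{body_format, binary}]) of
        {ok, {{_Vsn, 200, _Reason}, _Headers, Body}} ->
            {ok, Body};
        {ok, {{_Vsn, Code, _Reason}, _Headers, _Body}} ->
            {error, {http_status, Code}};
        {error, Reason} ->
            {error, Reason}
    end.
```

For example, twoi_http:retrieve_by_client("127.0.0.1", 8098, "sessions",
"some_client") would query a node listening for HTTP on port 8098; the
bucket name "sessions" is made up here.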

Oh, I just remembered, Matthew Von-Maszewski has made some amazing
strides in our branch of leveldb in terms of
latency/throughput/robustness and that should result in trickle-down
improvements as well.

All these changes, with the exception of the client modifications,
should be coming with our next release, 1.2.

> I'm tempted to fire up fprof on Client:get_index just to see what's up
> but I'm a bit scared at what I might see there :-)

Feel free to fprof; my guess is most of the time will be spent in
map/reduce related code.  The other place 2i spends much of its
time is sext decoding.
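
If it helps, here is a minimal sketch of wrapping a single call in
fprof.  The module name profile_2i and the output file name are
assumptions; the fprof calls themselves (apply/2, profile/0,
analyse/1) are the standard OTP ones.  Note that this profiles the
client-side process only, so time spent inside the Riak node will show
up as waiting rather than as map/reduce or sext functions.

```erlang
-module(profile_2i).
-export([profile/2]).

%% Apply Fun to Args under fprof tracing, then write a human-readable
%% analysis to "get_index.analysis" in the current directory.
%% Returns whatever Fun returned.
profile(Fun, Args) ->
    %% Runs erlang:apply(Fun, Args) with tracing on; trace data goes to
    %% the default file "fprof.trace".
    Result = fprof:apply(Fun, Args),
    %% Compile the raw trace into profiling data held by the fprof server.
    ok = fprof:profile(),
    %% Dump the analysis to a file instead of the shell.
    ok = fprof:analyse([{dest, "get_index.analysis"}]),
    Result.
```

You would call it along the lines of
profile_2i:profile(fun my_mod:retrieve_by_client/2, [Db, Key]), where
my_mod stands in for whichever module holds your function.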

