Performance of link walking versus map/reduce

Bryan Fink bryan at basho.com
Fri Feb 3 14:09:51 EST 2012


On Fri, Feb 3, 2012 at 12:31 PM, Nicolas Petton
<petton.nicolas at gmail.com> wrote:
> curl
> http://localhost:8098/buckets/artists/keys/pink_floyd/albums,author,_
> was nearly immediate, it took 0.156s
>
> while:
>
> curl -X POST -H "content-type:application/json" \
>  http://localhost:8098/mapred --data @-
> {"inputs":[["artists","pink_floyd"]],
>  "query":[{"link":{"bucket":"albums","tag":"author"}},
>           {"map":{"language":"javascript",
>                   "source":"function(v) { return [v]; }"}}]}
>
> Took nearly 2 seconds.

The biggest difference I see is that the link-walk uses an Erlang
function, whereas your MapReduce query uses a JavaScript function
(link-walking is implemented as a MapReduce query internally).
Serializing/deserializing to JSON, as well as contention for JavaScript
VMs, likely accounts for the lost time.

Unfortunately, you can't use exactly the same Erlang function here
(riak_kv_mapreduce:map_identity), since the /mapred resource doesn't
know how to encode its output to JSON.
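
One way to check this explanation would be to keep the link phase as
is, swap only the map phase for a built-in Erlang function whose output
can be encoded as JSON, and compare timings. The sketch below is
illustrative only: it assumes riak_kv_mapreduce:map_object_value is an
acceptable stand-in (it returns just the object values, not whole
objects), that the stored values are text, and that the node listens on
the URL from the quoted curl command. The helper names build_query and
time_query are hypothetical.

```python
import json
import time
import urllib.request

# Assumed endpoint, copied from the curl command quoted above.
MAPRED_URL = "http://localhost:8098/mapred"

# The JavaScript map phase from the original query.
JS_MAP = {"language": "javascript", "source": "function(v) { return [v]; }"}

# Assumption: a built-in Erlang map function that returns only the
# object value, which the /mapred resource can encode as a JSON string.
ERLANG_MAP = {
    "language": "erlang",
    "module": "riak_kv_mapreduce",
    "function": "map_object_value",
}


def build_query(map_phase):
    """Follow "author" links from artists/pink_floyd into albums, then map."""
    return {
        "inputs": [["artists", "pink_floyd"]],
        "query": [
            {"link": {"bucket": "albums", "tag": "author"}},
            {"map": map_phase},
        ],
    }


def time_query(query, url=MAPRED_URL):
    """POST the query; return (elapsed seconds, decoded JSON result)."""
    body = json.dumps(query).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        result = json.loads(response.read().decode("utf-8"))
    return time.perf_counter() - start, result
```

Against a running node, comparing time_query(build_query(JS_MAP)) with
time_query(build_query(ERLANG_MAP)) should give a rough idea of how much
of the gap comes from the JavaScript phase. The two results differ in
shape, so only the timings are comparable.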

-Bryan


