map function for link-walking

Bryan Fink bryan at
Sat Jul 10 18:20:36 EDT 2010

On Sat, Jul 10, 2010 at 4:45 PM, Nicolas Fouché <nicolas at> wrote:
> In the "one-to-very-many link associations" thread, Sean Cribbs talks
> about a map function which does link-walking from links stored in
> object contents.
> "Another way to cope with large numbers of links is to
> encapsulate them in the object itself, rather than in the headers.  This removes
> the header-length/count limitation, but would require you to have a map function
> that understands the internals of the object.  Also, you would need to deal with
> the larger size of the object, which could potentially slow down your request."
> Is there any chance someone shares the code of a map function doing
> this (custom-)link-walking ?

Hi, Nicolas.  Any function you have that returns a list of bucket-key
pairs, in the same format as the "inputs" list for the map/reduce
query, will work.  For example, if you stored your object's links in a
"mylinks" field in its value, like so:

$ curl -X PUT -H "content-type:application/json" http://localhost:8098/riak/example/foo --data @-
$ curl -X PUT -H "content-type:application/json" http://localhost:8098/riak/example/bar --data @-
$ curl -X PUT -H "content-type:application/json" http://localhost:8098/riak/example/baz --data @-

Then you could use a very simple map function like:
   function(v) {
      return v.not_found ? [] : JSON.parse(v.values[0].data).mylinks;
   }

And then the link-walking is simple:

carboy:riak bryan$ curl -X POST -H "content-type:application/json" http://localhost:8098/mapred --data @-
{"inputs":[["example","foo"]],
 "query":[{"map":{"language":"javascript",
                  "source":"function(v) { return v.not_found ? [] : JSON.parse(v.values[0].data).mylinks; }"}},
          {"map":{"language":"javascript",
                  "source":"function(v) { return [JSON.parse(v.values[0].data).myval]; }"}}]}

That query uses two map phases: the first starts at the example/foo
object I created above and follows its links to example/bar and
example/baz, and the second extracts the "myval" field from the values
of those objects.
I'd recommend adding a little defensive programming to make sure
that "mylinks" is defined, and that it's a list of the proper shape.
It would also be a good idea to define these functions in a file that
Riak preloads, instead of specifying them dynamically in the
query (for performance).  But you could also take it in another
direction: if you knew that all of your links were going to point to
objects in a certain bucket, you could store just the keys in the
object, and produce bucket-key pairs with a quick map function that
turns each stored key k into ["otherbucket", k].
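As a rough sketch of both suggestions, something like the following
might do.  The function names (isArray, readMylinks, followLinksSafely,
followKeysInOtherBucket) are hypothetical, and they are written as named
functions on the assumption that they would live together in a preloaded
file; for a dynamic query you would have to fold them into a single
anonymous function.

```javascript
// Array check that does not rely on Array.isArray being available.
function isArray(x) {
  return Object.prototype.toString.call(x) === "[object Array]";
}

// Hypothetical helper: return the "mylinks" list from the object's first
// value, or [] if the object is missing, unparsable, or has no such list.
function readMylinks(v) {
  if (!v || v.not_found || !isArray(v.values) || v.values.length === 0) {
    return [];
  }
  var doc;
  try {
    doc = JSON.parse(v.values[0].data);
  } catch (e) {
    return []; // the stored value was not valid JSON
  }
  if (doc === null || typeof doc !== "object" || !isArray(doc.mylinks)) {
    return [];
  }
  return doc.mylinks;
}

// Defensive link-following: keep only well-formed [bucket, key] pairs.
function followLinksSafely(v) {
  return readMylinks(v).filter(function (p) {
    return isArray(p) && p.length === 2 &&
           typeof p[0] === "string" && typeof p[1] === "string";
  });
}

// Key-only variant: "mylinks" holds bare keys, and every link is assumed
// to point into the bucket named "otherbucket".
function followKeysInOtherBucket(v) {
  return readMylinks(v)
    .filter(function (k) { return typeof k === "string"; })
    .map(function (k) { return ["otherbucket", k]; });
}
```

Silently dropping malformed entries is just one choice; depending on your
data you might prefer to have the query fail instead.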

Hope that helps.

