Mapreduce crosstalk

Aphyr aphyr at
Tue May 17 15:35:11 EDT 2011

I was writing a new mapreduce query to look at users over time, and ran 
it over a single user in production. After that, other mapreduce jobs 
over users started returning results from my new map phase, some of the 
time. After five minutes of this, I had to restart every node in the 
cluster to get it to stop.

Every node has {map_cache_size, 0} in riak_kv.

The map phase that screwed things up was:

function(v) {
   o = JSON.parse(v.values[0].data);

   // Age of account in days
   age = Math.round(
     ( - Date.iso8601(o.created_at)) /
     (1000 * 60 * 60 * 24)
   );

   return [['t_user_scores', v.key, age]];
}
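
To check the day arithmetic outside Riak, here is a minimal sketch that runs in a plain JavaScript interpreter. It rests on assumptions: created_at is taken to be an ISO 8601 string that the built-in Date constructor can parse (Date.iso8601 is not a standard method, so new Date stands in for it), and the age is measured from a reference time passed in explicitly. The name ageInDays and the sample timestamps are made up for illustration.

```javascript
// Hypothetical helper, not part of the map phase in this message.
// Assumes createdAt is an ISO 8601 string the built-in Date can parse.
var MS_PER_DAY = 1000 * 60 * 60 * 24;

function ageInDays(createdAt, now) {
  // Subtracting two Date objects yields the difference in milliseconds.
  return Math.round((now - new Date(createdAt)) / MS_PER_DAY);
}

// Made-up sample: an account created ten days before the reference time.
var now = new Date("2011-05-17T00:00:00Z");
console.log(ageInDays("2011-05-07T00:00:00Z", now));
```

Passing the reference time as an argument keeps the calculation deterministic, which makes it easier to compare results between runs.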

It looks like one node started running that phase instead of the 
requested phase for subsequent jobs. It *should* have run this one, but 
did not:

function(v) {
	o = JSON.parse(v.values[0].data);
	return [{
		key: v.key,
		thumbnail: o.thumbnail
	}];
}

Now I'm scared to run MR jobs. Could it be an issue with returning 
keydata? Anybody else seen this before?
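
For context on the keydata question: as far as I understand the Riak MapReduce docs, a JavaScript map function is called as function(value, keyData, arg), and a map phase that feeds another map phase returns [bucket, key] or [bucket, key, keyData] entries. The sketch below only imitates that hand-off in plain JavaScript. It does not touch Riak, and fakeObject, emitTriple, readKeyData, runMapChain, and the sample data are all hypothetical names invented for illustration.

```javascript
// Hypothetical stand-in for a stored object as a map phase sees it.
function fakeObject(bucket, key, data) {
  return { bucket: bucket, key: key, values: [{ data: JSON.stringify(data) }] };
}

// First phase: emits [bucket, key, keyData] triples, in the same shape
// as the first map phase in this message.
function emitTriple(v) {
  var o = JSON.parse(v.values[0].data);
  return [["t_user_scores", v.key, o.age]];
}

// Second phase: receives the third element of each triple as keyData.
function readKeyData(v, keyData) {
  return [{ key: v.key, age: keyData }];
}

// Toy driver (not Riak): looks up each triple in an in-memory store and
// passes its keyData to the next phase.
function runMapChain(inputs, store) {
  var out = [];
  inputs.forEach(function (v) {
    emitTriple(v).forEach(function (t) {
      var next = store[t[0] + "/" + t[1]];
      out = out.concat(readKeyData(next, t[2]));
    });
  });
  return out;
}

var store = { "t_user_scores/alice": fakeObject("t_user_scores", "alice", {}) };
var result = runMapChain([fakeObject("users", "alice", { age: 10 })], store);
console.log(JSON.stringify(result));
```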


More information about the riak-users mailing list