MapReduce filtering question

Parker Thompson parkert at gmail.com
Fri Nov 19 14:15:27 EST 2010


I'm experimenting with Riak by trying to port a simple A/B testing framework
that's currently SQL-backed. Since I'm using Ripple/riak-client, the code
below is in Ruby/JS.

The domain model is fairly simple. I have visitors, which get created for
any user who hits the site. Visitors see alternatives (currently these are
ActiveRecord objects) and are tracked by creating experiences (the joining
of an alternative ID and a visitor). Finally, as visitors do things we track
events, which are distinguished from one another by their classes.

Here is a simplified version of the model code:

class Riak::Visitor
  include Ripple::Document
  many :events,      :class_name => "Riak::Event"
end

class Riak::Event
  include Ripple::Document
end

class Riak::ShareEvent < Riak::Event
  include Ripple::Document
end

class Riak::Experience
  include Ripple::Document
  one :visitor, :class_name => "Riak::Visitor"
  property :alternative_id, Integer, :presence => true
end

My problem is that I'd like to collect the set of visitors who have shared,
or more generally I'd like to return a set of visitors after narrowing down
the list by linking in specific kinds of events. Well, my real problem is
that I still don't quite grok MapReduce, but this is what I'm trying to
accomplish.
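
To make the goal concrete, the desired result can be sketched in plain Ruby,
computed in memory with no Riak involved. Everything here is an assumption
made for illustration: the sample data, the keys, the helper name
expected_visitors_who_shared, and the modelling of links as arrays of keys
(which is not how Ripple stores them).

```ruby
# In-memory sketch of the desired query (no Riak involved).
# All sample data and the helper name are hypothetical.
EXPERIENCES = [
  { alternative_id: 1, visitor: "v1" },
  { alternative_id: 1, visitor: "v2" },
  { alternative_id: 2, visitor: "v3" }
].freeze

# Each visitor lists the keys of its events (standing in for Riak links).
VISITORS = {
  "v1" => { events: ["e1", "e2"] },
  "v2" => { events: ["e3"] },
  "v3" => { events: ["e4"] }
}.freeze

# Events carry a _type field, like the one the JS map function inspects.
EVENTS = {
  "e1" => { _type: "Riak::Event" },
  "e2" => { _type: "Riak::ShareEvent" },
  "e3" => { _type: "Riak::Event" },
  "e4" => { _type: "Riak::ShareEvent" }
}.freeze

# Returns the keys of visitors who saw the given alternative and have
# at least one event of the given type.
def expected_visitors_who_shared(alternative_id, event_type = "Riak::ShareEvent")
  EXPERIENCES
    .select { |exp| exp[:alternative_id] == alternative_id }
    .map    { |exp| exp[:visitor] }
    .uniq
    .select do |visitor_key|
      VISITORS.fetch(visitor_key)[:events].any? do |event_key|
        EVENTS.fetch(event_key)[:_type] == event_type
      end
    end
end
```

With this sample data, alternative 1 has two visitors but only "v1" has a
share event, so only "v1" should come back. The point is that the visitor
list is filtered by what the linked events contain, not the other way round.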

The riak-client code is included below (see visitors_who_shared). It returns
a list of all visitors found in the map phase where keep is true. This isn't
surprising, but I'm not sure how to get the visitors if I don't “keep” them
in that phase.
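
The behaviour described above can be mimicked with a toy pipeline in plain
Ruby. This is only a model of what I observe, not riak-client or Riak
internals: Phase, run_phases, and the sample data are all hypothetical. It
shows why a phase marked keep returns its output as it stood at that point,
regardless of what later phases filter out.

```ruby
# Toy model of a phase pipeline with "keep" flags (not riak-client code).
Phase = Struct.new(:name, :keep, :fn)

# Feeds inputs through each phase in order and records the output of
# every phase marked keep, exactly as it was when that phase ran.
def run_phases(inputs, phases)
  kept = {}
  phases.reduce(inputs) do |current, phase|
    output = phase.fn.call(current)
    kept[phase.name] = output if phase.keep
    output
  end
  kept
end

# Hypothetical data: visitor key => types of the events linked from it.
EVENT_TYPES = {
  "v1" => ["Riak::ShareEvent"],
  "v2" => ["Riak::Event"]
}.freeze

PHASES = [
  # Kept phase: emits every visitor, before any event filtering happens.
  Phase.new(:visitors, true, ->(keys) { keys }),
  # Later phase: follows the "links" to events and keeps only share events.
  Phase.new(:share_events, false, lambda { |keys|
    keys.flat_map { |key| EVENT_TYPES.fetch(key) }
        .select { |type| type == "Riak::ShareEvent" }
  })
].freeze

RESULT = run_phases(["v1", "v2"], PHASES)
```

In this model RESULT holds both "v1" and "v2" under :visitors, even though
only "v1" has a share event, because the kept phase ran before the filter.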

Thanks in advance for any help. I'm also happy to RTFM and would appreciate
specific suggestions for doing nontrivial MR jobs in JavaScript.

class Riak::Alternative #not a riak doc
  attr_accessor :ar_id

  def initialize(ar_id)
    self.ar_id = ar_id
  end

  def visitors_who_shared
    Riak::MapReduce.new(Ripple.client).
            add("riak_experiences").
            map(map_filter_by_alternative).
            link(:bucket => 'riak_visitors', :keep => true).
            link(:bucket => 'riak_events').
            map("function(v){ return [[v.bucket, v.key]]; }").
            map(map_share_events).
            run
  end

  def map_share_events
    f = <<FUNCTION
function(v){
  var data = JSON.parse(v.values[0].data);
  if(data._type != "Riak::ShareEvent" ){
    return [];
  }else{
    return [[v.bucket, v.key]];
  }
}
FUNCTION
  end

  def map_filter_by_alternative
    f = <<FUNCTION
function(v){
  var data = JSON.parse(v.values[0].data);
  if(data.alternative_id !=  #{self.ar_id} ){
    return [];
  }else{
    return [[v.bucket, v.key]];
  }
}
FUNCTION
  end
end

Riak::Alternative.new(1).visitors_who_shared