Map Reduce and long queries

David Montgomery davidmontgomery at
Sun Oct 14 07:57:23 EDT 2012


Below is my code for running a map reduce in Python.  I have a
six-node cluster, 2 cores each with 4 GB of RAM.  There is no load on
it, and it holds about 3 million keys using leveldb with Riak 1.2.
Running the query below takes a terribly long time.  It has never
finished, and I don't even know how to check whether it is still
running, other than that the Python script has not timed out.  I look
at the number of executed mappers in stats and it is flatlined when
looking at Graphite.  On test queries the code below works.  How do I
debug what is going on?

def main():
    client = riak.RiakClient(host=riak_host, port=8087,
                             transport_class=riak.transports.pbc.RiakPbcTransport)
    query = client.add(bucket)
    filters = key_filter.tokenize(":", filter_map['date']) +
              #&  key_filter.tokenize(":", filter_map['country']).eq("US") \
              #&  key_filter.tokenize(":", filter_map['campaign_id']).eq("t1") \
    function(value, keyData, arg) {
        var data = Riak.mapValuesJson(value)[0];

            var alt_key = data['hw'];
            var obj = {};
            obj[alt_key] = 1;
            return [ obj ];
           return [];


    function(values, arg){
        return [ values.reduce( function(acc, item) {
            for (var state in item) {
                if (acc[state])
                    acc[state] += item[state];
                else
                    acc[state] = item[state];
            }
            return acc;
        }) ];
    }

    for result in
        print result
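
To sanity-check the map and reduce logic away from the cluster, the two JavaScript phases above can be mirrored in plain Python and run on a handful of sample values. This is only a rough sketch, not what Riak executes: the names map_phase, merge_counts and reduce_phase are made up for illustration, and because the condition that guards the map output is not shown above, the sketch simply skips records that have no 'hw' field, which is an assumption.

```python
import json
from functools import reduce


def map_phase(raw_value):
    """Emit [{hw: 1}] for one stored JSON string, like the map function.

    Assumption: records without an 'hw' field emit nothing.
    """
    data = json.loads(raw_value)
    if 'hw' not in data:
        return []
    return [{data['hw']: 1}]


def merge_counts(acc, item):
    """Add the counts in item into acc, key by key, and return acc."""
    for key, count in item.items():
        acc[key] = acc.get(key, 0) + count
    return acc


def reduce_phase(values):
    """Collapse a list of count dicts into a one-element list."""
    return [reduce(merge_counts, values, {})]


if __name__ == "__main__":
    # Hypothetical sample values, not real data from the bucket.
    sample = ['{"hw": "iphone"}', '{"hw": "android"}', '{"hw": "iphone"}']
    mapped = [out for raw in sample for out in map_phase(raw)]
    print(reduce_phase(mapped))
```

Riak may call a reduce function more than once and feed earlier reduce output back in as input, so the reduce has to give the same answer when applied to its own results. The count-merging logic here has that property, which is easy to confirm locally before running against 3 million keys.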

More information about the riak-users mailing list