Same MR query, different results every run

Christian Dahlqvist christian at basho.com
Tue Jan 8 03:20:09 EST 2013


Hi David,

Is it always the same entry that is missing from the result set? If so, does the issue go away if you issue a read request for the record(s) causing problems (resulting in read-repair)?

If this is the case, the cause of the problem might be explained by how MapReduce works in Riak.

In Riak, map phases are sent to a covering set of vnodes and are executed where the data resides. When data is read to be passed to the map phase, only the copy stored on that vnode is considered (effectively a read with R=1, not the default R value specified for the bucket). If not all vnodes hold the same version of the data, I believe the results may vary slightly between runs when the covering sets differ.
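
To make the effect concrete, here is a toy model in plain Python. It is not Riak code and does not use the Riak client; the replica count, the key names and the helpers make_replicas, map_phase and read_repair are all made up for illustration. It assumes one replica never received one of 14 writes, and that each run reads from a single replica, which stands in for the covering set.

```python
N_REPLICAS = 3  # assumed replication factor for this toy model


def make_replicas(keys, stale_replica, missing_key):
    """Build N_REPLICAS copies of the data; one copy lacks `missing_key`."""
    replicas = [set(keys) for _ in range(N_REPLICAS)]
    replicas[stale_replica].discard(missing_key)
    return replicas


def map_phase(replicas, covering_replica):
    """Read from a single replica only, like a read with R=1."""
    return sorted(replicas[covering_replica])


def read_repair(replicas, key):
    """Model a regular read: if any replica holds the key, copy it to all."""
    if any(key in replica for replica in replicas):
        for replica in replicas:
            replica.add(key)


# Hypothetical data set: 14 keys, one of them missing on replica 0.
keys = ["key-%02d" % i for i in range(14)]
replicas = make_replicas(keys, stale_replica=0, missing_key="key-13")

# Row count per possible "covering set" before and after a repairing read.
counts_before = [len(map_phase(replicas, i)) for i in range(N_REPLICAS)]
read_repair(replicas, "key-13")
counts_after = [len(map_phase(replicas, i)) for i in range(N_REPLICAS)]

print(counts_before)  # [13, 14, 14]
print(counts_after)   # [14, 14, 14]
```

In this model the row count depends only on which replica serves the run, and a single ordinary read of the affected key makes every run agree again. That is the behaviour I would expect to see if read-repair resolves your issue.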

Best regards,

Christian



On 7 Jan 2013, at 23:33, David Montgomery <davidmontgomery at gmail.com> wrote:

> Hi,
> 
> I do have a reduce phase.
> 
> 
> On Tue, Jan 8, 2013 at 12:08 AM, Mridul Kashatria <mridul at readwhere.com> wrote:
> Hi,
> 
> If I am correct, adding a reduce function should make the query return the same number of items every time.
> 
> I'm a Riak noob, but I faced a similar issue while testing with a map function only. Adding a reduce phase fixed it.
> 
> I believe that, as the map fans out to multiple nodes, whichever node returns data first has its results written to the output instead of being collected by a reduce stage.
> 
> Please correct me if I'm wrong.
> 
> Thanks
> 
> --
> Mridul
> 
> 
> 
> On Sunday 06 January 2013 11:07 AM, David Montgomery wrote:
>> Hi,
>> 
>> Here is my mapper:
>> 
>> query.map('''
>>     function(value, keyData, arg) {
>>         // Return nothing when the input is empty.
>>         if (value.length == 0) {
>>             return [];
>>         }
>>         var data = Riak.mapValuesJson(value)[0];
>>         // Emit only records that belong to the requested campaign.
>>         if (data['campaign_id'] != '%s') {
>>             return [];
>>         }
>>         // Build a composite key from cookie id, gid and timestamp.
>>         var alt_key;
>>         try {
>>             alt_key = data['ckid'] + '||' + data['gid'] + '||' + data['ts_hms'];
>>         } catch (err) {
>>             alt_key = 'error';
>>         }
>>         var obj = {};
>>         obj[alt_key] = 1;
>>         return [obj];
>>     }''' % campaign_id)
>> 
>> When I run the query repeatedly, about every 2 seconds, I get the output below. Sometimes I get 14 rows, sometimes 13, then back to 14, and so on. Why? There should be no variation. I have a three-node cluster (two cores, 4 GB of RAM) on Ubuntu 12.06 using the latest Riak.
>> 
>> 
>> TOTAL: 14
>> 3dc3f58f-faea-4751-94b5-8a9a076d4b3f||CAESEGYMM1Q34DV8Ev0i12IVKdY||2012-12-31 08:36:21 1
>> b4d82fa0-5cd4-4813-a150-554ebca30f1f||CAESEM98NHldIIyAzY0CIUnKudw||2013-01-04 06:18:37 1
>> 8743af22-a664-4b60-ac59-b79d52c12e9e||CAESEH2PIdEYXvk3Dsg2_vF6Qcc||2013-01-04 09:13:30 1
>> cef36621-527c-4b7a-be6f-5842e13a1350||CAESEHsyPPSizUsT-j31I-nCLzQ||2013-01-05 12:50:22 1
>> 663fb22d-c60d-46b7-8b5b-c9be103c2084||CAESEDtHYmtttm7DBCRpCSU9zYE||2013-01-04 08:55:06 1
>> e2b6afda-b838-48d5-a449-7b568b9f6b04||CAESEBciJaIqccs2584wIgdsOqc||2013-01-04 04:02:13 1
>> 66aa05fe-9c55-43b2-93ae-c8cb19d097d7||CAESEBuVyK-X_iNGaiiLhPsT0TE||2013-01-02 01:29:38 1
>> 0969a7ca-4324-4118-9038-b6fc11f08a36||CAESENwCD1bw1VvtIamGBCUl_zk||2013-01-02 00:55:01 1
>> f78b77f6-a08c-4f07-b982-7b2cdcefba4f||CAESEJiWNlcbRN7Sx9o2FB7fbaU||2012-12-29 05:22:46 1
>> 8050e5a7-1583-459a-983f-55feaf0e2a6c||CAESED2NyW9XDEbiKb1UD4sTzvI||2013-01-05 12:18:59 1
>> 58b84566-ad3a-4a3f-91bd-1c61986fbadb||CAESELQcGkigDvXrtRDgOlw9rX0||2013-01-04 16:19:25 1
>> 0db77e8d-ed94-43cf-8860-b4e43dfa24aa||CAESECbwN7VY6o8om79mZ905GIA||2013-01-02 16:15:34 1
>> 67e79552-7e06-44bd-9e95-87f7cb634de3||CAESEFA6fd_C1PBslKgOj6_BI28||2012-12-29 05:23:11 1
>> ffc3c6ae-beee-4dfe-b41d-ec3a72bddf67||CAESEN_MAXs55jCPIwuyvfTZIZc||2012-12-28 07:56:03 1
>> 
>> 
>> TOTAL: 13
>> b4d82fa0-5cd4-4813-a150-554ebca30f1f||CAESEM98NHldIIyAzY0CIUnKudw||2013-01-04 06:18:37 1
>> 8743af22-a664-4b60-ac59-b79d52c12e9e||CAESEH2PIdEYXvk3Dsg2_vF6Qcc||2013-01-04 09:13:30 1
>> cef36621-527c-4b7a-be6f-5842e13a1350||CAESEHsyPPSizUsT-j31I-nCLzQ||2013-01-05 12:50:22 1
>> 663fb22d-c60d-46b7-8b5b-c9be103c2084||CAESEDtHYmtttm7DBCRpCSU9zYE||2013-01-04 08:55:06 1
>> e2b6afda-b838-48d5-a449-7b568b9f6b04||CAESEBciJaIqccs2584wIgdsOqc||2013-01-04 04:02:13 1
>> 66aa05fe-9c55-43b2-93ae-c8cb19d097d7||CAESEBuVyK-X_iNGaiiLhPsT0TE||2013-01-02 01:29:38 1
>> 0969a7ca-4324-4118-9038-b6fc11f08a36||CAESENwCD1bw1VvtIamGBCUl_zk||2013-01-02 00:55:01 1
>> f78b77f6-a08c-4f07-b982-7b2cdcefba4f||CAESEJiWNlcbRN7Sx9o2FB7fbaU||2012-12-29 05:22:46 1
>> 8050e5a7-1583-459a-983f-55feaf0e2a6c||CAESED2NyW9XDEbiKb1UD4sTzvI||2013-01-05 12:18:59 1
>> 58b84566-ad3a-4a3f-91bd-1c61986fbadb||CAESELQcGkigDvXrtRDgOlw9rX0||2013-01-04 16:19:25 1
>> 3dc3f58f-faea-4751-94b5-8a9a076d4b3f||CAESEGYMM1Q34DV8Ev0i12IVKdY||2012-12-31 08:36:21 1
>> 67e79552-7e06-44bd-9e95-87f7cb634de3||CAESEFA6fd_C1PBslKgOj6_BI28||2012-12-29 05:23:11 1
>> 0db77e8d-ed94-43cf-8860-b4e43dfa24aa||CAESECbwN7VY6o8om79mZ905GIA||2013-01-02 16:15:34 1
>> 
>> 
>> 
>> 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
