Inconsistent map/reduce results

Keith Dreibelbis kdreibel at gmail.com
Thu Mar 31 19:38:19 EDT 2011


Thanks for the responses.  Changing vnode_cache_entries (0.13) to
map_cache_size (0.14) mostly made the problem go away, and I was able to
remove the random seed hack.

However, now I've got a different problem.  When I do a map/reduce that is
supposed to return 300 items, it appears to time out.  The Java client
throws:

   java.io.IOException: bad message code
       at com.basho.riak.pbc.RiakConnection.receive(RiakConnection.java:93) ~[riak-client-0.14.1-SNAPSHOT.jar:na]
       at com.basho.riak.pbc.MapReduceResponseSource.get_next_response(MapReduceResponseSource.java:78) ~[riak-client-0.14.1-SNAPSHOT.jar:na]
       at com.basho.riak.pbc.MapReduceResponseSource.hasNext(MapReduceResponseSource.java:49) ~[riak-client-0.14.1-SNAPSHOT.jar:na]


The script that Dan provided gives me this error:


kratos:Downloads keith$ sh ./test.sh keiths_test_bukkit 8091 300
===== Loading Data (keiths_test_bukkit) =====
Done
===== Running MapReduce (keiths_test_bukkit) =====
{"error":"timeout"}
kratos:Downloads keith$


I think this has something to do with the fact that my #2 node (of my dev123
cluster) is down and refuses to start again.  But perhaps this is a subject
for another thread, after I investigate how to repair a down Riak node that
won't restart.



Keith


On Thu, Mar 31, 2011 at 1:17 PM, Matthew Heitzenroder
<mheitzenroder at gmail.com> wrote:

> Setting map_cache_size to 0 is a workaround for bug 969
> <https://issues.basho.com/show_bug.cgi?id=969>.  If map_cache_size=0
> alleviates the issue, then upgrading to 0.14.1 should resolve the
> underlying problem with the MapReduce cache.
>
>
> On Thu, Mar 31, 2011 at 12:20 PM, Dan Reverri <dan at basho.com> wrote:
>
>> Hi Keith,
>>
>> The cache entry parameter name changed in 0.14 to "map_cache_size".
>> Setting this parameter to 0 will disable the cache.
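>>
>> For example, the entry in the riak_kv section of app.config would look
>> something like this (a sketch; the surrounding entries are omitted):
>>
>> {map_cache_size, 0},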
>>
>> Regarding the empty MapReduce results, I'll try to reproduce the issue
>> locally and narrow down the cause.
>>
>> Thanks,
>> Dan
>>
>> Daniel Reverri
>> Developer Advocate
>> Basho Technologies, Inc.
>> dan at basho.com
>>
>>
>>
>>> On Tue, Mar 29, 2011 at 6:16 PM, Keith Dreibelbis <kdreibel at gmail.com> wrote:
>>
>>> Followup to this (somewhat old) thread...
>>>
>>> I had resolved my problem by putting the vnode_cache_entries=0 setting in
>>> app.config, doing what Grant said.  But sometime later it began failing
>>> again.  I was getting misses of 25%-50% on records that should have been
>>> found by map reduce but weren't.  At that point I tried Rohman's suggestion
>>> of using a random seed, and that worked around the problem successfully.
>>> But this isn't a very satisfying fix.
>>>
>>> So the vnode_cache_entries=0 thing doesn't really fix it after all?  Is
>>> there something else to put in the config that would make this work
>>> properly, without the random seed hack?  BTW since the original thread I
>>> have upgraded from 0.13 to 0.14, and the bug is still there.
>>>
>>>
>>> Keith
>>>
>>>
>>> On Thu, Mar 10, 2011 at 6:56 PM, Antonio Rohman Fernandez
>>> <rohman at mahalostudio.com> wrote:
>>>
>>>> If you want to avoid caching (without changing the configuration), you can
>>>> put a random variable in your map or reduce phase, or both... that does the
>>>> trick for me, as the query will then always be different:
>>>>
>>>> $seed = randomStringHere; // placeholder for a freshly generated random string
>>>>
>>>> {"map":{"language":"javascript","source":"function(v,k,a) {
>>>> var seed='.$seed.'; var x = Riak.mapValuesJson(v)[0]; return [v.values[0].data]; }"}}
>>>>
>>>> Rohman
>>>>
>>>> On Thu, 10 Mar 2011 17:47:49 -0800, Keith Dreibelbis
>>>> <kdreibel at gmail.com> wrote:
>>>>
>>>> Thanks for the prompt response, Grant.  I made the configuration change
>>>> you suggested, and it fixed my problem.
>>>>
>>>> Some followup questions:
>>>> - is it possible to configure this dynamically on a per-bucket basis, or
>>>>   just per-server like it is now?
>>>> - is this fixed in a newer version?
>>>>
>>>> On Thu, Mar 10, 2011 at 2:56 PM, Grant Schofield <grant at basho.com> wrote:
>>>>
>>>>> There are currently some bugs in the MapReduce caching system.  The best
>>>>> thing to do would be to disable the feature; on 0.13 you can do this by
>>>>> editing or adding the vnode_cache_entries entry in the riak_kv section of
>>>>> your app.config.  The entry would look like:
>>>>>
>>>>> {vnode_cache_entries, 0},
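>>>>>
>>>>> In context, the riak_kv section would look something like this (a
>>>>> sketch; the other entries are whatever your app.config already has):
>>>>>
>>>>> {riak_kv, [
>>>>>     %% ... existing riak_kv settings ...
>>>>>     {vnode_cache_entries, 0}
>>>>> ]},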
>>>>>
>>>>> Grant Schofield
>>>>> Developer Advocate
>>>>> Basho Technologies
>>>>>
>>>>> On Mar 10, 2011, at 4:16 PM, Keith Dreibelbis wrote:
>>>>>
>>>>> Hi riak-users,
>>>>>
>>>>> I'm trying to do a map/reduce query from Java on a 0.13 server, and get
>>>>> inconsistent results.  What I'm doing should be pretty simple.  I'm hoping
>>>>> someone will notice an obvious error in here, or have some insight.
>>>>>
>>>>> This is an automated test.  I'm doing a simple query where I'm trying to
>>>>> get the keys for records with a certain field value.  In SQL it would look
>>>>> like "SELECT id FROM table WHERE age = '32'".  In Java I'm invoking it
>>>>> like this:
>>>>>
>>>>> MapReduceResponse r = riak.mapReduceOverBucket(getBucket())
>>>>>     .map(JavascriptFunction.anon(func), true)
>>>>>     .submit();
>>>>>
>>>>> where riak is a RiakClient, getBucket() returns the name of the bucket,
>>>>> and func is a string that looks like:
>>>>>
>>>>> function(value, keyData, arg) {
>>>>>     var data = Riak.mapValuesJson(value)[0];
>>>>>     if (data.age == "32")
>>>>>         return [value.key];
>>>>>     else
>>>>>         return [];
>>>>> }
>>>>>
>>>>> No reduce phase.  All entries in the example bucket are JSON and have an
>>>>> age field.  This initially works correctly: it gets back the matching
>>>>> records as expected.  It also works in curl.  It's an automated test, so
>>>>> each time I run it, it uses a different bucket.  After about a dozen
>>>>> queries, this starts to fail: it returns an empty result when it should
>>>>> have found records.  It fails in curl at the same time.
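>>>>>
>>>>> For reference, the equivalent curl invocation would look something like
>>>>> this (a sketch; my_bucket stands in for the generated bucket name):
>>>>>
>>>>> curl -X POST http://127.0.0.1:8091/mapred -H "Content-Type: application/json" -d '{"inputs":"my_bucket","query":[{"map":{"language":"javascript","source":"function(value, keyData, arg) { var data = Riak.mapValuesJson(value)[0]; if (data.age == \"32\") return [value.key]; else return []; }","keep":true}}]}'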
>>>>>
>>>>> I initially suspected this might have something to do with doing map
>>>>> reduce too soon after writing, and the write not yet being available on
>>>>> all nodes.  However, I changed the bucket schema entries for w, r, rw, and
>>>>> dw from "quorum" to "all", and this still happens (is there another bucket
>>>>> setting I missed?).  In addition, I only have 3 nodes (I'm using the
>>>>> dev123 example), and I am running the curl query long enough after the
>>>>> writes.
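>>>>>
>>>>> For what it's worth, the property change can be made over HTTP with
>>>>> something like this (again a sketch, with my_bucket as a stand-in):
>>>>>
>>>>> curl -X PUT http://127.0.0.1:8091/riak/my_bucket -H "Content-Type: application/json" -d '{"props":{"r":"all","w":"all","dw":"all","rw":"all"}}'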
>>>>>
>>>>> Here's the strange part that makes me suspicious.  If I make
>>>>> insignificant changes to the query, for example change the double quotes
>>>>> to single quotes, add whitespace or extra parentheses, etc., then it
>>>>> suddenly works again.  It will work on an existing bucket, and on
>>>>> subsequent tests, but again only about a dozen times before it starts
>>>>> failing again.  Same behavior in curl.  This makes me suspect that the
>>>>> server is doing some incorrect caching around this JS function, keyed on
>>>>> the function string.
>>>>>
>>>>> Any explanation of what's going on?
>>>>>
>>>>> Keith
>>>>>
>>>> --
>>>> Antonio Rohman Fernandez
>>>> CEO, Founder & Lead Engineer
>>>> rohman at mahalostudio.com
>>>>
>>>> Projects:
>>>> MaruBatsu.es <http://marubatsu.es>
>>>> PupCloud.com <http://pupcloud.com>
>>>> Wedding Album <http://wedding.mahalostudio.com>
>>>>
>>>
>>
>