simulating physical node crash

Kelly McLaughlin kelly at basho.com
Fri Nov 18 18:51:18 EST 2011


Francisco,

The problem you are experiencing is not due to Search; it seems more likely due to the way the partitions of the ring have been assigned to the nodes of your cluster. If only one node has failed and you are getting the no_candidate_nodes error, it means that the preference list of at least one of the MapReduce inputs is made up entirely of partitions on that one downed node. In other words, the three replicas of some of your data have all ended up on the same physical node. This is an unusual situation, so it would be good to examine the contents of your ring to verify that this is the case. If you could use riak attach to get a console on one of your nodes, run the following command, and put the output in a gist or pastebin, it will hopefully shed light on the problem.

	io:format("~p~n", [riak_core_ring_manager:get_my_ring()]).

The problem Martin described is slightly different. What he observed appears to be that the Erlang processes handling the work of the map phase were dying. Since he described the cluster as being under heavy load when this happened, I suspect it is related to hitting a resource limit in the Erlang VM or the operating system. My suggestion in this case is to use Riak 1.0.2 and the pipe MapReduce system; it is much more robust for clusters under heavy load and will provide a better experience.
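
For reference, the MapReduce system is selected in app.config under the riak_kv section. On 1.0.x the pipe system should already be the default, but you can make the choice explicit; the stanza below is from memory, so double-check it against the app.config that ships with 1.0.2:

	%% app.config excerpt: use the pipe-based MapReduce implementation
	{riak_kv, [
	    {mapred_system, pipe}
	]}

Hope that helps.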

Kelly


On Nov 17, 2011, at 4:10 PM, francisco treacy wrote:

> This morning one node went down (3-node 0.14 cluster) and I started getting the dreaded `no_candidate_nodes,exhausted_prefist` error posted earlier.
> 
> If 2 nodes are remaining and I always use N=3, R=1... why is it failing? Could it be something to do with my use of Search?
> 
> Thanks
> Francisco
> 
> 
> 2011/9/28 Martin Woods <mw2134 at gmail.com>
> Hi Francisco
> 
> I've seen the same error in a dev environment running on a single Riak node with an n_val of 1, so in my case it had nothing to do with a failing node. I wasn't running Riak Search either. I posted a question about it to this list a week or so ago but haven't seen a reply yet.
> 
> So indeed, does anyone know what's causing this error and how we can avoid it?
> 
> Regards,
> Martin. 
> 
> 
> 
> On 28 Sep 2011, at 20:39, francisco treacy <francisco.treacy at gmail.com> wrote:
> 
>> Regarding (3), I found a Forcing Read Repair contrib function (http://contrib.basho.com/bucket_inspector.html) which should help.
>> 
>> Otherwise, regarding the m/r error: all of my buckets use the default n_val and write quorum. Could it be that some data never reached that particular node in the cluster? That is, should I have used W=3? During the failure, many assets were returning 404s, which triggered read repair (and were OK upon subsequent requests), but no luck with the Map/Reduce function (it kept failing). Could it have something to do with Riak Search?
>> 
>> Thanks,
>> 
>> Francisco
>> 
>> 
>> 2011/9/26 francisco treacy <francisco.treacy at gmail.com>
>> Hi all,
>> 
>> I have a 3-node Riak cluster, and I am simulating the scenario of physical nodes crashing.
>> 
>> When 2 nodes go down, and I query the remaining one, it fails with:
>> 
>> {error,
>>     {exit,
>>         {{{error,
>>               {no_candidate_nodes,exhausted_prefist,
>>                   [{riak_kv_mapred_planner,claim_keys,3},
>>                    {riak_kv_map_phase,schedule_input,5},
>>                    {riak_kv_map_phase,handle_input,3},
>>                    {luke_phase,executing,3},
>>                    {gen_fsm,handle_msg,7},
>>                    {proc_lib,init_p_do_apply,3}],
>>                   []}},
>>           {gen_fsm,sync_send_event,
>>               [<0.31566.2330>,
>>                {inputs,
>> 
>> (...)
>> 
>> Here I'm doing an M/R, with inputs fed by Search.
>> 
>> (1) All of the involved buckets have N=3, and all involved requests use R=1 (I don't really need quorum for this use case).
>> 
>> Why is it failing? I'm sure I'm missing something basic here.
>> 
>> (2) Probably worth noting: those 3 nodes are spread across *two* physical servers (1 on the small one, 2 on the beefier one). I've heard this is "not a good idea", though I'm not sure why. These two servers are still definitely enough for our current load; should I consider adding a third one?
>> 
>> (3) To overcome the aforementioned error, I added a new node to the cluster (installed on the small server). Now the setup is: 4 nodes = 2 on the small server, 2 on the beefier one.
>> 
>> When 2 nodes go down, this works. Which brings me to another topic... could you point me to good strategies to "pre-"invoke read repair? Is it up to clients to scan the keyspace forcing reads? It's a disaster usability-wise when the first users start getting 404s all over the place.
>> 
>> Francisco
>> 
