Whole cluster times out if one node is gone

Jay Adkisson j4yferd at gmail.com
Sat Nov 27 15:21:12 EST 2010


Neville, I'm not sure what you mean.  The network gear is all functional,
otherwise I wouldn't be able to interact with the machines at all (they're
at our colo).  But as far as I understand, if I hard reboot a box (or, in a
real-world scenario, the PDU fails), the switch will happily continue
forwarding packets into nothingness, causing HTTP requests to hang until
they time out.  From what Dan said, I would expect that Riak handles that
sort of situation intelligently.  I guess my remaining questions are:

* How does Riak detect that a node is down, and what could cause that to
take a full minute?
* When N=3, what about a single node failure could cause a read with R=1 to
time out?
* Is there a way to configure how strictly nodes are assumed dead?  I'm
thinking of something like a "timeout" config option.
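
In the meantime, a client-side deadline is one way to keep a read from
hanging while the cluster still lists the dead node as connected.  Here is a
minimal Python sketch, assuming the same URL layout as the wget runs quoted
below.  The function names and the 2-second default are made up for
illustration; this is a workaround in the client, not a Riak setting.

```python
import socket
import urllib.error
import urllib.request


def riak_url(host, bucket, key, r=1, port=8098):
    """Build a read URL in the form used by the wget runs (hypothetical helper)."""
    return f"http://{host}:{port}/riak/{bucket}/{key}?r={r}"


def fetch_with_deadline(url, timeout_s=2.0):
    """Return the response body, or None if the node does not answer in time.

    timeout_s is an arbitrary illustrative value, not a Riak default.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.read()
    except urllib.error.HTTPError:
        # A real HTTP answer (such as the occasional 503) is not a hang,
        # so let the caller see it.
        raise
    except (urllib.error.URLError, socket.timeout):
        # Connection failed or no response arrived within timeout_s.
        return None
```

A caller could treat None as "retry against another node" rather than
waiting out the full minute.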

Peace,
--Jay

On Tue, Nov 23, 2010 at 2:55 PM, Neville Burnell
<neville.burnell at gmail.com> wrote:

> Just a thought ... have you verified your switch, cables, NICs, etc.?
>
>
> On 24 November 2010 09:33, Jay Adkisson <j4yferd at gmail.com> wrote:
>
>> (many profuse apologies to Dan - hit "reply" instead of "reply all")
>>
>> Alrighty, I've done a little more digging.  When I throttle the writes
>> heavily (2/sec) and set R and W to 1 all around, the cluster keeps working
>> for about 15-20 seconds after I restart the node.  Then the read request
>> hangs for about a minute, until node D disappears from connected_nodes in
>> riak-admin status, at which point it returns the desired value (although
>> sometimes I get a 503):
>>
>> --2010-11-23 13:*01:28*--  http://<node A>:8098/riak/<bucket>/<key>?r=1
>> Resolving <node A>... <ip addr>
>> Connecting to <node A>|<ip addr>|:8098... connected.
>> HTTP request sent, awaiting response... *<hang...> *200 OK
>> Length: 3684 (3.6K) [image/jpeg]
>> Saving to: `<key>?r=1'
>>
>> 100%[======================================>] 3,684       --.-K/s   in 0s
>>
>> 2010-11-23 13:*02:21* (49.5 MB/s) - `<key>?r=1' saved [3684/3684]
>>
>> --2010-11-23 13:02:23--  http://<node A>:8098/riak/<bucket>/<key>?r=1
>> Resolving <node A>... <ip addr>
>> Connecting to <node A>|<ip addr>|:8098... connected.
>> HTTP request sent, awaiting response... 200 OK
>> Length: 3684 (3.6K) [image/jpeg]
>> Saving to: `<key>?r=1'
>>
>> 100%[======================================>] 3,684       --.-K/s   in 0s
>>
>> 2010-11-23 13:02:23 (220 MB/s) - `<key>?r=1' saved [3684/3684]
>>
>> Afterwards, node D comes back up and re-joins the cluster seamlessly.
>>
>> Any insights?
>>
>> --Jay
>>
>> On Mon, Nov 22, 2010 at 5:59 PM, Jay Adkisson <j4yferd at gmail.com> wrote:
>>
>>> Hey Dan,
>>>
>>> Thanks for the response!  I tried it again while watching `riak-admin
>>> status` - basically, it takes about 30 seconds of node C being down before
>>> Riak realizes it's gone.  During that time, if I'm writing to the cluster at
>>> all (I throttled it to 2 writes per second for testing), both writes and
>>> reads hang, and sometimes time out.
>>>
>>> I'm using Ripple to do the writes, and wget to test reads, all on node A
>>> for now, since I know it'll be up.  I'm using the default R and W options
>>> for now.
>>>
>>> Thanks for the help and clarification around ringready.
>>>
>>> --Jay
>>>
>>>
>>> On Mon, Nov 22, 2010 at 5:15 PM, Dan Reverri <dan at basho.com> wrote:
>>>
>>>> Your HTTP calls should not be timing out. Are you sending requests
>>>> directly to the Riak node or are you using a load balancer? How much load
>>>> are you placing on node A? Is it a write only load or are there reads as
>>>> well? Can you confirm "all" requests time out or is it a large subset of the
>>>> requests? How large are the objects being written? Are you setting R and W
>>>> in the request? Are you using a particular client (Ruby, Python, etc.)? Can
>>>> you provide the output of "riak-admin status" from node A?
>>>>
>>>> Regarding the ringready command; that is behaving as I would expect
>>>> considering a node is down.
>>>>
>>>> Thanks,
>>>> Dan
>>>>
>>>> Daniel Reverri
>>>> Developer Advocate
>>>> Basho Technologies, Inc.
>>>> dan at basho.com
>>>>
>>>>
>>>> On Mon, Nov 22, 2010 at 4:55 PM, Jay Adkisson <j4yferd at gmail.com> wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> Here's what I'm seeing: I have four nodes A, B, C, and D.  I'm loading
>>>>> lots of data into node A, which is being distributed evenly across the
>>>>> nodes.  If I physically reboot node D, all my HTTP calls time out, and
>>>>> `riak-admin ringready` complains that not all nodes are up.  Is this
>>>>> intended behavior?  Is there a configuration option I can set so it fails
>>>>> more gracefully?
>>>>>
>>>>> --Jay
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> riak-users at lists.basho.com
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>
>>>>>
>>>>
>>>
>>
>