Node will not leave cluster

Swinney, Austin Austin at vimeo.com
Wed Jun 6 13:58:49 EDT 2012


I had a similar issue the other day with similar symptoms: one node just seemed to be stuck.  After much frustration looking at the Riak Control interface and the member_status output from other nodes, it turned out that the node that was leaving didn't know it was supposed to be leaving.  It thought it was in a cluster all by itself.  I had to join it back to the actual cluster.  After that it was fine and started rebalancing.  Then I issued the leave command again and it complied.

But to be safe, try looking at member_status on the node that is supposed to be leaving and make sure it actually is leaving.  I got the impression that the other nodes still in the cluster were reporting their last known state for the leaving node, which wasn't the actual state ON the node itself.
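
If you want to compare the views mechanically, here is a rough sketch (not anything official) that parses saved member_status output from several nodes and reports what each one believes about a given node. The row format is taken from the output quoted below; the function names are made up for illustration, and treating a non-percentage Pending value such as "--" as plain text is an assumption on my part.

```python
import re

# One membership row: status, ring share, pending share, quoted node name.
# Header, separator and summary lines have no quoted name, so they are skipped.
ROW = re.compile(r"^\s*(\w+)\s+(\S+)\s+(\S+)\s+'([^']+)'\s*$")


def parse_member_status(text):
    """Return {node: (status, ring, pending)} from member_status output."""
    members = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            status, ring, pending, node = match.groups()
            members[node] = (status, ring, pending)
    return members


def status_by_observer(outputs, node):
    """Map each observing node to the status it reports for `node`.

    `outputs` is {observer_name: member_status_text}. An observer that
    does not list `node` at all maps to None.
    """
    result = {}
    for observer, text in outputs.items():
        entry = parse_member_status(text).get(node)
        result[observer] = entry[0] if entry else None
    return result
```

Feed it the output captured on the leaving node and on one of the other nodes. If the others say "leaving" while the node itself says "valid" (or lists only itself), you are probably in the situation I described.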




On Jun 6, 2012, at 12:50 PM, David Greenstein wrote:

> 
> Thanks Austin.
> 
> Interesting… after running the ring_status command it seems there are several handoffs that are either hung or not completing. The database is fairly small and running the ring_status command over several minutes reveals the same output.
> 
> The member_status output shows 25% for both Ring and Pending on the node that is leaving.
> 
> So, I think something is wrong with the handoff, but, as far as I can tell, there's no indication as to what is wrong in the logs or in the output of these commands.
> 
> ============================== Ownership Handoff ==============================
> Owner:      riak at 10.0.1.14
> Next Owner: riak at 10.0.1.15
> 
> Index: 479555224749202520035584085735030365824602865664
>  Waiting on: []
>  Complete:   [riak_kv_vnode,riak_pipe_vnode]
> 
> Index: 1027618338748291114361965898003636498195577569280
>  Waiting on: []
>  Complete:   [riak_kv_vnode,riak_pipe_vnode]
> 
> Index: 1301649895747835411525156804137939564381064921088
>  Waiting on: []
>  Complete:   [riak_kv_vnode,riak_pipe_vnode]
> 
> -------------------------------------------------------------------------------
> Owner:      riak at 10.0.1.15
> Next Owner: riak at 10.0.1.14
> 
> Index: 913438523331814323877303020447676887284957839360
>  Waiting on: []
>  Complete:   [riak_kv_vnode,riak_pipe_vnode]
> 
> -------------------------------------------------------------------------------
> Owner:      riak at 10.0.1.16
> Next Owner: riak at 10.0.1.15
> 
> Index: 936274486415109681974235595958868809467081785344
>  Waiting on: []
>  Complete:   [riak_kv_vnode,riak_pipe_vnode]
> 
> -------------------------------------------------------------------------------
> 
> 
> ============================== Unreachable Nodes ==============================
> All nodes are up and reachable
> 
> [root at ip-10-0-1-20 ec2-user]# /db/riak/bin/riak-admin member_status
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> leaving    25.0%     25.0%    'riak at 10.0.1.20'
> valid      28.1%     25.0%    'riak at 10.0.1.14'
> valid      20.3%     25.0%    'riak at 10.0.1.15'
> valid      26.6%     25.0%    'riak at 10.0.1.16'
> -------------------------------------------------------------------------------
> Valid:3 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
> 
> 
> 
> 
> 
> Any other clues would be appreciated. Thank you!
> 
> Dave
> 
> 




