have a new node take over the role of a downed, unrecoverable node?

Leonid Riaboshtan perfecthumanorama at gmail.com
Sat Oct 16 21:52:51 EDT 2010


>> However, you'll need to reip on all machines.

Hmm, shouldn't stuff like that be handled automatically by Riak? I mean, I
have a cluster where nodes come and go. After each join or leave, do I need
to do something on every node in the cluster to introduce the new node or
remove the old one and repartition the data?

The question also sounds rather strange to me: what is a node's role in a
system where all nodes are equal? Everywhere it is said that Riak will
automatically re-balance data as nodes join and leave the cluster. Isn't
that also the case when a node becomes unreachable, so that the cluster
repartitions the data to stay healthy (for example, keeping n_val replicas
of each key)?

Or should something else watch the nodes' states and tell the cluster that
a node is down?

It's also said that:
The ring state is shared around the cluster by means of a "gossip protocol".
Whenever a node changes its claim on the ring, it announces its change via
this protocol. It also periodically re-announces what it knows about the
ring, in case any nodes missed previous updates.

Isn't the cluster checking for unavailable nodes that way too?
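
To show what I mean, here is a toy Python model of how I picture the gossip
described above. It is only a sketch under my own assumptions, not Riak's
code: Node, build_cluster, gossip_round, remove_from_ring and owners are
names I made up, and the real ring and claim logic are surely different. In
this model gossip only copies newer ring state between reachable nodes, so a
node that vanishes without changing its claim keeps its partitions until
someone removes it explicitly.

```python
import random


class Node:
    """One member of a toy cluster holding its own copy of the ring."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.ring = {}      # partition index -> name of the claiming node
        self.version = 0    # bumped whenever a claim changes


def build_cluster(names, partitions):
    """Create nodes that all start with the same round-robin ring."""
    ring = {p: names[p % len(names)] for p in range(partitions)}
    nodes = [Node(n) for n in names]
    for node in nodes:
        node.ring = dict(ring)
        node.version = 1
    return nodes


def gossip_round(nodes, rng):
    """Every reachable node pushes its ring to one random reachable peer."""
    alive = [n for n in nodes if n.up]
    for sender in alive:
        peers = [n for n in alive if n is not sender]
        if not peers:
            continue
        receiver = rng.choice(peers)
        # Only newer ring state replaces the receiver's copy.
        if sender.version > receiver.version:
            receiver.ring = dict(sender.ring)
            receiver.version = sender.version


def remove_from_ring(coordinator, dead_name):
    """Explicit removal: hand the dead node's partitions to remaining owners."""
    survivors = sorted({o for o in coordinator.ring.values() if o != dead_name})
    orphaned = sorted(p for p, o in coordinator.ring.items() if o == dead_name)
    for i, p in enumerate(orphaned):
        coordinator.ring[p] = survivors[i % len(survivors)]
    coordinator.version += 1


def owners(node):
    """Names of all nodes that claim at least one partition in this view."""
    return set(node.ring.values())


if __name__ == "__main__":
    rng = random.Random(42)
    nodes = build_cluster(["a", "b", "c", "d"], partitions=8)
    nodes[3].up = False                  # "d" disappears without leaving
    for _ in range(20):
        gossip_round(nodes, rng)
    print("after crash:", sorted(owners(nodes[0])))   # "d" still claims partitions
    remove_from_ring(nodes[0], "d")      # operator action on node "a"
    for _ in range(20):
        gossip_round(nodes, rng)
    print("after removal:", [sorted(owners(n)) for n in nodes if n.up])
```

In this sketch the crashed node "d" stays in every ring view no matter how
many gossip rounds run; only the explicit removal changes ownership, and
gossip then spreads that change to the other nodes.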

I don't mean to offend anyone; I'm just trying to make things clearer for
myself.

On Sun, Oct 17, 2010 at 4:56 AM, Jesse Newland <jesse at railsmachine.com> wrote:

> Thanks Sean!
>
> Regards -
>
> Jesse Newland
> ---
> jesse at railsmachine.com
> 404.216.1093
>
> On Oct 16, 2010, at 7:01 PM, Sean Cribbs wrote:
>
> > Sorry, I wasn't completely clear. You can make any node "leave" from
> > the console, e.g.:
> >
> > riak_core_gossip:remove_from_cluster('riak@some-host.com').
> >
> > Sean Cribbs <sean at basho.com>
> > Developer Advocate
> > Basho Technologies, Inc.
> > http://basho.com/
> >
> > On Oct 16, 2010, at 5:05 PM, Alexander Sicular wrote:
> >
> >> This has come up before. "Leave" is what is currently available and
> >> needs to be run on the node that wants to leave. This, of course,
> >> means the node needs to be available. What you really want is a kick
> >> like "remove" or something that doesn't exist yet, afaik. I think
> >> there is a ticket open.
> >>
> >> -alexander
> >>
> >> On 2010-10-16, Jesse Newland <jesse at railsmachine.com> wrote:
> >>> The description of leave on the wiki mentions that it "causes the node
> >>> to leave the cluster it participates in" - I assume "the node" refers
> >>> to the node this command is run on? How would I "leave" a node that I
> >>> can't run this command on anymore?
> >>>
> >>> Regards -
> >>>
> >>> Jesse Newland
> >>> ---
> >>> jesse at railsmachine.com
> >>> 404.216.1093
> >>>
> >>> On Oct 16, 2010, at 3:16 PM, Sean Cribbs wrote:
> >>>
> >>>> `leave` is exactly what you want to do then.  Once the old node has
> >>>> left (use `ringready` to track its exit), add the new node.
> >>>>
> >>>> If the EBS volume containing the node's data was not lost, you could
> >>>> mount it onto the new node to save some recovery time, and then reip.
> >>>> However, you'll need to reip on all machines.
> >>>>
> >>>> Sean Cribbs <sean at basho.com>
> >>>> Developer Advocate
> >>>> Basho Technologies, Inc.
> >>>> http://basho.com/
> >>>>
> >>>> On Oct 16, 2010, at 2:54 PM, Jesse Newland wrote:
> >>>>
> >>>>> I'm running through some disaster scenarios before bringing a riak
> >>>>> cluster into production, and have run into a scenario that I can't
> >>>>> work through the proper resolution for just yet:
> >>>>>
> >>>>> Say an ec2 instance that was a part of a ring went away quickly, and
> >>>>> data from it was unrecoverable.
> >>>>>
> >>>>> How might I go about telling the rest of the ring that a new instance
> >>>>> that I've brought up should take over the vnodes that were on that
> >>>>> old instance? This sounds like a job for `riak-admin reip`, but after
> >>>>> running `reip downed_node new_node`, `riak-admin ringready` still
> >>>>> shows that the old nodes are a part of the ring and down. I guess
> >>>>> what I'd like to do is a posthumous `leave`?
> >>>>>
> >>>>> Thoughts?
> >>>>>
> >>>>> Regards -
> >>>>>
> >>>>> Jesse Newland
> >>>>> ---
> >>>>> jesse at railsmachine.com
> >>>>> 404.216.1093
> >>>>>
> >>>>> _______________________________________________
> >>>>> riak-users mailing list
> >>>>> riak-users at lists.basho.com
> >>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >>>>
> >>>
> >>>
> >>
> >> --
> >> Sent from my mobile device
> >
>
>