riak_core questions

Justin Sheehy justin at basho.com
Thu Jul 28 09:20:53 EDT 2011


Hi, Dmitry.

A couple of suggestions...

The reason that you're not seeing an easy way to have nodes automatically added to or removed from the cluster when they come up or go down is that we strongly recommend against such behavior.

The idea is that intentional (administrative) outages are very different in nature from unintentional and potentially transitory outages. We have explicit administrative commands such as "join" and "leave" for the administrative cases, making it very easy to add hosts to or remove hosts from a cluster. When a node is unreachable, you often can't automatically tell whether it is a host problem or a network problem, and you can't automatically tell whether the outage is long-term or short-term. This is why mechanisms such as quorums and hinted handoff exist: to ensure proper operation of the cluster as a whole throughout such outages. Consider the case where you have a network problem such that several of your nodes lose visibility of each other for brief and distinct periods of time. If nodes are auto-added and auto-removed, then you will have quite a bit of churn and potentially a very harmful feedback scenario. Instead of auto-adding and auto-removing, consider using things like riak_core_node_watcher to decide which nodes to interact with on a per-operation basis.
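
To make the per-operation idea a bit more concrete, here is a minimal sketch in Python. It is a language-neutral model only and not the riak_core API: the names MEMBERS, pick_targets and quorum_met are hypothetical, and the up_nodes set simply stands in for whatever a node watcher reports at the moment of the operation. The point it tries to show is that membership changes only through explicit join/leave, while reachability is consulted fresh for each request.

```python
# Hypothetical model (not riak_core code): cluster membership is fixed by
# explicit join/leave, while reachability is checked for every operation.

MEMBERS = ["node1", "node2", "node3", "node4", "node5"]  # changed only by join/leave


def pick_targets(preferred, up_nodes, members=MEMBERS):
    """Return (target, covers) pairs for a single operation.

    Reachable preferred nodes are used directly (covers is None). For each
    unreachable one, a reachable member outside the preferred set stands in
    as a fallback, roughly the role hinted handoff plays, and is paired with
    the node it covers.
    """
    fallbacks = [n for n in members if n in up_nodes and n not in preferred]
    targets = []
    for node in preferred:
        if node in up_nodes:
            targets.append((node, None))
        elif fallbacks:
            targets.append((fallbacks.pop(0), node))
        # With no fallback left, the operation goes ahead with fewer targets
        # and the quorum check decides whether that is enough.
    return targets


def quorum_met(targets, n):
    """True if a majority of the n intended replicas have a target."""
    return len(targets) >= n // 2 + 1


# node2 is unreachable right now; nobody has been added or removed.
preferred = ["node1", "node2", "node3"]
up_nodes = {"node1", "node3", "node4", "node5"}
targets = pick_targets(preferred, up_nodes)
print(targets)  # [('node1', None), ('node4', 'node2'), ('node3', None)]
print(quorum_met(targets, 3))  # True
```

If node2 becomes reachable again a moment later, the next operation simply uses it, and the membership list never changed in between, so there is no churn to feed back on.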

I'm also not sure what you mean by "if the master node goes down" since in most riak_core applications there is no master node. Of course you can create such a mechanism if you need it, but (e.g.) Riak KV and the accompanying applications do not have any notion of a master node and thus do not have any such concern.

I hope that this is useful.

Best regards,

-Justin

More information about the riak-users mailing list