Reip(ing) riak node created two copies in the cluster
sharmanitishdutt at gmail.com
Wed May 2 11:05:00 EDT 2012
We have a 12-node Riak cluster. Until now we named every new node riak@<ip_address>. We then decided to rename all the nodes to riak@<hostname>, which makes troubleshooting easier.
After issuing the reip command on two nodes, we noticed in the "status" output that those two nodes were now appearing in the cluster under both the old name and the new name. Other nodes were trying to hand off partitions to the "new" nodes, but apparently they were not able to. After this the whole cluster went down and completely stopped responding to any read/write requests.
member_status displayed the old Riak names in "legacy" mode. Since this is our production cluster, we are desperately looking for a quick remedy. We tried issuing "force-remove" for the old names, restarting all the nodes, and changing the Riak names back to the old ones; none of it helped.
Currently, we are hosting a limited amount of data. What's an elegant way to recover from this mess? Would shutting down all the nodes, deleting the ring directory, and forming the cluster again work?