replacing node results in error with diag
m.vernimmen at comparegroup.eu
Tue Sep 30 15:39:47 EDT 2014
Today I finished upgrading from 2.0.0-pre20 to 2.0.0-1. Once that was done, I did a node replace according to the instructions at http://docs.basho.com/riak/latest/ops/running/nodes/replacing/
Once the replacement was done, our monitoring notified us about a problem with the cluster. Our monitoring runs `riak-admin diag`, and each of the nodes is now giving the output I've posted here: https://gist.github.com/anonymous/a3133333a07b0cd1da1c
The node referenced in the diag output is the replaced node, which is no longer in the cluster. I confirmed the ring had settled: the replaced node is no longer listed in the cluster's web interface, nor does it appear in the `riak-admin status` output. Only a restart of the riak service on each of the nodes resolves the problem, and restarting a single node fixes the diag status for that node only.
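For reference, the steps I followed were roughly the ones below (node names here are placeholders, not our actual hosts), per the docs linked above:

```shell
# On the new node: start it and join it to the cluster
riak start
riak-admin cluster join riak@existing-node

# Stage the replacement of the old node, review the plan, then commit
riak-admin cluster replace riak@old-node riak@new-node
riak-admin cluster plan
riak-admin cluster commit

# Workaround for the stale diag output: restart riak on a node
# (this only clears the diag status on that particular node)
riak stop && riak start
```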
To me it seems like some state is left behind on the cluster nodes after a node is replaced, causing the `riak-admin diag` command to fail. Has anyone else seen this? Would this classify as a bug, or did I simply do something wrong? :)