What is the "riak_kv_vnodeq_total" metric?
dbrady at weborama.com
Wed Jan 16 13:29:22 EST 2013
I won't bore everyone with details here: the short story is I ran "riak-admin cluster leave/plan/commit" to remove a node and got a lot of grief from our five-node ring.
The ring became badly destabilized: each time I re-ran "riak-admin ring-status", one or more nodes would show as down, then up again.
I have finally isolated a wildly misbehaving node (not the one I was trying to make "leave", by the way).
None of the existing metrics I was graphing highlighted a problem, so I went through "/stats" (yet again), looking at the undocumented metrics to see what looked interesting.
I noticed that riak_kv_vnodeq_total was showing up with a non-zero value, so I set up a graph which plots the difference between the previous and current value (like I do for the other "*_total" metrics).
The results were *very* interesting! The other four nodes showed occasional values of 1, 2, even 3 once or twice. Our troublesome node showed 152, 8000, 704... !!
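For anyone wanting to reproduce the graph, here is a minimal sketch of the previous-versus-current calculation, assuming each poll of "/stats" returns a JSON document. The helper names (metric_value, deltas) and the sample numbers are made up for illustration and are not real cluster data. It also assumes nothing about whether the metric is a cumulative counter, so a delta may come out negative.

```python
import json


def metric_value(stats_json, name):
    """Pull one numeric metric out of a /stats JSON document (given as text)."""
    return json.loads(stats_json)[name]


def deltas(samples):
    """Return the difference between each sample and the one before it."""
    return [curr - prev for prev, curr in zip(samples, samples[1:])]


# Hypothetical poll results, one JSON document per polling interval.
polls = [
    '{"riak_kv_vnodeq_total": 10}',
    '{"riak_kv_vnodeq_total": 11}',
    '{"riak_kv_vnodeq_total": 13}',
    '{"riak_kv_vnodeq_total": 13}',
]

values = [metric_value(p, "riak_kv_vnodeq_total") for p in polls]
print(deltas(values))
```

The first poll produces no point on the graph, since there is no previous value to subtract.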
Does anyone know what riak_kv_vnodeq_total indicates?