What is the "riak_kv_vnodeq_total" metric?

Dave Brady dbrady at weborama.com
Thu Jan 17 05:34:29 EST 2013


Thanks, Russell!

----- Original Message -----
From: "Russell Brown" <russell.brown at me.com>
To: "Dave Brady" <dbrady at weborama.com>
Cc: "riak-users" <riak-users at lists.basho.com>
Sent: Wednesday, January 16, 2013 9:20:20 PM
Subject: Re: What is the "riak_kv_vnodeq_total" metric?

Hi Dave,

On 16 Jan 2013, at 11:29, Dave Brady <dbrady at weborama.com> wrote:

> Greetings, 
> 
> I won't bore everyone with details here: the short story is I ran "riak-admin cluster leave/plan/commit" to remove a node and got a lot of grief from our five-node ring. 
> 
> The ring was badly destabilized: one or more nodes would show as down, then up, across repeated runs of "riak-admin ring-status". 
> 
> I have finally isolated a wildly misbehaving node (not the one I was trying to make "leave", by the way). 
> 
> None of the existing metrics I was graphing highlighted a problem, so I went through "/stats" (yet again), looking at the undocumented metrics to see what looked interesting. 
> 
> I noticed that riak_kv_vnodeq_total was showing up with a non-zero value, so I set up a graph which plots the difference between the previous and current values (like I do for the other "*_total" metrics). 
> 
> The results were *very* interesting! The other four nodes showed occasional values of 1, 2, even 3 once or twice. Our troublesome node showed 152, 8000, 704... !! 
> 
> Does anyone know what riak_kv_vnodeq_total indicates? 

It is the total number of messages in the queues for all the riak_kv_vnodes running on the node. Large queues mean that one or more vnodes are not able to keep up with the requests made of them. 
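
For anyone who wants to watch for this across a ring, here is a minimal Python sketch that reads riak_kv_vnodeq_total from each node's "/stats" output and lists the nodes whose queues look backed up. It is only an illustration under assumptions: the port (8098), the threshold of 100, and the function and node names are all hypothetical choices, not anything Riak prescribes.

```python
import json
from urllib.request import urlopen

STAT = "riak_kv_vnodeq_total"


def fetch_stats(host, port=8098, timeout=5):
    """Fetch and parse the JSON served at a node's /stats endpoint.

    The port is an assumption; use whatever your HTTP listener is bound to.
    """
    with urlopen(f"http://{host}:{port}/stats", timeout=timeout) as resp:
        return json.load(resp)


def backed_up_nodes(stats_by_node, threshold=100):
    """Return (node, queue_total) pairs above the threshold, largest first.

    stats_by_node maps a node name to its parsed /stats dictionary.
    The threshold is an arbitrary illustrative value, not a Riak default.
    """
    flagged = []
    for node, stats in stats_by_node.items():
        total = int(stats.get(STAT, 0))  # treat a missing stat as an empty queue
        if total > threshold:
            flagged.append((node, total))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical node names, with values like those reported in this thread.
    sample = {
        "riak1": {STAT: 2},
        "riak2": {STAT: 8000},
        "riak3": {STAT: 1},
    }
    for node, total in backed_up_nodes(sample):
        print(f"{node}: {total} messages queued")
```

Against a live ring you would build the dictionary with something like {host: fetch_stats(host) for host in hosts} and graph or alert on the raw value, since it is already a count of queued messages at the moment of the request.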


Cheers

Russell

> 
> Thanks!
> 
> --
> Dave Brady
