riak TS max concurrent queries + overload error

Chris.Johnson at vaisala.com
Thu Jul 28 01:10:14 EDT 2016


Hi Cian,

Thank you! I should've mentioned in my initial email that I thought we were experiencing the same bug you called out (in fact, the 2nd comment on that GitHub issue is from me).

So, what I'm really curious about is whether or not the original "overload" error is happening because we're hitting the limit on TS max concurrent queries or if riak is actually "overloaded" and we shouldn't increase the configuration value for max concurrent queries.

I'd like to know whether or not I should expect a certain value for max concurrent queries to be stable and performant for some given hardware specs. This is an experiment that we will probably run in house to determine a good value, but it would be great to know what range is expected to perform well.

Also, I have no idea if the max concurrent queries setting includes subqueries over multiple quanta. For instance, if I have 4 TS queries hitting a riak node configured for 12 max queries and each query spans 3 - 4 quanta, should I expect an "overload" error?

Thank you for the advice on implementing client backoff! Hopefully, we can do that as well as increase the overall TS query capacity of our cluster with a simple configuration change. I'm suspicious that we have a very conservative value at the moment.

Chris
________________________________________
From: Cian Synnott <cian at emauton.org>
Sent: Wednesday, July 27, 2016 6:03 PM
To: Johnson Chris CJOH
Cc: riak-users at lists.basho.com
Subject: Re: riak TS max concurrent queries + overload error

Hi Chris,

This sounds like the issue described at
  https://github.com/basho/riak_kv/issues/1418

On Wed, Jul 27, 2016 at 11:19 PM,  <Chris.Johnson at vaisala.com> wrote:
> Also, does anyone have any recommendations on query pooling so we can
> guarantee that multiple clients will not generate more queries than the
> cluster can handle?
>
Probably the right thing to do (when the RPC server is fixed) is to
have the clients independently check for backpressure from Riak (e.g.
overload messages like this), retry with exponential backoff, and have
each retry increment a counter somewhere in your monitoring system to
make that problem visible.

This should allow you to handle overload (somewhat) gracefully,
respond to critical events (e.g. an alert), or to see any overload
trends over time.
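
The retry loop described above could look roughly like the sketch below.
It is only an illustration under stated assumptions: `OverloadError`,
`run_query`, and `on_retry` are hypothetical names, not part of any Riak
client API. `run_query` stands for whatever call issues the TS query and
is assumed to raise `OverloadError` when Riak answers with an overload
message; `on_retry` stands for whatever increments the counter in your
monitoring system.

```python
import random
import time


class OverloadError(Exception):
    """Hypothetical: raised by your own client wrapper on a Riak overload reply."""


def query_with_backoff(run_query, on_retry, max_attempts=5,
                       base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call run_query(), retrying on OverloadError with exponential backoff.

    on_retry() is called once before each retry so the overload becomes
    visible in monitoring. The last failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return run_query()
        except OverloadError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the overload
            on_retry()  # e.g. increment a retry counter
            # Delay ceiling doubles per attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps independent clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

The jitter matters because several clients backing off on the same
schedule would otherwise hit the node again at the same moment. The
default delays and attempt count here are placeholders to tune, not
recommended values.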

Cian



