Riak Intermittent Read Failures
sean.mcevoy at gmail.com
Mon Jun 26 05:59:45 EDT 2017
I've observed timeouts too, but always on search operations; you might have
seen my thread "Solr search response time spikes".
I'm getting stats by polling this every minute:
The 99 & 100% response times are most interesting for debugging our
What client & timeout value are you using? I'm using the Erlang client,
where the default timeout is 60 seconds, but I've overridden that and am
using 2 seconds.
Interestingly, over the weekend I've started to see a few put & get
timeouts on the application side, but the longest 100% response time is
just under a second, well inside my 2 second timeout, which points to a
network delay.
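To spell out that reasoning, here is a rough sketch in Python. The function name is hypothetical, and it assumes the server-side response time is reported in microseconds while the client timeout is in seconds:

```python
def server_side_explains_timeout(max_response_time_us, client_timeout_s):
    """Return True if the slowest server-side operation took at least
    as long as the client timeout.

    max_response_time_us: longest (100%) response time, assumed microseconds.
    client_timeout_s: timeout configured in the client, in seconds.
    """
    return max_response_time_us >= client_timeout_s * 1_000_000
```

With a longest response time just under a second (say 950000 microseconds) and a 2 second timeout this returns False, so the missing time has to be somewhere outside the node, e.g. the network or the client.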
I'd start by polling these stats and then examining them whenever you get
an application-side timeout. Maybe check the size stats too. If you can
catch which key the operation timed out on, it'd be worth checking the
object size & sibling count for it. If nothing else, this would eliminate
the possibility that it's unique to a particular object.
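A minimal sketch of that kind of polling, in Python rather than Erlang. The URL, the function names and the list of stat names are assumptions on my part, so check them against what your own node's stats output actually contains:

```python
import json
import urllib.request

# Assumed stat names (tail latencies, object size, sibling count).
# Verify these against your own node's stats output before relying on them.
WATCHED = (
    "node_get_fsm_time_99", "node_get_fsm_time_100",
    "node_put_fsm_time_99", "node_put_fsm_time_100",
    "node_get_fsm_objsize_100", "node_get_fsm_siblings_100",
)


def fetch_stats(url="http://127.0.0.1:8098/stats", timeout=2.0):
    """Fetch the stats JSON from one node. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def pick_stats(stats, names=WATCHED):
    """Keep only the watched stats that are present in the stats dict."""
    return {name: stats[name] for name in names if name in stats}
```

Calling `pick_stats(fetch_stats())` once a minute and logging the result with a timestamp gives you something to line up against the times of the application-side timeouts.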
On Sat, Jun 24, 2017 at 12:57 PM, markrthomas <mark.thomas at equifax.com> wrote:
> I'm getting intermittent read failures in my cluster, i.e. timeouts.
> Sometimes an object returns immediately.
> Other times, nothing at all and I get a read-timeout.
> Any ideas on where I start debugging this issue?