Solr search response time spikes

sean mcevoy sean.mcevoy at gmail.com
Thu Jun 22 10:20:00 EDT 2017


Hi List,

We have a standard Riak cluster with 5 nodes, and at the moment the traffic
levels are fairly low. Each of our application nodes has 25 client
connections, 5 to each Riak node, which are selected round-robin.

Each application-level request involves multiple Riak requests, so our
traffic tends to arrive in small bursts. Everything works fine for
KV gets, puts & deletes, but we're seeing timeouts & weird response time
spikes on Solr search operations.

In the past 36 hours (the only period I have Riak stats for) I see one
response time of 38.8 seconds, another of 20.8 seconds 3 hours before
that, and a third biggest spike of an acceptable 3.5 seconds.

Below are all the search_query stats for the minute containing the 38
second sample (latencies are in microseconds). In that application request
we made 5 Riak search requests to the same index in parallel, which happens
for every request of this type and normally isn't a problem. But in this
case all 5 timed out; on retry, 4 succeeded and one timed out again.

Has anyone seen anything like this before? Is there any known deadlock in
Solr that I might hit if I make the same request on another connection
before the first has completed? That is what happens when our Riak client
times out after 2 seconds and immediately retries.
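
The timeout-and-retry pattern described above can be sketched roughly as
follows. This is illustrative Python, not the actual client code: `search`
is a hypothetical callable standing in for the Riak client's search call,
and the names `search_with_retry` and `search_burst` are made up for the
example. The point it shows is that the timed-out request is not cancelled,
so the retry runs alongside it.

```python
import concurrent.futures as cf

TIMEOUT_S = 2.0   # client-side timeout described above
MAX_ATTEMPTS = 2  # first try plus one immediate retry


def search_with_retry(search, index, query, pool, timeout_s=TIMEOUT_S):
    """Run search(index, query); on timeout, retry immediately.

    Returns (attempt_number, result). `search` is a hypothetical stand-in
    for the real client call.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        future = pool.submit(search, index, query)
        try:
            return attempt, future.result(timeout=timeout_s)
        except cf.TimeoutError:
            # The timed-out request is NOT cancelled: it keeps running
            # while the retry for the same query goes out.
            continue
    raise TimeoutError(f"search timed out after {MAX_ATTEMPTS} attempts")


def search_burst(search, index, queries, timeout_s=TIMEOUT_S):
    """Issue all queries in parallel, as one application request does."""
    n = max(1, len(queries))
    # Enough request workers that a retry never queues behind a request
    # that has already timed out.
    with cf.ThreadPoolExecutor(max_workers=n * MAX_ATTEMPTS) as requests, \
            cf.ThreadPoolExecutor(max_workers=n) as callers:
        futures = [
            callers.submit(search_with_retry, search, index, q, requests,
                           timeout_s)
            for q in queries
        ]
        return [f.result() for f in futures]
```

With 5 queries this can leave up to 10 requests for the same index in
flight at once, which is the situation the deadlock question is about.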

Any advice or pointers welcomed.
Thanks,
//Sean.


Riak node 1
search_query_throughput_one: 14
search_query_throughput_count: 259
search_query_latency_min: 2776
search_query_latency_median: 69411
search_query_latency_mean: 4900973
search_query_latency_max: 38887902
search_query_latency_999: 38887902
search_query_latency_99: 38887902
search_query_latency_95: 2046215
search_query_fail_one: 0
search_query_fail_count: 0

Riak node 2
search_query_throughput_one: 22
search_query_throughput_count: 564
search_query_latency_min: 4006
search_query_latency_median: 8800
search_query_latency_mean: 11834
search_query_latency_max: 25509
search_query_latency_999: 25509
search_query_latency_99: 25509
search_query_latency_95: 24035
search_query_fail_one: 0
search_query_fail_count: 0

Riak node 3
search_query_throughput_one: 6
search_query_throughput_count: 298
search_query_latency_min: 3200
search_query_latency_median: 15391
search_query_latency_mean: 18062
search_query_latency_max: 31759
search_query_latency_999: 31759
search_query_latency_99: 31759
search_query_latency_95: 31759
search_query_fail_one: 0
search_query_fail_count: 0

Riak node 4
search_query_throughput_one: 8
search_query_throughput_count: 334
search_query_latency_min: 2404
search_query_latency_median: 7230
search_query_latency_mean: 10211
search_query_latency_max: 22502
search_query_latency_999: 22502
search_query_latency_99: 22502
search_query_latency_95: 22502
search_query_fail_one: 0
search_query_fail_count: 0

Riak node 5
search_query_throughput_one: 0
search_query_throughput_count: 0
search_query_latency_min: 0
search_query_latency_median: 0
search_query_latency_mean: 0
search_query_latency_max: 0
search_query_latency_999: 0
search_query_latency_99: 0
search_query_latency_95: 0
search_query_fail_one: 0
search_query_fail_count: 0
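
To relate these figures to the 2 second client timeout, here is a small
Python sketch that converts them to seconds. It assumes the latency stats
are in microseconds, which is how Riak documents them and which matches the
38.8 second figure in the message. Only the 95th percentile and max values
are copied in; the helper name `slow_stats` is made up for the example.

```python
CLIENT_TIMEOUT_S = 2.0  # client timeout mentioned in the message

# search_query_latency_95 and _max per node, copied from the stats above.
# Assumed to be microseconds.
LATENCY_US = {
    "node 1": {"95": 2046215, "max": 38887902},
    "node 2": {"95": 24035, "max": 25509},
    "node 3": {"95": 31759, "max": 31759},
    "node 4": {"95": 22502, "max": 22502},
    "node 5": {"95": 0, "max": 0},
}


def slow_stats(latency_us, timeout_s=CLIENT_TIMEOUT_S):
    """Return {node: {stat: seconds}} for stats above the client timeout."""
    slow = {}
    for node, stats in latency_us.items():
        over = {
            name: us / 1_000_000
            for name, us in stats.items()
            if us / 1_000_000 > timeout_s
        }
        if over:
            slow[node] = over
    return slow


if __name__ == "__main__":
    for node, stats in slow_stats(LATENCY_US).items():
        for name, seconds in stats.items():
            print(f"{node} search_query_latency_{name}: {seconds:.1f} s")
```

Only node 1 exceeds the timeout, on both its 95th percentile and its max;
the other nodes stay in the tens of milliseconds.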