Riak CS/Stanchion troubleshooting (Retrieval of user record)

Vladyslav Zakhozhai v.zakhozhai at smartweb.com.ua
Fri Nov 13 12:04:59 EST 2015


Hello.

I have a Riak CS cluster with 18 nodes. Each node runs Riak CS and Riak,
and there is one Stanchion node.

Versions:
Riak 1.4.12
Riak CS 1.5.0
Stanchion 1.5.0

Riak CS and Riak are placed behind HAProxy load balancers:

WAN -> HAProxy -> Riak CS nodes -> HAProxy -> Riak nodes.
and
Stanchion -> HAProxy -> Riak

Today, due to a traffic spike (about 1000 rps) on the cluster, 50% of
Riak CS responses were HTTP 500 or 503 (querying the /riak-cs/ping resource
also failed).
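
To narrow down which Riak CS nodes were failing, the ping resource can be queried on each node directly instead of through HAProxy. The following is only a minimal sketch: the host names and port 8080 in the usage note are placeholders, and the helper names `ping_status` and `failing_nodes` are made up for illustration. It assumes /riak-cs/ping can be requested with a plain HTTP GET.

```python
import urllib.error
import urllib.request


def ping_status(base_url, timeout=2.0, opener=urllib.request.urlopen):
    """Return the HTTP status of GET <base_url>/riak-cs/ping, or None if unreachable."""
    url = base_url.rstrip("/") + "/riak-cs/ping"
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers (for example 500 or 503) are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, timeout, DNS failure, and similar.
        return None


def failing_nodes(base_urls, check=ping_status):
    """Return (url, status) for every node whose ping did not answer 200."""
    results = [(url, check(url)) for url in base_urls]
    return [(url, status) for url, status in results if status != 200]
```

For example, `failing_nodes(["http://cs-node-01:8080", "http://cs-node-02:8080"])` (hypothetical addresses) returns the nodes that answered with something other than 200, with a status of `None` meaning the node could not be reached at all.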

In the Riak CS logs I saw the following messages:

2015-11-13 13:13:09.514 [error]
<0.11264.1387>@riak_cs_wm_common:maybe_create_user:222 Retrieval of user
record for s3 failed. Reason: disconnected
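
To see how often each failure reason occurs, error lines of this shape can be tallied. This is a rough sketch, not an official tool: the regular expressions are only inferred from the line quoted above (which is a single line in the log file and is wrapped here by the mail client), and the function name `count_error_reasons` is made up.

```python
import re
from collections import Counter

# Pattern inferred from the quoted line:
# "<timestamp> [level] <pid>@module:function:line message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<level>\w+)\]\s+"
    r"(?P<pid><[\d.]+>)(?:@(?P<where>\S+))?\s+"
    r"(?P<msg>.*)$"
)
REASON_RE = re.compile(r"Reason: (?P<reason>.+)$")


def count_error_reasons(lines):
    """Count (code location, reason) pairs over [error] lines that carry a 'Reason:'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("level") != "error":
            continue
        reason = REASON_RE.search(match.group("msg"))
        if reason:
            counts[(match.group("where"), reason.group("reason"))] += 1
    return counts
```

It can be fed an open log file, for example `count_error_reasons(open(path))`, where `path` is wherever the Riak CS console or error log lives on the node. Lines without a "Reason:" part are skipped.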

I also see the following in the Riak CS logs:
2015-11-13 17:31:52.995 [error] <0.11254.6534> Lager event handler
error_logger_lager_h exited with reason
{'EXIT',{{badmatch,["/buckets/uaprom-image/objects/272547384_cid1322007_pid183135512-26a7c1f3.jpg",{error,{error,{badmatch,{error,closed}},[{webmachine_request,recv_unchunked_body,3,[{file,"src/webmachine_request.erl"},{line,471}]},{webmachine_request,call,2,[{file,"src/webmachine_request.erl"},{line,193}]},{wrq,stream_req_body,2,[{file,"src/wrq.erl"},{line,121}]},{riak_cs_wm_object,handle_normal_put,2,[{file,"src/riak_cs_wm_object.erl"},{line,341}]},{riak_cs_wm_common,accept_body,2,[{file,...},...]},...]}},...]},...}}

I suspect that there was a problem between Riak CS and Stanchion, or between
Stanchion and Riak. I have no clear idea how to troubleshoot Stanchion. The
main reason is the following: Stanchion works fine and the service is up (it
answers the ping command), but it is very laconic: there is almost nothing
in the console and error logs (even at the debug log level).

Riak CS adds, removes, and gets properties through the Stanchion service. Am
I right? I can't tell exactly where my bottleneck is: Riak, Riak CS, or
Stanchion.

When we need authenticated access to read an object from a bucket, do we
need Stanchion? If not, I can't understand why I had so many errors while
getting objects from Riak CS.

Thank you in advance.

P.S. Sometimes, when there are issues with connectivity between Riak CS and
Stanchion, I need to restart Riak CS.