Riak CS/Stanchion troubleshooting (Retrieval of user record)

Kazuhiro Suzuki kaz at basho.com
Sun Nov 15 21:01:17 EST 2015


Hi,

HAProxy's timeout settings often cause disconnected errors on a Riak
CS deployment under high workload. The termination_state field [1] in
tcplog [2] tells you whether a timeout occurred.
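
As a rough sketch of how to check this, the Python snippet below counts
sessions whose termination state starts with "c" or "s", which the HAProxy
documentation [1] describes as a client-side or server-side timeout. It
assumes the default "option tcplog" line layout described in [2]. The
frontend, backend, and server names in SAMPLE are made up for illustration,
and count_timeouts is a hypothetical helper, not part of HAProxy or Riak CS.

```python
import re
from collections import Counter

# Matches the part of a default "option tcplog" line after the client address:
# [accept_date] frontend backend/server Tw/Tc/Tt bytes_read termination_state
TCPLOG_RE = re.compile(
    r"\[(?P<accept_date>[^\]]+)\]\s+"
    r"(?P<frontend>\S+)\s+"
    r"(?P<backend>[^/\s]+)/(?P<server>\S+)\s+"
    r"(?P<timers>[-+\d/]+)\s+"
    r"(?P<bytes_read>\+?\d+)\s+"
    r"(?P<state>\S{2})\s"
)


def count_timeouts(lines):
    """Count sessions per (backend, state) that ended with a timeout.

    A first state character of 'c' means the client-side timeout expired,
    's' means the server-side timeout expired.
    """
    counts = Counter()
    for line in lines:
        m = TCPLOG_RE.search(line)
        if m and m.group("state")[0] in ("c", "s"):
            counts[(m.group("backend"), m.group("state"))] += 1
    return counts


# Illustrative lines only: the first follows the HAProxy documentation
# example, the others use hypothetical names and values.
SAMPLE = [
    "haproxy[14387]: 10.0.1.2:33313 [06/Feb/2009:12:12:51.443] fnt bck/srv1 0/0/5007 212 -- 0/0/0/0/3 0/0",
    "haproxy[14387]: 10.0.1.2:33314 [06/Feb/2009:12:12:52.001] riak_fe riak_be/riak01 0/0/30001 0 sD 9/9/9/3/0 0/0",
    "haproxy[14387]: 10.0.1.2:33315 [06/Feb/2009:12:12:53.120] riak_fe riak_be/riak02 0/0/30002 128 cD 9/9/9/3/0 0/0",
    "haproxy[14387]: 10.0.1.2:33316 [06/Feb/2009:12:12:54.250] riak_fe riak_be/riak01 0/0/30001 0 sD 9/9/9/3/0 0/0",
]

if __name__ == "__main__":
    for (backend, state), n in sorted(count_timeouts(SAMPLE).items()):
        print(backend, state, n)
```

If many sessions towards the Riak backend end in a state such as sD, the
server-side timeout on that HAProxy instance is a likely suspect.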

> 2015-11-13 13:13:09.514 [error] <0.11264.1387>@riak_cs_wm_common:maybe_create_user:222 Retrieval of user record for s3 failed. Reason: disconnected

This means Riak CS failed to read the user record from Riak for
authentication because the connection was disconnected.
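
To line these errors up with timestamps from other logs, it can help to
count them per minute. The sketch below does that in Python, assuming each
log entry sits on a single line in the format quoted above.
failures_per_minute is a hypothetical helper name, and every sample line
except the first is made up for illustration.

```python
import re
from collections import Counter

# Captures the timestamp (truncated to the minute) and the failure reason of
# "Retrieval of user record ... failed" error lines.
USER_FETCH_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ \[error\] .*"
    r"Retrieval of user record for \S+ failed\. Reason: (?P<reason>\S+)"
)


def failures_per_minute(lines, reason="disconnected"):
    """Count user-record retrieval failures with the given reason, per minute."""
    counts = Counter()
    for line in lines:
        m = USER_FETCH_RE.search(line)
        if m and m.group("reason") == reason:
            counts[m.group("minute")] += 1
    return counts


SAMPLE_LOG = [
    # Line taken from the report above.
    "2015-11-13 13:13:09.514 [error] <0.11264.1387>@riak_cs_wm_common:maybe_create_user:222 "
    "Retrieval of user record for s3 failed. Reason: disconnected",
    # Hypothetical lines for illustration.
    "2015-11-13 13:13:41.002 [error] <0.1.2>@riak_cs_wm_common:maybe_create_user:222 "
    "Retrieval of user record for s3 failed. Reason: disconnected",
    "2015-11-13 13:14:02.120 [info] <0.3.4> unrelated message",
]

if __name__ == "__main__":
    for minute, n in sorted(failures_per_minute(SAMPLE_LOG).items()):
        print(minute, n)
```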

> Riak CS adds, removes, gets properties through Stanchion service. Am I right? I can't exactly understand where is my bottleneck - Riak, Riak CS or Stanchion.

Stanchion is mainly used only to update or delete user and bucket
data. To inspect a node, Riak S2/CS 2.1 introduced new metrics,
including various latencies and counters, which help to identify
the bottleneck.

> When we need authenticated access for reading object from bucket do we need Stanchion? If not I can't understand why I had a lot of error during getting objects from Riak CS.

Authentication is always required, but the read request for the user
data used for authentication is issued from Riak CS directly to Riak,
not through Stanchion.

> P. S. Sometimes when there is some issues with Riak CS - Stanchion connectivity I need to restart Riak CS.

Riak CS 1.5.0 has a connection pool leak problem [3]. You might be hitting this issue.

[1]: https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.5
[2]: https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2.2
[3]: http://docs.basho.com/riakcs/latest/cookbooks/Riak-CS-Release-Notes/#Riak-CS-1-5-2

On Sat, Nov 14, 2015 at 2:04 AM, Vladyslav Zakhozhai
<v.zakhozhai at smartweb.com.ua> wrote:
>
> Hello.
>
> I have a Riak CS cluster with 18 nodes. Each node runs the Riak CS and Riak
> services, and there is one Stanchion node.
>
> Versions:
> Riak 1.4.12
> Riak CS 1.5.0
> Stanchion 1.5.0
>
> Riak CS and Riak are placed behind HAProxy balancers:
>
> WAN -> HAProxy -> Riak CS nodes -> HAProxy -> Riak nodes.
> and
> Stanchion -> HAProxy -> Riak
>
> Today, due to a spike of traffic load (about 1000 rps) on the cluster, 50% of
> Riak CS returned HTTP 500 and 503 (querying /riak-cs/ping resource also was
> not successful).
>
> In Riak CS logs I've seen the following messages:
>
> 2015-11-13 13:13:09.514 [error]
> <0.11264.1387>@riak_cs_wm_common:maybe_create_user:222 Retrieval of user
> record for s3 failed. Reason: disconnected
>
> In Riak CS logs I see the following:
> 2015-11-13 17:31:52.995 [error] <0.11254.6534> Lager event handler
> error_logger_lager_h exited with reason
> {'EXIT',{{badmatch,["/buckets/uaprom-image/objects/272547384_cid1322007_pid183135512-26a7c1f3.jpg",{error,{error,{badmatch,{error,closed}},[{webmachine_request,recv_unchunked_body,3,[{file,"src/webmachine_request.erl"},{line,471}]},{webmachine_request,call,2,[{file,"src/webmachine_request.erl"},{line,193}]},{wrq,stream_req_body,2,[{file,"src/wrq.erl"},{line,121}]},{riak_cs_wm_object,handle_normal_put,2,[{file,"src/riak_cs_wm_object.erl"},{line,341}]},{riak_cs_wm_common,accept_body,2,[{file,...},...]},...]}},...]},...}}
>
> I suspect that there were problems between Riak CS - Stanchion or Stanchion -
> Riak. I have no clear idea how to troubleshoot Stanchion. The main reason is
> the following. Stanchion works fine, service is up (answers on ping command).
> But it is very laconic: there is almost nothing in console and error logs
> (even with debug log level).
>
> Riak CS adds, removes, gets properties through Stanchion service. Am I
> right? I can't exactly understand where is my bottleneck - Riak, Riak CS or
> Stanchion.
>
> When we need authenticated access for reading object from bucket do we need
> Stanchion? If not I can't understand why I had a lot of error during getting
> objects from Riak CS.
>
> Thank you in advance.
>
> P. S. Sometimes when there is some issues with Riak CS - Stanchion
> connectivity I need to restart Riak CS.
>



-- 
Kazuhiro Suzuki | Basho Japan KK



