Client gets 503 response from Riak server

Christopher Meiklejohn cmeiklejohn at basho.com
Wed Mar 4 15:30:54 EST 2015


> On Mar 4, 2015, at 2:22 PM, Bora Kou <bkou at zendesk.com> wrote:
> 
> Hi,
> 
> We have a 6-node Riak cluster configured with a ring size of 256, running version 1.4.8. Occasionally, we see the server return a 503 response. Looking at the logs, we see a lot of warnings about reading large objects and about busy_dist_port. The majority of the objects are less than 2MB, but we do have some large objects in the range of 5-15MB.
> 
> 
> These are the two types of log entries we consistently see across all nodes when that happens:
> 
> 2015-03-04 17:53:48.149 [warning] <0.1142.0>@riak_kv_vnode:do_get_object:1299 Reading large object of size 5498696 from...
> 
> 2015-03-04 19:06:35.824 [info] <0.95.0>@riak_core_sysmon_handler:handle_event:92 monitor busy_dist_port <0.32259.1162> [{initial_call,{riak_kv_get_fsm,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}] {#Port<0.3353>,'riak at servername...'}
> 
> The docs suggest increasing +zdbbl to deal with busy_dist_port. Does anybody have suggestions for other parameters I should look at to improve the situation?

Hi Bora,

Most of Riak’s internode communication is performed over Distributed Erlang, which uses a single TCP connection between any two nodes. Storing large objects [3] can therefore create throughput problems in the cluster, such as head-of-line blocking [1] and filling up Erlang’s bounded distribution buffers [2]. Messages wait in these buffers to be sent over the TCP connection, and any message attempting to perform a TCP send will synchronize on this buffer.
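
As a rough illustration of head-of-line blocking on a single connection, the sketch below models one FIFO link and computes when each queued message finishes sending. The 1 Gbit/s bandwidth and the message sizes are assumptions chosen for illustration, not measurements from a Riak cluster, and the model ignores latency, framing, and TCP behavior.

```python
def completion_times(message_sizes, bandwidth_bytes_per_s):
    """Return the time (in seconds) at which each message finishes
    sending over a single FIFO connection, in queue order."""
    times = []
    elapsed = 0.0
    for size in message_sizes:
        # Each message must wait for everything queued ahead of it.
        elapsed += size / bandwidth_bytes_per_s
        times.append(elapsed)
    return times


MB = 1024 * 1024
LINK = 125_000_000  # assumed 1 Gbit/s link, expressed in bytes per second

# A 15 MB object queued ahead of a 1 KB message on the same connection.
blocked = completion_times([15 * MB, 1024], LINK)
# The same 1 KB message with nothing ahead of it.
alone = completion_times([1024], LINK)

print(f"small message alone:   {alone[0] * 1000:.3f} ms")
print(f"small message blocked: {blocked[1] * 1000:.3f} ms")
```

Under these assumptions the small message is delayed by the full transfer time of the large object, which is the effect described above.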

Increasing +zdbbl increases the size of this buffer, which is a smart choice if your system has the available memory. Balance the distribution buffer size carefully against available system memory: you don’t want to overflow into virtual memory, which will slow your system down further.
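
To make the memory trade-off concrete, here is a minimal sketch that estimates how much memory the distribution buffers could occupy on one node. It assumes the +zdbbl limit (specified in kilobytes) applies to each connection to a peer node, and that every buffer fills to its limit at the same time. The candidate values are illustrative, not recommendations, and the function name is hypothetical.

```python
def worst_case_dist_buffer_mb(zdbbl_kb, cluster_nodes):
    """Approximate upper bound, in MB, on distribution buffer memory for
    one node, assuming one buffer per peer and all buffers full."""
    peers = cluster_nodes - 1  # a node connects to every other node
    return zdbbl_kb * peers / 1024


# Illustrative +zdbbl values (in KB) for the 6-node cluster in this thread.
for zdbbl_kb in (1024, 32768, 131072):
    estimate = worst_case_dist_buffer_mb(zdbbl_kb, cluster_nodes=6)
    print(f"+zdbbl {zdbbl_kb:>6} -> up to ~{estimate:.0f} MB per node")
```

Compare the estimate against the memory left over after Riak and the storage backend have taken their share before settling on a value.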

If you’re going to be storing objects greater than 1 MB, we recommend using Riak CS.

- Christopher

[1] http://en.wikipedia.org/wiki/Head-of-line_blocking
[2] http://docs.basho.com/riak/latest/community/faqs/logs/#riak-logs-have-busy_dist_port-messages
[3] http://stackoverflow.com/questions/24389183/riak-returns-an-error-reading-large-object-of-size-when-a-mapreduce

Christopher Meiklejohn
Senior Software Engineer
Basho Technologies, Inc.
cmeiklejohn at basho.com


