Socket closed during write?

Brad Heller brad at cloudability.com
Thu Jan 10 12:58:55 EST 2013


All nodes, but one at a time. It seems to be whichever one is servicing the request.

On Jan 10, 2013, at 9:57 AM, Evan Vigil-McClanahan <emcclanahan at basho.com> wrote:

> Does this happen on all of the nodes, or just one?  One client, or all
> of them?  A particular type of request, or just at random?
> 
> On Thu, Jan 10, 2013 at 1:29 AM, Brad Heller <brad at cloudability.com> wrote:
>> I'm using the regular ol' ruby-riak client to write to the bucket. It's all pretty much out of the box. There is an ELB in front of the ring, with SSL on the ELB and passed through to the ring, if that's significant.
>> 
>> Nothing special that I can identify. I can attempt to instrument our code further...
>> 
>> On Jan 9, 2013, at 10:05 PM, Evan Vigil-McClanahan <emcclanahan at basho.com> wrote:
>> 
>>> The last time I saw this particular error it was someone on a 64bit
>>> client setting the content length value incorrectly via libcurl.
>>> Requests would work on 64 bit nodes but fail on 32 bit nodes,
>>> presumably because the HTTP client was handing the socket a garbage
>>> value to read, thereby killing it.
>>> 
>>> Timeouts will generally return a different error.  You might be
>>> looking at some TCP or HTTP issue.  How are you connecting?  Is there
>>> anything special about the requests that are failing?
>>> 
>>> On Wed, Jan 9, 2013 at 7:29 PM, Brad Heller <brad at cloudability.com> wrote:
>>>> Hey all,
>>>> 
>>>> I've got a mysterious error with Riak. Certain writes seem to result in an
>>>> interesting error being raised by the coordinating node (erm, I'm assuming
>>>> it's the coordinating node):
>>>> 
>>>> webmachine error: path="/buckets/my_bucket/keys"
>>>> {error,{error,{badmatch,{error,closed}},
>>>> [{webmachine_request,recv_unchunked_body,4,[{file,"src/webmachine_request.erl"},{line,418}]},
>>>> {webmachine_request,do_recv_body,2,[{file,"src/webmachine_request.erl"},{line,377}]},
>>>> {webmachine_request,call,2,[{file,"src/webmachine_request.erl"},{line,149}]},
>>>> {wrq,req_body,1,[{file,"src/wrq.erl"},{line,112}]},
>>>> {riak_kv_wm_object,accept_doc_body,2,[{file,"src/riak_kv_wm_object.erl"},{line,595}]},
>>>> {webmachine_resource,resource_call,3,[{file,"src/webmachine_resource.erl"},{line,169}]},
>>>> {webmachine_resource,do,3,[{file,"src/webmachine_resource.erl"},{line,128}]},
>>>> {webmachine_decision_core,resource_call,1,[{file,"src/webmachine_decision_core.erl"},{line,48}]}]}
>>>> 
>>>> Webmachine returns a 500 error when this happens. I saw online that this
>>>> could be the result of a socket being closed out from under the connection
>>>> [1]. Any idea how I can get to the bottom of this? Due to the nature of the
>>>> reproduction steps, I don't have access to the payload causing the crash
>>>> 'till it's actually *in* Riak. Someone in #riak on IRC suggested that this
>>>> could be the result of write timeouts. Ideas?
>>>> 
>>>> Thanks,
>>>> 
>>>> Brad Heller
>>>> 
>>>> 1:
>>>> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-October/006078.html
>>>> 
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>> 
>> 
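For illustration, here is a minimal sketch of the failure mode discussed above: a server that trusts the Content-Length header keeps reading until it has that many bytes, and if the peer closes the socket first, the read fails with "closed". This is not Riak or Webmachine code. It is a plain-socket stand-in written in Python, and the helper names (recv_exact, read_request_body, simulate) are hypothetical. Only the request path is taken from the error report. Anything that ends the connection mid-body would look the same to the server.

```python
import socket
import threading


def recv_exact(conn, n):
    """Read exactly n bytes, or raise ConnectionError if the peer closes first."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:  # an empty read means the peer closed the socket
            raise ConnectionError("closed")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_request_body(conn):
    """Parse the headers, then read the body that Content-Length promises."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            raise ConnectionError("closed")
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if len(body) < length:
        body += recv_exact(conn, length - len(body))
    return body[:length]


def simulate(declared_length, payload):
    """Send payload with the given Content-Length; return what the server saw."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # any free local port
    server.listen(1)
    result = {}

    def serve():
        conn, _ = server.accept()
        with conn:
            try:
                result["outcome"] = ("ok", read_request_body(conn))
            except ConnectionError as exc:
                result["outcome"] = ("error", str(exc))

    worker = threading.Thread(target=serve)
    worker.start()
    client = socket.create_connection(server.getsockname())
    request = (
        b"POST /buckets/my_bucket/keys HTTP/1.1\r\n"
        b"Host: localhost\r\n"
        b"Content-Length: " + str(declared_length).encode() + b"\r\n\r\n" + payload
    )
    client.sendall(request)
    client.close()  # the client goes away after sending what it has
    worker.join()
    server.close()
    return result["outcome"]
```

Calling simulate(5, b"hello") returns ("ok", b"hello"), while simulate(50, b"hello") returns ("error", "closed"): the header promises 50 bytes, only 5 arrive, and the server's read hits a closed socket. Comparing the Content-Length the client actually sends against the real body size for the failing writes would be one way to test this theory.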




