Put failure: too many siblings

Vladyslav Zakhozhai v.zakhozhai at smartweb.com.ua
Thu Jun 16 18:54:36 EDT 2016


Hello.

I am seeing something very interesting and confusing.

From my previous message you can see that the sibling count on manifest
objects is about 100 (actually it is in the range 100-300). Unfortunately, my
problem is that almost all PUT requests are failing with a 500 Internal
Server Error.

Today I tried setting the Riak max_siblings option to 500. PUT requests
succeeded for a while, but not for long. Now I see the "max siblings" error
in the Riak logs again, but the actual sibling count is 500+ (earlier it was
100-300, as I mentioned).
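
To see which objects are hitting the limit and how high their sibling counts
have climbed, the error.log lines quoted further down in this thread can be
summarised with a short script. This is only a minimal sketch: the pattern is
modelled on the single "Put failure" line shown in this thread, it assumes
each log entry sits on one line, and the way a real log renders the object
name is an assumption. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Pattern modelled on the error.log line quoted in this thread.
# Assumption: each entry is on one line and ends with "(<count>)".
SIBLING_ERROR = re.compile(
    r"Put failure: too many siblings for object (?P<obj>.+) \((?P<count>\d+)\)\s*$"
)


def max_siblings_per_object(lines):
    """Return {object_name: highest sibling count seen} for the given lines."""
    highest = defaultdict(int)
    for line in lines:
        match = SIBLING_ERROR.search(line)
        if match:
            obj = match.group("obj")
            highest[obj] = max(highest[obj], int(match.group("count")))
    return dict(highest)


# The two log lines from this thread, joined back onto single lines.
sample = [
    "2016-06-03 11:15:55.201 [error] <0.15536.142>@riak_kv_vnode:encode_and_put:2253 "
    "Put failure: too many siblings for object OBJECT_NAME (101)",
    "2016-06-03 12:41:50.678 [error] <0.20448.515>@riak_api_pb_server:handle_info:331 "
    "Unrecognized message {7345880,{error,{too_many_siblings,101}}}",
]

print(max_siblings_per_object(sample))
```

Only the first sample line matches; the "Unrecognized message" line carries no
object name, so it is skipped. For a real log, pass the open file object
instead of the sample list.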

About 30 minutes passed between setting max_siblings=500 and the errors
appearing in the log. I also want to point out that I have blocked the PUT
method on haproxy, the frontend for Riak CS.
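
One way to check whether sibling counts keep rising after client PUTs were
blocked is to compare the first and last count logged for each object after
a chosen cut-off time. The sketch below assumes the timestamp and message
layout of the error.log lines quoted in this thread. The sample lines, the
KEY_A name, and the cut-off time are made up purely for illustration; they
are not taken from a real log. The function names are hypothetical.

```python
import re
from datetime import datetime

# Assumption: same layout as the error.log lines quoted in this thread,
# with the whole entry on one line.
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[error\] .*"
    r"too many siblings for object (?P<obj>.+) \((?P<count>\d+)\)\s*$"
)


def sibling_events(lines):
    """Yield (timestamp, object_name, sibling_count) for each matching line."""
    for line in lines:
        match = LINE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            yield ts, match.group("obj"), int(match.group("count"))


def growth_since(lines, cutoff):
    """Map each object to (first_count, last_count) logged at or after cutoff."""
    result = {}
    for ts, obj, count in sibling_events(lines):
        if ts < cutoff:
            continue
        first, _ = result.get(obj, (count, count))
        result[obj] = (first, count)
    return result


# Made-up lines for illustration only (not real log data).
prefix = "[error] <0.1.0>@riak_kv_vnode:encode_and_put:2253 Put failure: "
illustrative = [
    "2016-06-16 17:00:00.000 " + prefix + "too many siblings for object KEY_A (300)",
    "2016-06-16 18:00:00.000 " + prefix + "too many siblings for object KEY_A (501)",
    "2016-06-16 18:20:00.000 " + prefix + "too many siblings for object KEY_A (507)",
]

# Hypothetical moment at which PUT was blocked on the proxy.
cutoff = datetime(2016, 6, 16, 17, 30)
print(growth_since(illustrative, cutoff))
```

If the last count is higher than the first for entries after the cut-off,
siblings are still being created by something other than client PUTs coming
through the proxy.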



On Mon, Jun 6, 2016 at 1:17 AM Vladyslav Zakhozhai <
v.zakhozhai at smartweb.com.ua> wrote:

> Hi, Luke.
>
> Thank you for your answer. I did not completely understand your point about
> transfer-limit. How does it relate to my problem? The transfer limit is a
> limit on concurrent data transfers from different nodes. Am I right? Do you
> mean that Riak can hand off one partition from several nodes concurrently?
>
> Now I have transfer-limit 1 on all riak nodes.
>
> But I am not sure that my cluster will ever converge. All nodes are low on
> memory and are periodically killed by the OOM killer. I am trying to add new
> nodes to the cluster, but because of the OOM killer problem this process is
> very, very slow.
>
> In the official docs I've read:
>
> "Sibling explosion occurs when an object rapidly collects siblings that
> are not reconciled. This can lead to a variety of problems, including
> degraded performance, especially if many objects in a cluster suffer from
> siblings explosion. At the extreme, having an enormous object in a node can
> cause reads of that object to crash the entire node. Other issues include undue
> latency
> <http://docs.basho.com/riak/kv/2.1.4/using/performance/latency-reduction> and
> out-of-memory errors."
>
> I noticed that new nodes in the cluster do not experience such problems
> (I mean running out of RAM).
>
> Regarding siblings, maybe you are right and this is a manifest object. I can
> recognize the key name but not the bucket name. But more than 100 siblings
> on many keys really confuses me. Each time I try to PUT an object to Riak
> via the Riak CS S3 interface, I get errors about siblings.
>
> On Fri, Jun 3, 2016 at 6:43 PM Luke Bakken <lbakken at basho.com> wrote:
>
>> Hi Vladyslav,
>>
>> If you recognize the full name of the object raising the sibling
>> warning, it is most likely a manifest object. Sometimes, during hinted
>> handoff, you can see these messages. They should resolve after handoff
>> completes.
>>
>> Please see the documentation for the transfer-limit command as well:
>>
>> http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#transfer-limit
>>
>> --
>> Luke Bakken
>> Engineer
>> lbakken at basho.com
>>
>>
>> On Fri, Jun 3, 2016 at 2:55 AM, Vladyslav Zakhozhai
>> <v.zakhozhai at smartweb.com.ua> wrote:
>> > Hi.
>> >
>> > I have trouble with PUT requests to a Riak CS cluster. During PUTs I
>> > periodically see the following message in the Riak error.log:
>> >
>> > 2016-06-03 11:15:55.201 [error]
>> > <0.15536.142>@riak_kv_vnode:encode_and_put:2253 Put failure: too many
>> > siblings for object OBJECT_NAME (101)
>> >
>> > and also
>> >
>> > 2016-06-03 12:41:50.678 [error]
>> > <0.20448.515>@riak_api_pb_server:handle_info:331 Unrecognized message
>> > {7345880,{error,{too_many_siblings,101}}}
>> >
>> > Here OBJECT_NAME is the name of the object in Riak which has too many
>> > siblings.
>> >
>> > I am definitely sure that these objects are static. Nobody deletes them
>> > and nobody rewrites them. I have no idea why more than 100 siblings of
>> > this object occur.
>> >
>> > This issue has the following effects:
>> >
>> > - A great number of keys are loaded into RAM. I am almost out of RAM (does
>> >   each sibling have its own key or a duplicate of the key?).
>> > - Nodes are slow, and adding new nodes is too slow.
>> > - The presence of "too many siblings" affects ownership handoffs.
>> >
>> > So I have several questions:
>> >
>> > - Can hinted or ownership handoffs affect the sibling count (I mean, can
>> >   siblings be created during ownership or hinted handoffs)?
>> > - Is there any workaround for this issue? Do I need to remove siblings
>> >   manually, or are they removed during merges, read repairs and so on?
>> >
>> >
>> > My configuration:
>> >
>> > riak from basho's packages - 2.1.3-1
>> > riak cs from basho's packages - 2.1.0-1
>> > 24 riak/riak-cs nodes
>> > 32 GB RAM per node
>> > AAE is disabled
>> >
>> >
>> > I appreciate your help.
>> >
>>
>