Put failure: too many siblings

Vladyslav Zakhozhai v.zakhozhai at smartweb.com.ua
Fri Jun 17 04:45:06 EDT 2016


Hi Russel,

thank you for your answer. I really appreciate your help.

2.1.3 is not actually the riak_kv version; it is the version of Basho's riak
package. The versions of the Riak subsystems are listed below.

Bucket properties:
# riak-admin bucket-type list
default (active)

# riak-admin bucket-type status default
default is active

allow_mult: true
basic_quorum: false
big_vclock: 50
chash_keyfun: {riak_core_util,chash_std_keyfun}
dvv_enabled: false
dw: quorum
last_write_wins: false
linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
n_val: 3
notfound_ok: true
old_vclock: 86400
postcommit: []
pr: 0
precommit: []
pw: 0
r: quorum
rw: quorum
small_vclock: 50
w: quorum
write_once: false
young_vclock: 20
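
For anyone checking their own cluster, here is a minimal sketch (not an official tool) that parses `riak-admin bucket-type status` output like the above and flags the combination discussed in this thread: allow_mult enabled while dvv_enabled is false. The function names `parse_props` and `sibling_risk` are hypothetical, and the check is only a heuristic based on Russell's remark about DVV.

```python
# Abridged sample of the `riak-admin bucket-type status default` output above.
STATUS_OUTPUT = """\
default is active

allow_mult: true
dvv_enabled: false
last_write_wins: false
n_val: 3
"""


def parse_props(text):
    """Parse 'key: value' lines into a dict of strings; other lines are skipped."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        # Skip status lines such as 'default is active' and blank lines.
        if sep and " " not in key:
            props[key.strip()] = value.strip()
    return props


def sibling_risk(props):
    """Heuristic (an assumption, not a Riak API): siblings are allowed
    but dotted version vectors are not enabled."""
    return props.get("allow_mult") == "true" and props.get("dvv_enabled") == "false"


if __name__ == "__main__":
    props = parse_props(STATUS_OUTPUT)
    if sibling_risk(props):
        print("allow_mult is true but dvv_enabled is false")
```

The sample string can be replaced with the full command output; lines without a `key: value` shape are ignored.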

I did not mention that the upgrade from riak 1.5.4 took place about 6 months
ago. As I understand it, DVV is disabled (dvv_enabled: false). Is it safe to
switch from vector clocks to DVV?

Package versions:
# dpkg -l | grep riak
ii  riak       2.1.3-1    amd64    Riak is a distributed data store
ii  riak-cs    2.1.0-1    amd64    Riak CS

Subsystem versions:
"clique_version" : "0.3.2-0-ge332c8f",
"bitcask_version" : "1.7.2",
"sys_driver_version" : "2.2",
"riak_core_version" : "2.1.5-0-gb02ab53",
"riak_kv_version" : "2.1.2-0-gf969bba",
"riak_pipe_version" : "2.1.1-0-gb1ac2cf",
"cluster_info_version" : "2.0.3-0-g76c73fc",
"riak_auth_mods_version" : "2.1.0-0-g31b8b30",
"erlydtl_version" : "0.7.0",
"os_mon_version" : "2.2.13",
"inets_version" : "5.9.6",
"erlang_js_version" : "1.3.0-0-g07467d8",
"riak_control_version" : "2.1.2-0-gab3f924",
"xmerl_version" : "1.3.4",
"protobuffs_version" : "0.8.1p5-0-gf88fc3c",
"riak_sysmon_version" : "2.0.0",
"compiler_version" : "4.9.3",
"eleveldb_version" : "2.1.10-0-g0537ca9",
"lager_version" : "2.1.1",
"sasl_version" : "2.3.3",
"riak_dt_version" : "2.1.1-0-ga2986bc",
"runtime_tools_version" : "1.8.12",
"yokozuna_version" : "2.1.2-0-g3520d11",
"riak_search_version" : "2.1.1-0-gffe2113",
"sys_system_version" : "Erlang R16B02_basho8 (erts-5.10.3) [source]
[64-bit] [smp:4:4] [async-threads:64] [kernel-poll:true] [frame-pointer]",
"basho_stats_version" : "1.0.3",
"crypto_version" : "3.1",
"merge_index_version" : "2.0.1-0-g0c8f77c",
"kernel_version" : "2.16.3",
"stdlib_version" : "1.19.3",
"riak_pb_version" : "2.1.0.2-0-g620bc70",
"syntax_tools_version" : "1.6.11",
"goldrush_version" : "0.1.7",
"ibrowse_version" : "4.0.2",
"mochiweb_version" : "2.9.0",
"exometer_core_version" : "1.0.0-basho2-0-gb47a5d6",
"ssl_version" : "5.3.1",
"public_key_version" : "0.20",
"pbkdf2_version" : "2.0.0-0-g7076584",
"sidejob_version" : "2.0.0-0-gc5aabba",
"webmachine_version" : "1.10.8-0-g7677c24",
"poolboy_version" : "0.8.1p3-0-g8bb45fb",
"riak_api_version" : "2.1.2-0-gd8d510f",
"asn1_version" : "2.0.3",

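To pull the riak_kv release out of a version listing like the one above, something along these lines may help. It is a rough sketch: the helper names `parse_versions` and `release` are hypothetical, and it assumes the lines keep the `"name_version" : "value"` shape shown here.

```python
import re

# Abridged sample of the subsystem version lines above.
STATS_SAMPLE = '''\
"riak_core_version" : "2.1.5-0-gb02ab53",
"riak_kv_version" : "2.1.2-0-gf969bba",
"protobuffs_version" : "0.8.1p5-0-gf88fc3c",
'''

LINE_RE = re.compile(r'"(\w+)_version"\s*:\s*"([^"]*)"')


def parse_versions(text):
    """Map subsystem name (without the '_version' suffix) to its version string."""
    return {name: version for name, version in LINE_RE.findall(text)}


def release(version):
    """Return the leading dotted numeric part as a tuple of ints,
    dropping suffixes such as '-0-gf969bba' or 'p5'."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric release in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


if __name__ == "__main__":
    versions = parse_versions(STATS_SAMPLE)
    kv = release(versions["riak_kv"])
    print("riak_kv release:", ".".join(str(n) for n in kv))
    print("2.0 or later:", kv >= (2, 0))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly.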

On Fri, Jun 17, 2016 at 10:45 AM Russell Brown <russell.brown at me.com> wrote:

> What version of riak_kv is behind this riak_cs install, please? Is it
> really 2.1.3 as stated below? This looks and sounds like sibling explosion,
> which is fixed in riak 2.0 and above. Are you sure that you are using the
> DVV enabled setting for riak_cs bucket properties? Can you post your bucket
> properties please?
>
> On 16 Jun 2016, at 23:54, Vladyslav Zakhozhai <v.zakhozhai at smartweb.com.ua>
> wrote:
>
> > Hello.
> >
> > I see a very interesting and confusing thing.
> >
> > From my previous letter you can see that the sibling count on manifest
> > objects is about 100 (actually it is in the range 100-300). Unfortunately,
> > my problem is that almost all PUT requests are failing with a 500
> > Internal Server Error.
> >
> > Today I tried setting the max_siblings riak option to 500. There were
> > successful PUT requests, but not for long. Now I see the "max siblings"
> > error in the riak logs again, but the actual count is 500+ (earlier it
> > was 100-300, as I mentioned).
> >
> > The time between setting max_siblings=500 and the errors appearing in
> > the log was about 30 minutes. I also want to point out that I have
> > blocked the PUT method on haproxy, the frontend for riak cs.
> >
> >
> >
> > On Mon, Jun 6, 2016 at 1:17 AM Vladyslav Zakhozhai <v.zakhozhai at smartweb.com.ua> wrote:
> > Hi, Luke.
> >
> > Thank you for your answer. I did not completely understand your point
> > about transfer-limit. How does it relate to my problem? The transfer
> > limit is a limit on concurrent data transfers from different nodes. Am I
> > right? Do you mean that riak can hand off one partition from several
> > nodes concurrently?
> >
> > I now have transfer-limit set to 1 on all riak nodes.
> >
> > But I am not sure that my cluster will ever converge. All nodes are
> > running low on memory and are periodically killed by the OOM killer. I
> > am trying to add new nodes to the cluster, but because of the OOM killer
> > problem this process is very, very slow.
> >
> > In the official docs I've read:
> >
> > "Sibling explosion occurs when an object rapidly collects siblings that
> > are not reconciled. This can lead to a variety of problems, including
> > degraded performance, especially if many objects in a cluster suffer from
> > siblings explosion. At the extreme, having an enormous object in a node
> > can cause reads of that object to crash the entire node. Other issues
> > include undue latency and out-of-memory errors."
> >
> > I noticed that the new nodes in the cluster do not experience such
> > problems (I mean running out of RAM).
> >
> > Regarding siblings, maybe you are right and this is a manifest object. I
> > can recognize the key name but not the bucket name. But more than 100
> > siblings on many keys really confuses me. Each time I try to PUT an
> > object to Riak via the Riak CS S3 interface, I get sibling errors.
> >
> > On Fri, Jun 3, 2016 at 6:43 PM Luke Bakken <lbakken at basho.com> wrote:
> > Hi Vladyslav,
> >
> > If you recognize the full name of the object raising the sibling
> > warning, it is most likely a manifest object. Sometimes, during hinted
> > handoff, you can see these messages. They should resolve after handoff
> > completes.
> >
> > Please see the documentation for the transfer-limit command as well:
> >
> >
> > http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#transfer-limit
> >
> > --
> > Luke Bakken
> > Engineer
> > lbakken at basho.com
> >
> >
> > On Fri, Jun 3, 2016 at 2:55 AM, Vladyslav Zakhozhai
> > <v.zakhozhai at smartweb.com.ua> wrote:
> > > Hi.
> > >
> > > I have trouble with PUTs to a Riak CS cluster. During PUTs I
> > > periodically see the following message in the Riak error.log:
> > >
> > > 2016-06-03 11:15:55.201 [error]
> > > <0.15536.142>@riak_kv_vnode:encode_and_put:2253 Put failure: too many
> > > siblings for object OBJECT_NAME (101)
> > >
> > > and also
> > >
> > > 2016-06-03 12:41:50.678 [error]
> > > <0.20448.515>@riak_api_pb_server:handle_info:331 Unrecognized message
> > > {7345880,{error,{too_many_siblings,101}}}
> > >
> > > Here OBJECT_NAME is the name of the object in Riak which has too many
> > > siblings.
> > >
> > > I am quite sure that these objects are static. Nobody deletes them,
> > > nobody rewrites them. I have no idea why more than 100 siblings of
> > > such an object occur.
> > >
> > > This issue has the following effects:
> > >
> > > - A great number of keys are loaded into RAM. I am almost out of RAM
> > >   (does each sibling have its own key or a duplicate of the key?).
> > > - Nodes are slow; adding new nodes is too slow.
> > > - The presence of "too many siblings" affects ownership handoffs.
> > >
> > > So I have several questions:
> > >
> > > - Can hinted or ownership handoffs affect the sibling count (I mean,
> > >   can siblings be created during ownership or hinted handoffs)?
> > > - Is there any workaround for this issue? Do I need to remove siblings
> > >   manually, or are they removed during merges, read repairs and so on?
> > >
> > >
> > > My configuration:
> > >
> > > riak from basho's packages - 2.1.3-1
> > > riak cs from basho's packages - 2.1.0-1
> > > 24 riak/riak-cs nodes
> > > 32 GB RAM per node
> > > AAE is disabled
> > >
> > >
> > > I appreciate your help.
> > >
> > > _______________________________________________
> > > riak-users mailing list
> > > riak-users at lists.basho.com
> > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > >
>
>