Ownership handoff timed out

Vladyslav Zakhozhai v.zakhozhai at smartweb.com.ua
Thu Oct 29 13:38:07 EDT 2015


Matthew, can you describe the bug in more detail?

My plan was to migrate to eleveldb first and only then to Riak 2.0. It
seems that I need to change my plans and migrate to Riak 2.0 first, which
is unfortunate.

Is it safe to upgrade Riak 1.4.12/Riak CS 1.5.0 to Riak 2.0 in a production
environment? According to the official upgrade guides, I can upgrade nodes
one by one in the same cluster, so Riak 2.0 and Riak 1.4.12 nodes can
coexist in one cluster. Am I right?

Thank you.

On Thu, Oct 29, 2015 at 7:04 PM Matthew Von-Maszewski <matthewv at basho.com>
wrote:

> Sad to say your LOG files suggest the same bug as seen elsewhere and fixed
> by recent changes in the leveldb code.
>
> The tougher issue is that the fixes are currently only available for our
> 2.0 product series.  A backport would be non-trivial due to the number of
> places changed between 1.4 and 2.0 and the number of places the fix
> overlaps those changes.  The corrected code is tagged “2.0.9” in eleveldb
> and leveldb.
>
> The only path readily available to you is to have your receiving cluster
> upgraded to 2.0 Riak CS and manually build/patch eleveldb to the 2.0.9
> version. Then start your handoffs.   (eleveldb version 2.0.9 is not present
> in any shipping version of Riak … yet).
>
> I will write again if I can think of an easier solution.  But nothing is
> occurring to me or the team members I have queried.
>
> Matthew
>
> On Oct 29, 2015, at 12:14 PM, Vladyslav Zakhozhai <
> v.zakhozhai at smartweb.com.ua> wrote:
>
> Hi,
>
> Matthew, thank you for your answer. The eleveldb LOGs are attached.
> Here are the LOGs from 2 eleveldb nodes (eggeater was not restarted; I'm
> not sure about rattlesnake).
>
> On Thu, Oct 29, 2015 at 5:24 PM Matthew Von-Maszewski <matthewv at basho.com>
> wrote:
>
>> Hi,
>>
>> There was a known eleveldb bug with handoff receiving that could cause a
>> timeout.  But it does not sound like that bug fits your symptoms.  However,
>> I am willing to verify my diagnosis.  I would need you to gather the LOG
>> files from all vnodes on the RECEIVING side (or at least from the vnode
>> whose handoff you are attempting and that is failing).
>>
>> I will check them for the symptoms of the known bug.
>>
>> Note:  the LOG files reset on each restart of Riak.  So you must gather
>> the LOG files right after the failure without restarting Riak.
>>
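A minimal sketch of how the LOG files might be collected before any restart. It assumes each vnode keeps its current log as <data_root>/<partition>/LOG and the previous one as LOG.old, and that data_root is /var/lib/riak/leveldb as in the configuration quoted further down. gather_leveldb_logs is a hypothetical helper name, not a Riak tool.

```shell
# Sketch only: archive the leveldb LOG files of every vnode under a data root.
gather_leveldb_logs() {
    data_root=$1   # assumed layout: <data_root>/<partition>/LOG
    archive=$2     # output .tar.gz path
    (
        cd "$data_root" || exit 1
        set --
        for f in */LOG */LOG.old; do
            # A pattern that matched nothing stays literal; skip it.
            if [ -f "$f" ]; then set -- "$@" "$f"; fi
        done
        if [ "$#" -eq 0 ]; then
            echo "no LOG files found under $data_root" >&2
            exit 1
        fi
        # Write to stdout so a relative archive path is resolved against
        # the caller's directory rather than data_root.
        tar -czf - "$@"
    ) > "$archive" || { rm -f "$archive"; return 1; }
}
```

On a receiving node this could be run as:

    gather_leveldb_logs /var/lib/riak/leveldb /tmp/eggeater-leveldb-logs.tar.gz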
>> Matthew
>>
>>
>> On Oct 29, 2015, at 11:11 AM, Vladyslav Zakhozhai <
>> v.zakhozhai at smartweb.com.ua> wrote:
>>
>> Hi,
>>
>> I want to make a small update. Jon, your hint about problems on the sender
>> side is correct. As I've already said, there are problems with available
>> resources on the sender nodes. There is not enough available RAM, which
>> causes swapping and load on the disks. Restarting the sender nodes helps
>> (at least temporarily).
>>
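To put a number on the swapping, here is a small sketch that reports swap in use from a /proc/meminfo-style file. swap_used_kb is a hypothetical helper. The SwapTotal and SwapFree fields are standard on Linux; that the sender nodes run Linux is an assumption.

```shell
# Sketch only: print swap currently in use, in kB.
swap_used_kb() {
    meminfo=${1:-/proc/meminfo}   # path is overridable, mainly for testing
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END { print total - free }
    ' "$meminfo"
}
```

A value that keeps growing on a sender node while handoff is running would be consistent with the memory pressure described above.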
>>
>> On Thu, Oct 29, 2015 at 12:19 PM Vladyslav Zakhozhai <
>> v.zakhozhai at smartweb.com.ua> wrote:
>>
>>> Hi,
>>>
>>> The average size of objects in Riak is 300 Kb. These objects are images.
>>> This data is updated very rarely (there are almost no updates).
>>>
>>> I have GC turned on and it works:
>>> root@python:~# riak-cs-gc status
>>> There is no garbage collection in progress
>>>   The current garbage collection interval is: 900
>>>   The current garbage collection leeway time is: 86400
>>>   Last run started at: 20151029T100600Z
>>>   Next run scheduled for: 20151029T102100Z
>>>
>>> Network misconfigurations were not detected. The result of your script
>>> shows correct info.
>>>
>>> But I see that almost all nodes with bitcask suffer from low free
>>> memory and are swapping. I think that could be the issue. My question
>>> is: what is the workaround for this problem?
>>>
>>> I wrote in my first post that I had tuned handoff_timeout and
>>> handoff_receive_timeout (these values are now 300000 and 600000). But
>>> the situation is the same.
>>>
>>>
>>> On Tue, Oct 27, 2015 at 4:06 PM Jon Meredith <jmeredith at basho.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Handoff problems without obvious disk issues can be due to the database
>>>> containing large objects.  Do you frequently update objects in CS, and if
>>>> so have you had garbage collection running?
>>>>
>>>> The timeout is happening on the receiver side after not receiving any
>>>> tcp data for handoff_receive_timeout *milli*seconds.  I know you said you
>>>> increased it, but not how high.  I would bump that up to 300000 to give the
>>>> sender a chance to read larger objects off disk.
>>>>
>>>> To check if the sender is transmitting, on the source node you could run
>>>>   redbug:start("riak_core_handoff_sender:visit_item", [{arity,
>>>> true},{print_file,"/tmp/visit_item.log"},{time, 3600000},{msgs, 1000000}]).
>>>>
>>>> That file should fill fairly fast with an entry for every object the
>>>> sender tries to transmit.
>>>>
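One rough way to tell whether that file is actually growing is to compare its line count across an interval. This is a sketch only: log_growth is a hypothetical helper, and counting lines is just a proxy, since one redbug entry may span more than one line.

```shell
# Sketch only: print how many lines were appended to a file during an interval.
log_growth() {
    file=$1        # e.g. /tmp/visit_item.log from the redbug call above
    interval=$2    # seconds to wait between the two samples
    before=$(wc -l < "$file") || return 1
    sleep "$interval"
    after=$(wc -l < "$file") || return 1
    echo $((after - before))
}
```

For example, `log_growth /tmp/visit_item.log 10`. A result of 0 while a handoff is supposedly running would suggest the sender is not visiting any objects.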
>>>> There's a long shot it could be network misconfiguration. Run this from
>>>> the source node having problems
>>>>
>>>> rpc:multicall(erlang, apply, [fun() -> TargetNode = node(),
>>>> [_Name,Host] = string:tokens(atom_to_list(TargetNode), "@"), {ok, Port} =
>>>> riak_core_gen_server:call({riak_core_handoff_listener, TargetNode},
>>>> handoff_port), HandoffIP = riak_core_handoff_listener:get_handoff_ip(),
>>>> TNHandoffIP = case HandoffIP of error -> Host; {ok, "0.0.0.0"} -> Host;
>>>> {ok, Other} -> Other end, {node(), HandoffIP, TNHandoffIP,
>>>> inet:gethostbyname(TNHandoffIP), Port} end, []]).
>>>>
>>>> and it will print out a list of remote nodes and IP addresses (and
>>>> hopefully an empty list of failed nodes)
>>>>
>>>> {[{'dev1@127.0.0.1',          <---- node name
>>>>   {ok,"0.0.0.0"},             <---- handoff ip address configured in app.config
>>>>   "127.0.0.1",                <---- hostname passed to socket open
>>>>   {ok,{hostent,"127.0.0.1",[],inet,4,[{127,0,0,1}]}}, <--- DNS entry for hostname
>>>>   10019}],                    <---- handoff port
>>>>  []} <--- empty list of errors
>>>>
>>>> Good luck, Jon.
>>>>
>>>> On Tue, Oct 27, 2015 at 3:55 AM Vladyslav Zakhozhai <
>>>> v.zakhozhai at smartweb.com.ua> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Jon, thank you for the answer. While my mail was awaiting approval for
>>>>> this list, I dug deeper into my issue. And yes, you are right:
>>>>> neither {error, enotconn} nor max_concurrency is my problem.
>>>>>
>>>>> I'm going to migrate my cluster entirely to eleveldb, i.e. I need to
>>>>> stop using bitcask. I had a talk with Basho support and they said
>>>>> that it is tricky to tune bitcask on servers with 32 GB RAM (and I guess
>>>>> that it is not tricky but impossible, because bitcask loads all keys
>>>>> into memory regardless of available RAM). With LevelDB I have the
>>>>> opportunity to tune RAM usage on the servers.
>>>>>
>>>>> So I have 15 nodes with multibackend (bitcask for data and leveldb for
>>>>> metadata). 2 additional servers are without multibackend, with leveldb
>>>>> only. Now I'm not sure whether I still need to use multibackend with a
>>>>> leveldb-only backend.
>>>>>
>>>>> And my problem is (as I mentioned earlier) the following. On
>>>>> leveldb-only nodes I see handoffs time out with no further progress.
>>>>>
>>>>> On multibackend hosts I have this configuration:
>>>>>
>>>>> {riak_kv, [
>>>>>        {add_paths, ["/usr/lib/riak-cs/lib/riak_cs-1.5.0/ebin"]},
>>>>>        {storage_backend, riak_cs_kv_multi_backend},
>>>>>        {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
>>>>>        {multi_backend_default, be_default},
>>>>>        {multi_backend, [
>>>>>            {be_default, riak_kv_eleveldb_backend, [
>>>>>                {max_open_files, 50},
>>>>>                {data_root, "/var/lib/riak/leveldb"}
>>>>>            ]},
>>>>>            {be_blocks, riak_kv_bitcask_backend, [
>>>>>                {data_root, "/var/lib/riak/bitcask"}
>>>>>            ]}
>>>>>        ]},
>>>>>
>>>>> And for hosts with leveldb-only backend:
>>>>>
>>>>> {riak_kv, [
>>>>>             {storage_backend, riak_kv_eleveldb_backend},
>>>>> ...
>>>>> {eleveldb, [
>>>>>             {data_root, "/var/lib/riak/leveldb"}
>>>>> (default values for leveldb)
>>>>>
>>>>> In leveldb logs I see nothing that could help me (no errors in logs).
>>>>>
>>>>>
>>>>> On Mon, Oct 26, 2015 at 3:57 PM Jon Meredith <jmeredith at basho.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I suspect your {error,enotconn} messages are unrelated - that's
>>>>>> likely to be caused by an HTTP client closing the connection while Riak
>>>>>> looks up  some networking information about the requestor.
>>>>>>
>>>>>> The max_concurrency message you are seeing is related to the handoff
>>>>>> transfer limit - it should be labelled as informational. When a node has
>>>>>> data to handoff it starts the handoff sender process and if there are
>>>>>> either too many local handoff processes or too many on the remote side it
>>>>>> exits with max_concurrency.  You could increase with riak-admin
>>>>>> transfer-limit but that probably won't help if you're timing out.
>>>>>>
>>>>>> As you're using the multi-backend you're transferring data from
>>>>>> bitcask and leveldb.  The next place I would look is in the leveldb LOG
>>>>>> files to see if there are any leveldb vnodes that are having problems
>>>>>> that's preventing repair.
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>> On Mon, Oct 26, 2015 at 7:15 AM Vladyslav Zakhozhai <
>>>>>> v.zakhozhai at smartweb.com.ua> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have a problem with persistent timeouts during ownership handoffs.
>>>>>>> I've searched the Internet and this mailing list without success.
>>>>>>>
>>>>>>> I have a Riak 1.4.12 cluster with 17 nodes. Almost all nodes use
>>>>>>> multibackend with bitcask and eleveldb as storage backends (we need
>>>>>>> multiple backends for Riak CS 1.5.0 integration).
>>>>>>>
>>>>>>> Now I'm working to migrate the Riak cluster to eleveldb as the primary
>>>>>>> and only backend. For now I have 2 nodes with the eleveldb backend in
>>>>>>> the same cluster.
>>>>>>>
>>>>>>> During the ownership handoff process I constantly see errors from
>>>>>>> timed-out handoff receivers and senders.
>>>>>>>
>>>>>>> Here is partial output of riak-admin transfers:
>>>>>>> ...
>>>>>>> transfer type: ownership_transfer
>>>>>>> vnode type: riak_kv_vnode
>>>>>>> partition: 331121464707782692405522344912282871640797216768
>>>>>>> started: 2015-10-21 08:32:55 [46.66 min ago]
>>>>>>> last update: no updates seen
>>>>>>> total size: unknown
>>>>>>> objects transferred: unknown
>>>>>>>
>>>>>>>                            unknown
>>>>>>> riak@taipan.pleiad.uaprom  =======>  riak@eggeater.pleiad.uapr
>>>>>>>                                      om
>>>>>>>         |                                           |   0%
>>>>>>>                            unknown
>>>>>>>
>>>>>>> transfer type: ownership_transfer
>>>>>>> vnode type: riak_kv_vnode
>>>>>>> partition: 336830455478606531929755488790080852186328203264
>>>>>>> started: 2015-10-21 08:32:54 [46.68 min ago]
>>>>>>> last update: no updates seen
>>>>>>> total size: unknown
>>>>>>> objects transferred: unknown
>>>>>>> ...
>>>>>>>
>>>>>>> The state of some partition handoffs never updates; others terminate
>>>>>>> after handing off part of the objects and never start again.
>>>>>>>
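With many transfers in flight, the stalled ones can be picked out of this output mechanically. A sketch, assuming the layout shown above (a "partition:" line followed by a "last update:" line in each block); stalled_partitions is a hypothetical helper, not part of riak-admin.

```shell
# Sketch only: read `riak-admin transfers` output on stdin and print the
# partition of every transfer that reports no updates.
stalled_partitions() {
    awk '
        # Remember the partition of the transfer block being read.
        $1 == "partition:" { part = $2 }
        # A block with this line has shown no progress since it started.
        /last update: no updates seen/ { print part }
    '
}
```

Usage would be `riak-admin transfers | stalled_partitions`, run periodically to see whether the same partitions stay stuck.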
>>>>>>> I see nothing in the logs but the following:
>>>>>>>
>>>>>>> On receiver side:
>>>>>>>
>>>>>>> 2015-10-21 11:33:55.131 [error]
>>>>>>> <0.25390.1266>@riak_core_handoff_receiver:handle_info:105 Handoff receiver
>>>>>>> for partition 331121464707782692405522344912282871640797216768 timed out
>>>>>>> after processing 0 objects.
>>>>>>>
>>>>>>> On sender side:
>>>>>>>
>>>>>>> 2015-10-21 11:01:58.879 [error] <0.13177.1401> CRASH REPORT Process
>>>>>>> <0.13177.1401> with 0 neighbours crashed with reason: no function clause
>>>>>>> matching webmachine_request:peer_from_peername({error,enotconn},
>>>>>>> {webmachine_request,{wm_reqstate,#Port<0.50978116>,[],undefined,undefined,undefined,{wm_reqdata,...},...}})
>>>>>>> line 150
>>>>>>> 2015-10-21 11:32:50.055 [error] <0.207.0> Supervisor
>>>>>>> riak_core_handoff_sender_sup had child riak_core_handoff_sender started
>>>>>>> with {riak_core_handoff_sender,start_link,undefined} at <0.22312.1090> exit
>>>>>>> with reason max_concurrency in context child_terminated
>>>>>>>
>>>>>>> {error, enotconn} seems to be a network issue, but I have no
>>>>>>> problems with the network. All hosts resolve their neighbors correctly
>>>>>>> and /etc/hosts on each node is correct.
>>>>>>>
>>>>>>> I've tried increasing handoff_timeout and handoff_receive_timeout,
>>>>>>> without success.
>>>>>>>
>>>>>>> Forcing handoff helped, but only for a short period of time:
>>>>>>>
>>>>>>> rpc:multicall([node() | nodes()], riak_core_vnode_manager, force_handoffs, []).
>>>>>>>
>>>>>>>
>>>>>>> I see handoffs progress (riak-admin transfers), but then they time out again.
>>>>>>>
>>>>>>>
>>>>>>> A week ago I joined 4 nodes with bitcask, and there were no such problems.
>>>>>>>
>>>>>>>
>>>>>>> I'm a little confused and need to understand my next steps in troubleshooting this issue.
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> riak-users mailing list
>>>>>>> riak-users at lists.basho.com
>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>
>> <eggeater-leveldb-logs-old.tar.gz><rattlesnake-leveldb-logs-old.tar.gz>
> <rattlesnake-leveldb-logs.tar.gz><eggeater-leveldb-logs.tar.gz>
>
>
>