repair-2i stops with "bad argument in call to eleveldb:async_write"

Effenberg, Simon seffenberg at team.mobile.de
Fri Aug 8 03:11:36 EDT 2014


Hi @list,

I sent an e-mail yesterday, but because of its size (logfile attached) it
has to be moderated. I will retry with a smaller version, but maybe an
admin can approve the mail?

Cheers
Simon

On Wed, Aug 06, 2014 at 01:08:36PM +0100, bryan hunt wrote:
> Simon,
> 
> If you want more verbose logging, you can change the logging level to debug, run `repair-2i`, and then switch back to the normal logging level.
> 
> - `riak attach`
> - `(riak@nodename)1> SetDebug = fun() -> {node(), lager:set_loglevel(lager_file_backend, "/var/log/riak/console.log", debug)} end.`
> - `(riak@nodename)2> rp(rpc:multicall(erlang, apply, [SetDebug,[]])).`
> (don't forget the period at the end of these statements)
> - Hit CTRL+C twice to quit from the node
> 
> You can then revert back to the normal `info` logging level by running the following command via `riak attach`:
> 
> - `riak attach`
> - `(riak@nodename)1> SetInfo = fun() -> {node(), lager:set_loglevel(lager_file_backend, "/var/log/riak/console.log", info)} end.`
> - `(riak@nodename)2> rp(rpc:multicall(erlang, apply, [SetInfo,[]])).`
> (don't forget the period at the end of these statements)
> - Hit CTRL+C twice to quit from the node
> 
> Please also see the docs for info on `riak attach` monitoring of repairs.
> 
> http://docs.basho.com/riak/1.4.9/ops/running/recovery/repairing-partitions/#Monitoring-Repairs
> 
> Repairs can also be monitored using the `riak-admin transfers` command.
> 
> http://docs.basho.com/riak/1.4.9/ops/running/recovery/repairing-partitions/#Running-a-Repair
> 
> Best Regards,
> 
> Bryan Hunt 
> 
> Bryan Hunt - Client Services Engineer - Basho Technologies Limited - Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> 
> On 6 Aug 2014, at 12:53, Effenberg, Simon <seffenberg at team.mobile.de> wrote:
> 
> > Hi Engel,
> > 
> > I tried it yesterday but it was the same:
> > 
> > 2014-08-05 17:53:14.728 UTC [info] <0.24306.9>@riak_kv_2i_aae:repair_partition:257 Acquired lock on partition 548063113999088594326381812268606132370974703616
> > 2014-08-05 17:53:14.728 UTC [info] <0.24306.9>@riak_kv_2i_aae:repair_partition:259 Repairing indexes in partition 548063113999088594326381812268606132370974703616
> > 2014-08-05 17:53:14.753 UTC [info] <0.24306.9>@riak_kv_2i_aae:create_index_data_db:324 Creating temporary database of 2i data in /var/lib/riak/anti_entropy/2i/tmp_db
> > 2014-08-05 17:53:14.772 UTC [info] <0.24306.9>@riak_kv_2i_aae:create_index_data_db:361 Grabbing all index data for partition 548063113999088594326381812268606132370974703616
> > 2014-08-05 17:58:14.773 UTC [info] <0.24305.9>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >        Total partitions: 1
> >        Finished partitions: 1
> >        Speed: 100
> >        Total 2i items scanned: 0
> >        Total tree objects: 0
> >        Total objects fixed: 0
> > With errors:
> > Partition: 548063113999088594326381812268606132370974703616
> > Error: index_scan_timeout
> > 
> > Can't we use some Erlang commands to execute parts of this manually to check where the timeout actually happens? Or at least what is timing out?
> > 
> > Cheers
> > Simon
> > 
> > On Tue, Aug 05, 2014 at 10:21:57AM -0400, Engel Sanchez wrote:
> >>   Simon:  The data scan for that partition seems to be taking more than 5
> >>   minutes to collect a batch of 1000 items, so the 2i repair process is
> >>   giving up on it before it has a chance to finish.   You can reduce the
> >>   likelihood of this happening by configuring the batch parameter to
> >>   something small.  In the riak_kv section of the configuration file, set
> >>   this:
> >>   {riak_kv, [
> >>      {aae_2i_batch_size, 10},
> >>      ...
> >>   Let us know if that allows it to finish the repair.  You should still look
> >>   into what may be causing the slowness.  A combination of slow disks or
> >>   very large data sets might do it.
> >> 
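The five-minute figure can be checked against the log excerpt quoted above. The following is a minimal sketch, not part of any Riak tooling: it takes the timestamps of the "Grabbing all index data" and "Finished 2i repair" lines from the 2014-08-05 run and computes the gap between them. The helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by the console.log lines quoted in this thread.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Copied from the log excerpt: start of the index scan and the final report.
scan_started = "2014-08-05 17:53:14.772"
repair_finished = "2014-08-05 17:58:14.773"

elapsed = elapsed_seconds(scan_started, repair_finished)
print(f"{elapsed:.3f} seconds between scan start and repair report")
```

The gap comes out at almost exactly 300 seconds, which matches a five-minute wait on the first batch rather than a failure part-way through the scan.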
> >>   On Fri, Aug 1, 2014 at 5:24 AM, Russell Brown <russell.brown at me.com>
> >>   wrote:
> >> 
> >>     Hi Simon,
> >>     Sorry for the delays. I'm on vacation for a couple of days. Will pick
> >>     this up on Monday.
> >> 
> >>     Cheers
> >>     Russell
> >>     On 1 Aug 2014, at 09:56, Effenberg, Simon <seffenberg at team.mobile.de>
> >>     wrote:
> >> 
> >>> Hi Russell, @basho
> >>> 
> >>> any updates on this? We still have the issues with 2i (repair is also
> >>> still not possible), and querying the 2i indexes reproducibly returns
> >>> (for one range I tested) 3 different values.
> >>> 
> >>> I would love to provide anything you need to debug that issue.
> >>> 
> >>> Cheers
> >>> Simon
> >>> 
> >>> On Wed, Jul 30, 2014 at 09:22:56AM +0000, Effenberg, Simon wrote:
> >>>> Great. Thanks Russell..
> >>>> 
> >>>> if you need me to do something.. feel free to ask.
> >>>> 
> >>>> Cheers
> >>>> Simon
> >>>> 
> >>>> On Wed, Jul 30, 2014 at 10:19:56AM +0100, Russell Brown wrote:
> >>>>> Thanks Simon,
> >>>>> 
> >>>>> I'm going to spend some time on this today.
> >>>>> 
> >>>>> Cheers
> >>>>> 
> >>>>> Russell
> >>>>> 
> >>>>> On 30 Jul 2014, at 10:05, Effenberg, Simon
> >>     <seffenberg at team.mobile.de> wrote:
> >>>>> 
> >>>>>> Hi Russell,
> >>>>>> 
> >>>>>> One machine out of 13 is still on wheezy and the rest are on squeeze, but the
> >>>>>> software is the same and Basho even provides the Erlang runtime, so
> >>>>>> there should be no real difference inside the application.
> >>>>>> 
> >>>>>> And the errors are almost the same (except for the async_write/async_get
> >>>>>> difference).
> >>>>>> 
> >>>>>> I paste them:
> >>>>>> 
> >>>>>> ---------- node 1 -----------
> >>>>>> 
> >>>>>> 2014-07-30 06:16:07.728 UTC [info]
> >>     <0.14871.336>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>      Total partitions: 1
> >>>>>>      Finished partitions: 1
> >>>>>>      Speed: 100
> >>>>>>      Total 2i items scanned: 0
> >>>>>>      Total tree objects: 0
> >>>>>>      Total objects fixed: 0
> >>>>>> With errors:
> >>>>>> Partition: 125597796958124469533129165311555572001681702912
> >>>>>> Error: index_scan_timeout
> >>>>>> 
> >>>>>> 
> >>>>>> 2014-07-30 06:16:07.728 UTC [error] <0.1525.0> gen_server <0.1525.0> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
> >>>>>> 2014-07-30 06:16:07.728 UTC [error] <0.1525.0> CRASH REPORT Process <0.1525.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> >>>>>> 2014-07-30 06:16:07.728 UTC [error] <0.1517.0> Supervisor {<0.1517.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1525.0> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated
> >>>>>> 
> >>>>>> 
> >>>>>> ---------- node 2 -----------
> >>>>>> 
> >>>>>> 2014-07-30 06:16:07.791 UTC [info]
> >>     <0.8083.314>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>      Total partitions: 1
> >>>>>>      Finished partitions: 1
> >>>>>>      Speed: 100
> >>>>>>      Total 2i items scanned: 0
> >>>>>>      Total tree objects: 0
> >>>>>>      Total objects fixed: 0
> >>>>>> With errors:
> >>>>>> Partition: 622279994019798508141412682679979879462877528064
> >>>>>> Error: index_scan_timeout
> >>>>>> 
> >>>>>> 
> >>>>>> 2014-07-30 06:16:07.791 UTC [error] <0.1884.0> gen_server <0.1884.0> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
> >>>>>> 2014-07-30 06:16:07.791 UTC [error] <0.1884.0> CRASH REPORT Process <0.1884.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> >>>>>> 2014-07-30 06:16:07.792 UTC [error] <0.1875.0> Supervisor {<0.1875.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1884.0> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated
> >>>>>> 
> >>>>>> ---------- node 3 -----------
> >>>>>> 
> >>>>>> 2014-07-30 06:17:42.679 UTC [info]
> >>     <0.15746.299>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>      Total partitions: 1
> >>>>>>      Finished partitions: 1
> >>>>>>      Speed: 100
> >>>>>>      Total 2i items scanned: 0
> >>>>>>      Total tree objects: 0
> >>>>>>      Total objects fixed: 0
> >>>>>> With errors:
> >>>>>> Partition: 291158529312015815735890337767697007822080311296
> >>>>>> Error: index_scan_timeout
> >>>>>> 
> >>>>>> 
> >>>>>> 2014-07-30 06:17:42.679 UTC [error] <0.975.0> gen_server <0.975.0>
> >>     terminated with reason: bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155
> >>>>>> 2014-07-30 06:17:42.679 UTC [error] <0.975.0> CRASH REPORT Process
> >>     <0.975.0> with 0 neighbours exited with reason: bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> >>>>>> 2014-07-30 06:17:42.679 UTC [error] <0.969.0> Supervisor
> >>     {<0.969.0>,poolboy_sup} had child riak_core_vnode_worker started with
> >>     {riak_core_vnode_worker,start_link,undefined} at <0.975.0> exit with
> >>     reason bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155 in context child_terminated
> >>>>>> 
> >>>>>> ---------- node 4 -----------
> >>>>>> 
> >>>>>> 2014-07-30 06:16:10.004 UTC [info]
> >>     <0.28895.382>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>      Total partitions: 1
> >>>>>>      Finished partitions: 1
> >>>>>>      Speed: 100
> >>>>>>      Total 2i items scanned: 0
> >>>>>>      Total tree objects: 0
> >>>>>>      Total objects fixed: 0
> >>>>>> With errors:
> >>>>>> Partition: 319703483166135013357056057156686910549735243776
> >>>>>> Error: index_scan_timeout
> >>>>>> 
> >>>>>> 
> >>>>>> 2014-07-30 06:16:10.004 UTC [error] <0.1580.0> gen_server
> >>     <0.1580.0> terminated with reason: bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.367.155781>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155
> >>>>>> 2014-07-30 06:16:10.004 UTC [error] <0.1580.0> CRASH REPORT Process
> >>     <0.1580.0> with 0 neighbours exited with reason: bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.367.155781>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> >>>>>> 2014-07-30 06:16:10.005 UTC [error] <0.1570.0> Supervisor
> >>     {<0.1570.0>,poolboy_sup} had child riak_core_vnode_worker started with
> >>     {riak_core_vnode_worker,start_link,undefined} at <0.1580.0> exit with
> >>     reason bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.367.155781>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155 in context child_terminated
> >>>>>> 
> >>>>>> ---------- node 5 -----------
> >>>>>> 
> >>>>>> 2014-07-30 06:16:09.191 UTC [info]
> >>     <0.15985.355>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>      Total partitions: 1
> >>>>>>      Finished partitions: 1
> >>>>>>      Speed: 100
> >>>>>>      Total 2i items scanned: 0
> >>>>>>      Total tree objects: 0
> >>>>>>      Total objects fixed: 0
> >>>>>> With errors:
> >>>>>> Partition: 833512652540280570538039006158505159647524028416
> >>>>>> Error: index_scan_timeout
> >>>>>> 
> >>>>>> 
> >>>>>> 2014-07-30 06:16:09.191 UTC [error] <0.1601.0> gen_server
> >>     <0.1601.0> terminated with reason: bad argument in call to
> >>     eleveldb:async_get(#Ref<0.0.351.26505>, <<>>,
> >>     <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> >>     []) in eleveldb:get/3 line 143
> >>>>>> 2014-07-30 06:16:09.191 UTC [error] <0.1601.0> CRASH REPORT Process
> >>     <0.1601.0> with 0 neighbours exited with reason: bad argument in call to
> >>     eleveldb:async_get(#Ref<0.0.351.26505>, <<>>,
> >>     <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> >>     []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747
> >>>>>> 2014-07-30 06:16:09.192 UTC [error] <0.1598.0> Supervisor
> >>     {<0.1598.0>,poolboy_sup} had child riak_core_vnode_worker started with
> >>     {riak_core_vnode_worker,start_link,undefined} at <0.1601.0> exit with
> >>     reason bad argument in call to eleveldb:async_get(#Ref<0.0.351.26505>,
> >>     <<>>,
> >>     <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> >>     []) in eleveldb:get/3 line 143 in context child_terminated
> >>>>>> 
> >>>>>> ---------- node 6 -----------
> >>>>>> 
> >>>>>> 2014-07-30 06:16:09.154 UTC [info]
> >>     <0.32042.379>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>      Total partitions: 1
> >>>>>>      Finished partitions: 1
> >>>>>>      Speed: 100
> >>>>>>      Total 2i items scanned: 0
> >>>>>>      Total tree objects: 0
> >>>>>>      Total objects fixed: 0
> >>>>>> With errors:
> >>>>>> Partition: 34253944624943037145398863266787883273185918976
> >>>>>> Error: index_scan_timeout
> >>>>>> 
> >>>>>> 
> >>>>>> 2014-07-30 06:16:09.154 UTC [error] <0.4086.0> gen_server
> >>     <0.4086.0> terminated with reason: bad argument in call to
> >>     eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>,
> >>     <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> >>     []) in eleveldb:get/3 line 143
> >>>>>> 2014-07-30 06:16:09.154 UTC [error] <0.4086.0> CRASH REPORT Process
> >>     <0.4086.0> with 0 neighbours exited with reason: bad argument in call to
> >>     eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>,
> >>     <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> >>     []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747
> >>>>>> 2014-07-30 06:16:09.154 UTC [error] <0.4085.0> Supervisor
> >>     {<0.4085.0>,poolboy_sup} had child riak_core_vnode_worker started with
> >>     {riak_core_vnode_worker,start_link,undefined} at <0.4086.0> exit with
> >>     reason bad argument in call to eleveldb:async_get(#Ref<0.0.2698.198008>,
> >>     <<>>,
> >>     <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>,
> >>     []) in eleveldb:get/3 line 143 in context child_terminated
> >>>>>> 
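The byte list that appears in every crash report above starts with 131, which is the version tag of the Erlang external term format (what `term_to_binary/1` produces). A minimal sketch of decoding the visible prefix follows. It only handles the three tags seen here (131 version, 104 small tuple, 109 binary), the function name `decode_prefix` is made up for illustration, and reading the decoded text as the start of a bucket name is an assumption rather than something the logs state.

```python
import struct

# Bytes copied from the async_get crash reports above; the log truncates the rest with "...".
PREFIX = bytes([
    131, 104, 2, 109, 0, 0, 0, 20,
    99, 111, 110, 118, 101, 114, 115, 97, 116, 105, 111, 110,
    95, 115, 101, 99, 114, 101, 116,
])


def decode_prefix(data: bytes) -> dict:
    """Decode a (possibly truncated) term of the form {<<Binary>>, ...}."""
    if data[0] != 131:
        raise ValueError("not an Erlang external term (missing version tag 131)")
    if data[1] != 104:
        raise ValueError("expected SMALL_TUPLE_EXT (tag 104)")
    arity = data[2]
    if data[3] != 109:
        raise ValueError("expected BINARY_EXT (tag 109) as first tuple element")
    # BINARY_EXT carries a 4-byte big-endian length before the payload.
    (length,) = struct.unpack(">I", data[4:8])
    visible = data[8:8 + length]
    return {
        "arity": arity,
        "declared_length": length,
        "visible": visible.decode("ascii"),
        "missing_bytes": length - len(visible),
    }


info = decode_prefix(PREFIX)
print(info)
```

The prefix decodes to a 2-tuple whose first element is a 20-byte binary beginning with `conversation_secret`; one byte of that binary is cut off by the log truncation, so the full value is not recoverable from these lines.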
> >>>>>> On Wed, Jul 30, 2014 at 09:50:22AM +0100, Russell Brown wrote:
> >>>>>>> Hi Simon,
> >>>>>>> So the earlier "this is on wheezy, rest are on squeeze" thing is
> >>     no longer a factor?
> >>>>>>> 
> >>>>>>> Any and all 2i repair you do ends with the same error?
> >>>>>>> 
> >>>>>>> Cheers
> >>>>>>> 
> >>>>>>> Russell
> >>>>>>> 
> >>>>>>> On 30 Jul 2014, at 07:29, Effenberg, Simon
> >>     <seffenberg at team.mobile.de> wrote:
> >>>>>>> 
> >>>>>>>> I tried it now with one partition on 6 different machines and
> >>     everywhere the same result: index_scan_timeout and the info: bad
> >>     argument in call to eleveldb:async_get (2x) or async_write (4x).
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> Sent from Samsung Mobile
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> -------- Original message --------
> >>>>>>>> From: "Effenberg, Simon"
> >>>>>>>> Date: 30.07.2014 07:49 (GMT+01:00)
> >>>>>>>> To: bryan hunt
> >>>>>>>> Cc: riak-users at lists.basho.com
> >>>>>>>> Subject: RE: repair-2i stops with "bad argument in call to eleveldb:async_write"
> >>>>>>>> 
> >>>>>>>> Hi,
> >>>>>>>> 
> >>>>>>>> I tried it on two different nodes with one partition each. Both
> >>     multiple times before the upgrade and after the upgrade.
> >>>>>>>> 
> >>>>>>>> I will try it on other machines in a minute but because I tried
> >>     it already on two different nodes and one of them is 2 weeks old and
> >>     stored on a HP 3par I bet that this is not a disk corruption issue..
> >>>>>>>> 
> >>>>>>>> Simon
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> Sent from Samsung Mobile
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> -------- Original message --------
> >>>>>>>> From: bryan hunt
> >>>>>>>> Date: 29.07.2014 18:21 (GMT+01:00)
> >>>>>>>> To: "Effenberg, Simon"
> >>>>>>>> Cc: riak-users at lists.basho.com
> >>>>>>>> Subject: Re: repair-2i stops with "bad argument in call to eleveldb:async_write"
> >>>>>>>> 
> >>>>>>>> Hi Simon,
> >>>>>>>> 
> >>>>>>>> Does the problem persist if you run it again?
> >>>>>>>> 
> >>>>>>>> Does it happen if you run it against any other partition?
> >>>>>>>> 
> >>>>>>>> Best Regards,
> >>>>>>>> 
> >>>>>>>> Bryan
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> Bryan Hunt - Client Services Engineer - Basho Technologies
> >>     Limited - Registered Office - 8 Lincoln's Inn Fields London WC2A 3BP Reg
> >>     07970431
> >>>>>>>> 
> >>>>>>>> On 29 Jul 2014, at 09:35, Effenberg, Simon
> >>     <seffenberg at team.mobile.de> wrote:
> >>>>>>>> 
> >>>>>>>>> Hi,
> >>>>>>>>> 
> >>>>>>>>> we have some issues with 2i queries like that:
> >>>>>>>>> 
> >>>>>>>>> seffenberg at kriak46-1:~$ while :; do curl -s
> >>     localhost:8098/buckets/conversation/index/createdat_int/0/23182680 |
> >>     ruby -rjson -e "o = JSON.parse(STDIN.read); puts o['keys'].size"; sleep
> >>     1; done
> >>>>>>>>> 
> >>>>>>>>> 13853
> >>>>>>>>> 13853
> >>>>>>>>> 0
> >>>>>>>>> 557
> >>>>>>>>> 557
> >>>>>>>>> 557
> >>>>>>>>> 13853
> >>>>>>>>> 0
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> ...
> >>>>>>>>> 
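The loop output above can be summarised by tallying how often each result size occurs. The following is a minimal sketch, assuming the HTTP endpoint and the `keys` field shown in the curl/ruby one-liner; the function names `count_keys` and `tally` are made up for illustration, and the network helper is defined but not called, so the tally runs on the sizes copied from the output above.

```python
import json
from collections import Counter
from urllib.request import urlopen

# URL taken from the curl loop above; adjust host and index range as needed.
INDEX_URL = "http://localhost:8098/buckets/conversation/index/createdat_int/0/23182680"


def count_keys(url: str = INDEX_URL) -> int:
    """Run one 2i range query and return how many keys came back."""
    with urlopen(url) as response:
        return len(json.load(response)["keys"])


def tally(sizes):
    """Count how often each result size was observed."""
    return Counter(sizes)


# Result sizes copied from the loop output above (no network access needed).
observed = [13853, 13853, 0, 557, 557, 557, 13853, 0]
summary = tally(observed)
for size, times in summary.most_common():
    print(f"{size} keys returned {times} time(s)")
```

Against a live node, `tally(count_keys() for _ in range(20))` would collect the same summary directly; a consistent index should yield a single distinct size for an unchanged range.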
> >>>>>>>>> So I tried to start a repair-2i, first on one vnode/partition on one node
> >>>>>>>>> (which is quite new in the cluster, 2 weeks or so).
> >>>>>>>>> 
> >>>>>>>>> The command is failing with the following log entries:
> >>>>>>>>> 
> >>>>>>>>> seffenberg at kriak46-7:~$ sudo riak-admin repair-2i
> >>     22835963083295358096932575511191922182123945984
> >>>>>>>>> Will repair 2i on these partitions:
> >>>>>>>>>     22835963083295358096932575511191922182123945984
> >>>>>>>>> Watch the logs for 2i repair progress reports
> >>>>>>>>> seffenberg at kriak46-7:~$ 2014-07-29 08:20:22.729 UTC [info]
> >>     <0.5929.1061>@riak_kv_2i_aae:init:139 Starting 2i repair at speed 100
> >>     for partitions [22835963083295358096932575511191922182123945984]
> >>>>>>>>> 2014-07-29 08:20:22.729 UTC [info]
> >>     <0.5930.1061>@riak_kv_2i_aae:repair_partition:257 Acquired lock on
> >>     partition 22835963083295358096932575511191922182123945984
> >>>>>>>>> 2014-07-29 08:20:22.729 UTC [info]
> >>     <0.5930.1061>@riak_kv_2i_aae:repair_partition:259 Repairing indexes in
> >>     partition 22835963083295358096932575511191922182123945984
> >>>>>>>>> 2014-07-29 08:20:22.740 UTC [info]
> >>     <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:324 Creating temporary
> >>     database of 2i data in /var/lib/riak/anti_entropy/2i/tmp_db
> >>>>>>>>> 2014-07-29 08:20:22.751 UTC [info]
> >>     <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:361 Grabbing all index
> >>     data for partition 22835963083295358096932575511191922182123945984
> >>>>>>>>> 2014-07-29 08:25:22.752 UTC [info]
> >>     <0.5929.1061>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> >>>>>>>>>     Total partitions: 1
> >>>>>>>>>     Finished partitions: 1
> >>>>>>>>>     Speed: 100
> >>>>>>>>>     Total 2i items scanned: 0
> >>>>>>>>>     Total tree objects: 0
> >>>>>>>>>     Total objects fixed: 0
> >>>>>>>>> With errors:
> >>>>>>>>> Partition: 22835963083295358096932575511191922182123945984
> >>>>>>>>> Error: index_scan_timeout
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> 2014-07-29 08:25:22.752 UTC [error] <0.4711.1061> gen_server
> >>     <0.4711.1061> terminated with reason: bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155
> >>>>>>>>> 2014-07-29 08:25:22.753 UTC [error] <0.4711.1061> CRASH REPORT
> >>     Process <0.4711.1061> with 0 neighbours exited with reason: bad argument
> >>     in call to eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> >>>>>>>>> 2014-07-29 08:25:22.753 UTC [error] <0.1031.0> Supervisor
> >>     {<0.1031.0>,poolboy_sup} had child riak_core_vnode_worker started with
> >>     {riak_core_vnode_worker,start_link,undefined} at <0.4711.1061> exit with
> >>     reason bad argument in call to
> >>     eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>,
> >>     [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}],
> >>     []) in eleveldb:write/3 line 155 in context child_terminated
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> Anything I can do about that? What's the issue here?
> >>>>>>>>> 
> >>>>>>>>> I'm using Riak 1.4.8 (.deb package).
> >>>>>>>>> 
> >>>>>>>>> Cheers
> >>>>>>>>> Simon
> >>>>>>>>> _______________________________________________
> >>>>>>>>> riak-users mailing list
> >>>>>>>>> riak-users at lists.basho.com
> >>>>>>>>> 
> >>     http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >>>>>>>> 
> >>>>>>> 
> >>>>>> 
> >>>>>> --
> >>>>>> Simon Effenberg | Site Op | mobile.international GmbH
> >>>>>> 
> >>>>>> Phone:    + 49. 30. 8109. 7173
> >>>>>> M-Phone:  + 49. 151. 5266. 1558
> >>>>>> Mail:     seffenberg at team.mobile.de
> >>>>>> Web:      www.mobile.de
> >>>>>> 
> >>>>>> Marktplatz 1 | 14532 Europarc Dreilinden | Germany
> >>>>>> 
> >>>>>> ______________________________________________________
> >>>>>> Geschäftsführer: Malte Krüger
> >>>>>> HRB Nr.: 18517 P, Amtsgericht Potsdam
> >>>>>> Sitz der Gesellschaft: Kleinmachnow
> >>>>>> ______________________________________________________
> >>>>> 
> >>>> 
> >>> 
> >> 
> > 
> 

-- 
Simon Effenberg | Site Op | mobile.international GmbH

Phone:    + 49. 30. 8109. 7173
M-Phone:  + 49. 151. 5266. 1558
Mail:     seffenberg at team.mobile.de
Web:      www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany

______________________________________________________
Geschäftsführer: Malte Krüger
HRB Nr.: 18517 P, Amtsgericht Potsdam
Sitz der Gesellschaft: Kleinmachnow
______________________________________________________

