repair-2i stops with "bad argument in call to eleveldb:async_write"

Effenberg, Simon seffenberg at team.mobile.de
Wed Jul 30 05:05:51 EDT 2014


Hi Russell,

Still, one machine out of 13 is on wheezy and the rest are on squeeze, but
the software is the same and Basho even ships the Erlang runtime itself. So
there should be no real difference inside the application.

And the errors are almost the same on every node (apart from the
async_write vs. async_get difference).

Here they are:

---------- node 1 -----------

2014-07-30 06:16:07.728 UTC [info] <0.14871.336>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
        Total partitions: 1
        Finished partitions: 1
        Speed: 100
        Total 2i items scanned: 0
        Total tree objects: 0
        Total objects fixed: 0
With errors:
Partition: 125597796958124469533129165311555572001681702912
Error: index_scan_timeout


2014-07-30 06:16:07.728 UTC [error] <0.1525.0> gen_server <0.1525.0> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
2014-07-30 06:16:07.728 UTC [error] <0.1525.0> CRASH REPORT Process <0.1525.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
2014-07-30 06:16:07.728 UTC [error] <0.1517.0> Supervisor {<0.1517.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1525.0> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated


---------- node 2 -----------

2014-07-30 06:16:07.791 UTC [info] <0.8083.314>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
        Total partitions: 1
        Finished partitions: 1
        Speed: 100
        Total 2i items scanned: 0
        Total tree objects: 0
        Total objects fixed: 0
With errors:
Partition: 622279994019798508141412682679979879462877528064
Error: index_scan_timeout


2014-07-30 06:16:07.791 UTC [error] <0.1884.0> gen_server <0.1884.0> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
2014-07-30 06:16:07.791 UTC [error] <0.1884.0> CRASH REPORT Process <0.1884.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
2014-07-30 06:16:07.792 UTC [error] <0.1875.0> Supervisor {<0.1875.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1884.0> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated

---------- node 3 -----------

2014-07-30 06:17:42.679 UTC [info] <0.15746.299>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
        Total partitions: 1
        Finished partitions: 1
        Speed: 100
        Total 2i items scanned: 0
        Total tree objects: 0
        Total objects fixed: 0
With errors:
Partition: 291158529312015815735890337767697007822080311296
Error: index_scan_timeout


2014-07-30 06:17:42.679 UTC [error] <0.975.0> gen_server <0.975.0> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
2014-07-30 06:17:42.679 UTC [error] <0.975.0> CRASH REPORT Process <0.975.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
2014-07-30 06:17:42.679 UTC [error] <0.969.0> Supervisor {<0.969.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.975.0> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.2075.159423>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated

---------- node 4 -----------

2014-07-30 06:16:10.004 UTC [info] <0.28895.382>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
        Total partitions: 1
        Finished partitions: 1
        Speed: 100
        Total 2i items scanned: 0
        Total tree objects: 0
        Total objects fixed: 0
With errors:
Partition: 319703483166135013357056057156686910549735243776
Error: index_scan_timeout


2014-07-30 06:16:10.004 UTC [error] <0.1580.0> gen_server <0.1580.0> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.367.155781>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
2014-07-30 06:16:10.004 UTC [error] <0.1580.0> CRASH REPORT Process <0.1580.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.367.155781>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
2014-07-30 06:16:10.005 UTC [error] <0.1570.0> Supervisor {<0.1570.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1580.0> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.367.155781>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated

---------- node 5 -----------

2014-07-30 06:16:09.191 UTC [info] <0.15985.355>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
        Total partitions: 1
        Finished partitions: 1
        Speed: 100
        Total 2i items scanned: 0
        Total tree objects: 0
        Total objects fixed: 0
With errors:
Partition: 833512652540280570538039006158505159647524028416
Error: index_scan_timeout


2014-07-30 06:16:09.191 UTC [error] <0.1601.0> gen_server <0.1601.0> terminated with reason: bad argument in call to eleveldb:async_get(#Ref<0.0.351.26505>, <<>>, <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, []) in eleveldb:get/3 line 143
2014-07-30 06:16:09.191 UTC [error] <0.1601.0> CRASH REPORT Process <0.1601.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_get(#Ref<0.0.351.26505>, <<>>, <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747
2014-07-30 06:16:09.192 UTC [error] <0.1598.0> Supervisor {<0.1598.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.1601.0> exit with reason bad argument in call to eleveldb:async_get(#Ref<0.0.351.26505>, <<>>, <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, []) in eleveldb:get/3 line 143 in context child_terminated

---------- node 6 -----------

2014-07-30 06:16:09.154 UTC [info] <0.32042.379>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
        Total partitions: 1
        Finished partitions: 1
        Speed: 100
        Total 2i items scanned: 0
        Total tree objects: 0
        Total objects fixed: 0
With errors:
Partition: 34253944624943037145398863266787883273185918976
Error: index_scan_timeout


2014-07-30 06:16:09.154 UTC [error] <0.4086.0> gen_server <0.4086.0> terminated with reason: bad argument in call to eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>, <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, []) in eleveldb:get/3 line 143
2014-07-30 06:16:09.154 UTC [error] <0.4086.0> CRASH REPORT Process <0.4086.0> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>, <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747
2014-07-30 06:16:09.154 UTC [error] <0.4085.0> Supervisor {<0.4085.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.4086.0> exit with reason bad argument in call to eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>, <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, []) in eleveldb:get/3 line 143 in context child_terminated
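
The key argument in all of these calls starts with the same bytes. As a side note, here is a minimal Python sketch for reading that prefix, assuming it is a term in Erlang's external term format (the leading 131 suggests so). The helper name decode_prefix is made up for illustration, and because the log truncates the binary, only the start of the first tuple element can be recovered:

```python
# Bytes copied from the truncated key in the async_get log lines above.
PREFIX = bytes([131, 104, 2, 109, 0, 0, 0, 20,
                99, 111, 110, 118, 101, 114, 115, 97, 116, 105, 111,
                110, 95, 115, 101, 99, 114, 101, 116])


def decode_prefix(data):
    """Decode the start of an external-term-format tuple (hypothetical helper).

    Handles only the tags seen in the log: 131 (format version),
    104 (SMALL_TUPLE_EXT) and 109 (BINARY_EXT). Returns the tuple arity,
    the declared length of the first element, and the bytes of that
    element that survived the truncation.
    """
    if data[0] != 131:
        raise ValueError("not external term format")
    if data[1] != 104:
        raise ValueError("expected a small tuple")
    arity = data[2]
    if data[3] != 109:
        raise ValueError("expected a binary as the first element")
    declared_len = int.from_bytes(data[4:8], "big")
    visible = data[8:8 + declared_len]
    return arity, declared_len, visible


arity, declared_len, visible = decode_prefix(PREFIX)
print(arity, declared_len, visible.decode("ascii"))
```

Under that assumption the key is a 2-tuple whose first element is a 20-byte binary beginning with "conversation_secret"; the last byte and the second element are cut off in the log, so nothing more can be said about them from these lines.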

On Wed, Jul 30, 2014 at 09:50:22AM +0100, Russell Brown wrote:
> Hi Simon, 
> So the earlier “this is on wheezy, rest are on squeeze” thing is no longer a factor?
> 
> Any and all 2i repair you do ends with the same error?
> 
> Cheers
> 
> Russell
> 
> On 30 Jul 2014, at 07:29, Effenberg, Simon <seffenberg at team.mobile.de> wrote:
> 
> > I have now tried it with one partition on 6 different machines, and the result is the same everywhere: index_scan_timeout plus the error "bad argument in call to eleveldb:async_get" (2x) or eleveldb:async_write (4x).
> > 
> > 
> > Sent from Samsung Mobile
> > 
> > 
> > -------- Original message --------
> > From: "Effenberg, Simon"
> > Date: 30.07.2014 07:49 (GMT+01:00)
> > To: bryan hunt
> > Cc: riak-users at lists.basho.com
> > Subject: Re: repair-2i stops with "bad argument in call to eleveldb:async_write"
> > 
> > Hi,
> > 
> > I tried it on two different nodes with one partition each, multiple times on both, before and after the upgrade.
> > 
> > I will try it on other machines in a minute, but since I have already tried it on two different nodes, and one of them is only 2 weeks old and stored on an HP 3PAR, I bet this is not a disk corruption issue.
> > 
> > Simon
> > 
> > 
> > Sent from Samsung Mobile
> > 
> > 
> > -------- Original message --------
> > From: bryan hunt
> > Date: 29.07.2014 18:21 (GMT+01:00)
> > To: "Effenberg, Simon"
> > Cc: riak-users at lists.basho.com
> > Subject: Re: repair-2i stops with "bad argument in call to eleveldb:async_write"
> > 
> > Hi Simon,
> > 
> > Does the problem persist if you run it again? 
> > 
> > Does it happen if you run it against any other partition?
> > 
> > Best Regards,
> > 
> > Bryan
> > 
> > 
> > 
> > Bryan Hunt - Client Services Engineer - Basho Technologies Limited - Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> > 
> > On 29 Jul 2014, at 09:35, Effenberg, Simon <seffenberg at team.mobile.de> wrote:
> > 
> > > Hi,
> > > 
> > > we have some issues with 2i queries: the same query returns a different number of keys on each run, like this:
> > > 
> > > seffenberg at kriak46-1:~$ while :; do curl -s localhost:8098/buckets/conversation/index/createdat_int/0/23182680 | ruby -rjson -e "o = JSON.parse(STDIN.read); puts o['keys'].size"; sleep 1; done
> > > 
> > > 13853
> > > 13853
> > > 0
> > > 557
> > > 557
> > > 557
> > > 13853
> > > 0
> > > 
> > > 
> > > ...
> > > 
> > > So I tried to start a repair-2i, first on one vnode/partition on one node
> > > (which is quite new in the cluster, 2 weeks or so).
> > > 
> > > The command is failing with the following log entries:
> > > 
> > > seffenberg at kriak46-7:~$ sudo riak-admin repair-2i 22835963083295358096932575511191922182123945984
> > > Will repair 2i on these partitions:
> > >        22835963083295358096932575511191922182123945984
> > > Watch the logs for 2i repair progress reports
> > > seffenberg at kriak46-7:~$ 2014-07-29 08:20:22.729 UTC [info] <0.5929.1061>@riak_kv_2i_aae:init:139 Starting 2i repair at speed 100 for partitions [22835963083295358096932575511191922182123945984]
> > > 2014-07-29 08:20:22.729 UTC [info] <0.5930.1061>@riak_kv_2i_aae:repair_partition:257 Acquired lock on partition 22835963083295358096932575511191922182123945984
> > > 2014-07-29 08:20:22.729 UTC [info] <0.5930.1061>@riak_kv_2i_aae:repair_partition:259 Repairing indexes in partition 22835963083295358096932575511191922182123945984
> > > 2014-07-29 08:20:22.740 UTC [info] <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:324 Creating temporary database of 2i data in /var/lib/riak/anti_entropy/2i/tmp_db
> > > 2014-07-29 08:20:22.751 UTC [info] <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:361 Grabbing all index data for partition 22835963083295358096932575511191922182123945984
> > > 2014-07-29 08:25:22.752 UTC [info] <0.5929.1061>@riak_kv_2i_aae:next_partition:160 Finished 2i repair:
> > >        Total partitions: 1
> > >        Finished partitions: 1
> > >        Speed: 100
> > >        Total 2i items scanned: 0
> > >        Total tree objects: 0
> > >        Total objects fixed: 0
> > > With errors:
> > > Partition: 22835963083295358096932575511191922182123945984
> > > Error: index_scan_timeout
> > > 
> > > 
> > > 2014-07-29 08:25:22.752 UTC [error] <0.4711.1061> gen_server <0.4711.1061> terminated with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155
> > > 2014-07-29 08:25:22.753 UTC [error] <0.4711.1061> CRASH REPORT Process <0.4711.1061> with 0 neighbours exited with reason: bad argument in call to eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747
> > > 2014-07-29 08:25:22.753 UTC [error] <0.1031.0> Supervisor {<0.1031.0>,poolboy_sup} had child riak_core_vnode_worker started with {riak_core_vnode_worker,start_link,undefined} at <0.4711.1061> exit with reason bad argument in call to eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>, [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 line 155 in context child_terminated
> > > 
> > > 
> > > Anything I can do about that? What's the issue here?
> > > 
> > > I'm using Riak 1.4.8 (.deb package).
> > > 
> > > Cheers
> > > Simon
> > > _______________________________________________
> > > riak-users mailing list
> > > riak-users at lists.basho.com
> > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 

-- 
Simon Effenberg | Site Op | mobile.international GmbH

Phone:    + 49. 30. 8109. 7173
M-Phone:  + 49. 151. 5266. 1558
Mail:     seffenberg at team.mobile.de
Web:      www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany

______________________________________________________
Managing Director: Malte Krüger
Commercial register no.: HRB 18517 P, Amtsgericht Potsdam
Registered office: Kleinmachnow
______________________________________________________

