riak not starting properly

Grant Schofield grant at basho.com
Fri Oct 29 10:22:39 EDT 2010


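This looks like a ulimit issue: the emfile error in your log means the node ran out of file descriptors. Can you try raising the limit by running the following command in the shell that starts Riak: ulimit -n 2056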
Are you on OS X or Linux?
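
If it helps to confirm the diagnosis first, here is a minimal Python sketch (not part of Riak) that reports the open-file limit for the current process and counts the files under a data directory. The path /var/lib/riak/bitcask is only an assumption about where the Debian package keeps its data, and the helper names are made up for illustration. A limit set with ulimit -n applies only to the current shell and the processes started from it, so run this from the same shell you would start the node from.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_files(root):
    """Count non-directory entries under root: a rough upper bound on the
    descriptors a storage backend would need if it opened every file at once."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def raise_soft_limit(target):
    """Raise the soft limit toward target, capped at the hard limit.
    Like `ulimit -n`, this affects only this process and its children."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("open-file limit: soft=%s hard=%s" % (soft, hard))

    data_dir = "/var/lib/riak/bitcask"  # assumed location; adjust to your install
    if os.path.isdir(data_dir):
        print("files under %s: %d" % (data_dir, count_files(data_dir)))
```

If the file count is close to or above the soft limit, that would be consistent with the emfile failures in bitcask:scan_key_files.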

Grant


On Oct 28, 2010, at 11:25 PM, Richard Heycock wrote:

> Hi Grant,
> 
> Sorry for taking so long getting back to you; I had to go
> overseas at short notice.
> 
> Excerpts from Grant Schofield's message of 2010-09-18 02:13:04 +1000:
>> It looks like there are some strange things happening in the logs with the ring, as well as timeouts. I would be curious how long the process takes to die if you were to run just a stop instead of a restart. Did you at one time change the name of this node?
> 
> It takes about 4.5 seconds (wall time).
> 
> I haven't changed the name of the node and the app.config is as per the
> Debian package.
> 
> I should also add that there is only one node.
> 
>> One thing that might be interesting to try is to stop the server, make a copy of your data directory, remove all the data in the data directory, and try to start and stop the node and see if it works more reliably.
> 
> I tried removing all the data from the data directory as you suggested
> and for a while it worked but the problem has started again.
> 
> The bitcask directory is 1.5G in size and the ring directory is 24K. The
> bucket properties are:
> 
>    {
>      "props": {
>        "name": "uris",
>        "n_val": 3,
>        "allow_mult": false,
>        "last_write_wins": false,
>        "precommit": [
> 
>        ],
>        "postcommit": [
> 
>        ],
>        "chash_keyfun": {
>          "mod": "riak_core_util",
>          "fun": "chash_std_keyfun"
>        },
>        "linkfun": {
>          "mod": "riak_kv_wm_link_walker",
>          "fun": "mapreduce_linkfun"
>        },
>        "old_vclock": 86400,
>        "young_vclock": 20,
>        "big_vclock": 50,
>        "small_vclock": 10,
>        "r": "quorum",
>        "w": "quorum",
>        "dw": "quorum",
>        "rw": "quorum"
>      }
>    }
> 
> I also tried to get the properties with the keys=true option but after
> 10 minutes of no activity (the 1, 5 and 10 minute load averages were
> all zero) I killed the process. The only indication of any activity was
> the following log message every minute:
> 
>    ERROR REPORT==== 29-Oct-2010::04:19:43 ===
>    ** Generic server riak_kv_vnode_master terminating 
>    ** Last message in was {'$gen_cast',
>                               {riak_vnode_req_v1,
>                                   1141798154164767904846628775559596109106197299200,
>                                   ignore,
>                                   {riak_kv_listkeys_req_v2,<<"uris">>,92166134,
>                                       <0.2635.0>}}}
>    ** When Server state == {state,679956,[],undefined,riak_kv_vnode,
>                                   riak_kv_legacy_vnode}
>    ** Reason for termination == 
>    ** {{badmatch,{error,{{badmatch,{error,emfile}},
>                          [{bitcask,scan_key_files,3},
>                           {bitcask,open,2},
>                           {riak_kv_bitcask_backend,start,2},
>                           {riak_kv_vnode,init,1},
>                           {riak_core_vnode,init,1},
>                           {gen_fsm,init_it,6},
>                           {proc_lib,init_p_do_apply,3}]}}},
>        [{riak_core_vnode_master,get_vnode,2},
>         {riak_core_vnode_master,handle_cast,2},
>         {gen_server,handle_msg,5},
>         {proc_lib,init_p_do_apply,3}]}
> 
> 
> The logs can be found here:
> 
>    http://stuff.roughage.com.au/riak-failure-3.log.tar.gz
> 
> rgh
> 
> 
>> Grant Schofield
>> Developer Advocate
>> Basho Technologies
>> 
>> On Sep 16, 2010, at 4:53 PM, Richard Heycock wrote:
>> 
>>> Over the last few weeks I've been finding it harder and harder to start
>>> riak, which, given that it's running on an auto-provisioned EC2 instance,
>>> is a bit of an issue! I can generally restart it by running
>>> /etc/init.d/riak restart but it's got to the stage where I have to run
>>> it four or five times. I should clarify here that when I say "harder to
>>> start" it does start but as soon as I try to do anything it fails.
>>> 
>>> The contents of /var/log/riak are here:
>>> 
>>>   http://stuff.roughage.com.au/riak-failure-2.log.tar.gz
>>> 
>>> rgh
>>> -- 
>>> Richard Heycock
>>> 
>>> http://topikality.com
>>> 
>>> +61 (0) 410 646 369
>>> [e]:  rgh at topikality.com
>>> [im]: rgh at topikality.com
>>> 
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
> -- 
> Richard Heycock
> 
> http://topikality.com
> 
> +61 (0) 410 646 369
> [e]:  rgh at topikality.com
> [im]: rgh at topikality.com
> 




