riak not starting properly

Richard Heycock rgh at topikality.com
Fri Oct 29 00:25:36 EDT 2010


Hi Grant,

Sorry for taking so long to get back to you; I had to go overseas at
short notice.

Excerpts from Grant Schofield's message of 2010-09-18 02:13:04 +1000:
> It looks like there are some strange things happening in the logs with the ring, as well as timeouts. I would be curious how long the process takes to die if you were just to run a stop instead of a restart. Did you at one time change the name of this node?

It takes about 4.5 seconds (wall time).

I haven't changed the name of the node and the app.config is as per the
Debian package.

I should also mention that there is only one node.

> One thing that might be interesting to try is to stop the server, make a copy of your data directory, remove all the data in the data directory, and try to start and stop the node and see if it works more reliably.

I tried removing all the data from the data directory as you suggested
and for a while it worked, but the problem has started again.

The bitcask directory is 1.5G in size and the ring directory is 24K. The
bucket properties are:

    {
      "props": {
        "name": "uris",
        "n_val": 3,
        "allow_mult": false,
        "last_write_wins": false,
        "precommit": [

        ],
        "postcommit": [

        ],
        "chash_keyfun": {
          "mod": "riak_core_util",
          "fun": "chash_std_keyfun"
        },
        "linkfun": {
          "mod": "riak_kv_wm_link_walker",
          "fun": "mapreduce_linkfun"
        },
        "old_vclock": 86400,
        "young_vclock": 20,
        "big_vclock": 50,
        "small_vclock": 10,
        "r": "quorum",
        "w": "quorum",
        "dw": "quorum",
        "rw": "quorum"
      }
    }
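
For reference, here is a rough sketch of how I read the symbolic
"quorum" values in those properties. It is not taken from Riak itself;
it assumes that "quorum" means a majority of n_val (n_val div 2, plus 1)
and that "one" and "all" are the other symbolic values. The function
name resolve_quorum is made up for illustration.

```python
import json


def resolve_quorum(props):
    """Turn symbolic r/w/dw/rw settings into replica counts.

    Assumption: "quorum" is a simple majority of n_val, "one" is 1
    and "all" is n_val. Integer settings are passed through as-is.
    """
    n_val = props["n_val"]
    symbolic = {
        "one": 1,
        "quorum": n_val // 2 + 1,
        "all": n_val,
    }
    resolved = {}
    for key in ("r", "w", "dw", "rw"):
        value = props[key]
        if isinstance(value, str):
            resolved[key] = symbolic[value]
        else:
            resolved[key] = int(value)
    return resolved


if __name__ == "__main__":
    # Only the fields relevant to quorum, copied from the bucket above.
    props = json.loads(
        '{"n_val": 3, "r": "quorum", "w": "quorum",'
        ' "dw": "quorum", "rw": "quorum"}'
    )
    print(resolve_quorum(props))
```

With n_val at 3 each of those works out to 2 replicas, all of which
live on the same machine here since there is only one node.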

I also tried to get the properties with the keys=true option but after
10 minutes of no activity (the 1, 5 and 10 minute load averages were
all zero) I killed the process. The only indication of any activity was
the following log message every minute:

    ERROR REPORT==== 29-Oct-2010::04:19:43 ===
    ** Generic server riak_kv_vnode_master terminating 
    ** Last message in was {'$gen_cast',
                               {riak_vnode_req_v1,
                                   1141798154164767904846628775559596109106197299200,
                                   ignore,
                                   {riak_kv_listkeys_req_v2,<<"uris">>,92166134,
                                       <0.2635.0>}}}
    ** When Server state == {state,679956,[],undefined,riak_kv_vnode,
                                   riak_kv_legacy_vnode}
    ** Reason for termination == 
    ** {{badmatch,{error,{{badmatch,{error,emfile}},
                          [{bitcask,scan_key_files,3},
                           {bitcask,open,2},
                           {riak_kv_bitcask_backend,start,2},
                           {riak_kv_vnode,init,1},
                           {riak_core_vnode,init,1},
                           {gen_fsm,init_it,6},
                           {proc_lib,init_p_do_apply,3}]}}},
        [{riak_core_vnode_master,get_vnode,2},
         {riak_core_vnode_master,handle_cast,2},
         {gen_server,handle_msg,5},
         {proc_lib,init_p_do_apply,3}]}
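
The emfile in that trace is the POSIX "too many open files" error, so
one thing that may be worth checking is how the number of files under
the bitcask directory compares with the open-file limit. The following
is only a sketch under some assumptions: the path /var/lib/riak/bitcask
is a guess at where the data lives, the helper names are made up, and
the limit it reports is that of the Python process running it, which is
not necessarily the limit the Riak process was started with.

```python
import os
import resource


def count_files(root):
    """Count regular directory entries (files) under root, recursively."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def fd_headroom(data_dir):
    """Compare the file count under data_dir with the soft fd limit.

    Note: this reads the limit of the *current* process; the limit
    applied to the Riak node may differ.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    files = count_files(data_dir)
    unlimited = soft == resource.RLIM_INFINITY
    return {
        "soft_limit": soft,
        "files": files,
        "exceeds_limit": (not unlimited) and files > soft,
    }


if __name__ == "__main__":
    # Hypothetical location of the bitcask data; adjust as needed.
    data_dir = "/var/lib/riak/bitcask"
    if os.path.isdir(data_dir):
        print(fd_headroom(data_dir))
    else:
        print("no such directory:", data_dir)
```

A file count near or above the soft limit would not prove anything on
its own, since not every file need be open at once, but it would be
consistent with the emfile failure in bitcask:scan_key_files above.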


The logs can be found here:

    http://stuff.roughage.com.au/riak-failure-3.log.tar.gz

rgh


> Grant Schofield
> Developer Advocate
> Basho Technologies
> 
> On Sep 16, 2010, at 4:53 PM, Richard Heycock wrote:
> 
> > Over the last few weeks I've been finding it harder and harder to start
> > riak which given that it's running on an auto-provisioned ec2 instance is
> > a bit of an issue! I can generally restart it by running
> > /etc/init.d/riak restart but it's got to the stage where I have to run
> > it four or five times. I should clarify here that when I say "harder to
> > start" it does start but as soon as I try to do anything it fails.
> > 
> > The contents of /var/log/riak are here:
> > 
> >    http://stuff.roughage.com.au/riak-failure-2.log.tar.gz
> > 
> > rgh
> > -- 
> > Richard Heycock
> > 
> > http://topikality.com
> > 
> > +61 (0) 410 646 369
> > [e]:  rgh at topikality.com
> > [im]: rgh at topikality.com
> 
-- 
Richard Heycock

http://topikality.com

+61 (0) 410 646 369
[e]:  rgh at topikality.com
[im]: rgh at topikality.com



