Spikes in node_(get|put)_fsm_time_100

Anthony Molinaro anthonym at alumni.caltech.edu
Wed Feb 9 15:29:09 EST 2011


Is there any sort of statistic or log of when this compaction occurs?

-Anthony

On Wed, Feb 09, 2011 at 12:22:53PM -0500, Alexander Sicular wrote:
> not quite sure about the overall problem, but bitcask does do compaction if your "casks" reach a certain threshold of dead bytes. this is all configurable and would only be triggered if your updates or deletes pushed your dead bytes over that threshold.
> 
> -alexander
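
A minimal sketch of the dead-bytes threshold check described above. It is illustrative only: `CaskFile`, `needs_compaction`, and `dead_bytes_threshold` are hypothetical names, and the byte counts are made up. It is not bitcask's actual code or configuration.

```python
from dataclasses import dataclass


@dataclass
class CaskFile:
    """Hypothetical per-file bookkeeping; not bitcask's real data structure."""
    name: str
    total_bytes: int
    dead_bytes: int  # bytes belonging to overwritten or deleted entries


def needs_compaction(files, dead_bytes_threshold):
    """Return the files whose dead bytes exceed the configured threshold."""
    return [f for f in files if f.dead_bytes > dead_bytes_threshold]


# Made-up example values: only the second file has crossed the threshold.
files = [
    CaskFile("1.data", total_bytes=1000, dead_bytes=100),
    CaskFile("2.data", total_bytes=1000, dead_bytes=700),
]
candidates = needs_compaction(files, dead_bytes_threshold=512)
print([f.name for f in candidates])
```

Under this model, a workload that only appends new keys never accumulates dead bytes, so only updates and deletes can push a file over the threshold.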
> 
> On Feb 9, 2011, at 12:12 PM, Anthony Molinaro wrote:
> 
> > Hi,
> > 
> >  Any thoughts on this?  I added a timeout to my client so it's not impacted
> > (other than missing some data, but that's okay).  However, I still see large
> > spikes in the node_(get|put)_fsm_time_100 stats (normal operation seems to be
> > about 1200, and I see spikes up to 200000).
> > 
> >  One thing I thought of is upping the number of async threads.  I did
> > increase the number of partitions to 1024 and with only 4 nodes in the
> > ring I could be hitting some sort of locking at the bitcask layer.
> > 
> >  Are there any maintenance tasks that happen with bitcask that could
> > cause lag?  For instance, in our frequency server, which uses riak_core
> > with a linked-in driver for a backend, we have to grow the file every
> > so often, which leads to these sorts of spikes, so maybe bitcask has
> > something similar?
> > 
> > Thanks,
> > 
> > -Anthony
> > 
> > On Tue, Feb 08, 2011 at 12:09:34PM -0800, Anthony Molinaro wrote:
> >> Hi,
> >> 
> >>  I have a 4 node cluster using riak_kv_multi_backend with one backend
> >> configured to use riak_kv_bitcask_backend.  I'm using the multi backend
> >> because eventually I want to also run a cache backend.  I'm sampling
> >> the statistics once per minute and viewing them in rrd and noticed
> >> something odd.  The node_(get|put)_fsm_time_100 sometimes spikes to
> >> 60 seconds while 99.99% of the time it's less than 2 milliseconds.
> >> 
> >> I'm going to work around this by lowering the timeouts in riak-erlang-client
> >> but this seems like it could continue to be a problem if the get/put
> >> fsms continue to run even if the client times out.
> >> 
> >> Anyway, just curious if others have experienced this sort of long tail
> >> spikiness.
> >> 
> >> -Anthony
> >> 
> >> -- 
> >> ------------------------------------------------------------------------
> >> Anthony Molinaro                           <anthonym at alumni.caltech.edu>
> >> 
> >> _______________________________________________
> >> riak-users mailing list
> >> riak-users at lists.basho.com
> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > 
> 
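
For anyone sampling these stats once per minute as described in the thread, here is a rough sketch of flagging spikes against a median baseline. The function name `find_spikes` and the `factor` cutoff are assumptions for illustration. Only the values 1200 and 200000 come from the messages above; the other samples are invented.

```python
from statistics import median


def find_spikes(samples, factor=10.0):
    """Return (index, value) pairs for samples above factor * median.

    The median is used as the baseline so that a few large spikes
    do not drag the baseline upward the way a mean would.
    """
    baseline = median(samples)
    return [(i, v) for i, v in enumerate(samples) if v > factor * baseline]


# One reading per minute of node_get_fsm_time_100 (illustrative values).
samples = [1200, 1150, 1300, 200000, 1250, 1180]
spikes = find_spikes(samples)
print(spikes)
```

Recording the timestamp of each flagged sample would make it possible to line the spikes up against whatever the nodes were doing at that moment.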

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym at alumni.caltech.edu>



