Crashed node has Bitcask merge errors on restart

Jeff Pollard jeff.pollard at gmail.com
Fri Aug 5 11:00:16 EDT 2011


Hey David,

Thanks again for all your help, much appreciated.  We've since restored from
our backup and the node appears healthy now.  I checked the logs and saw
successful bitcask merges and read repairs were happening, so all appears
well.
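
For anyone following along, here is a minimal sketch of the restore step
discussed in the quoted thread below, assuming the node is already stopped.
The function name and both paths are hypothetical, and moving the whole live
directory aside is just one reading of "move aside the bitcask files"; it
keeps everything written since the backup instead of deleting it.

```python
import shutil
import time
from pathlib import Path


def restore_bitcask(data_dir: Path, backup_dir: Path) -> Path:
    """Move the live bitcask directory aside, then copy the backup into place.

    Only run this while the node is stopped. Returns the path the old
    (post-backup) data was moved to, so it can be inspected or removed later.
    """
    # Hypothetical naming scheme for the moved-aside copy.
    aside = data_dir.with_name(f"{data_dir.name}.aside-{int(time.time())}")
    # Keep the existing files rather than deleting them.
    shutil.move(str(data_dir), str(aside))
    # Recursively copy the backup to the original location.
    shutil.copytree(backup_dir, data_dir)
    return aside
```

The ring directory is left alone here, per David's note that only the
bitcask dirs need restoring.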

I also upped the ulimit -n value based on your observation.
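
A small sketch for checking the open-file limit from Python, in case it is
useful. It reads the same limit that ulimit -n reports, but only for the
Python process itself (and whatever it spawns), not for an already running
node. The helper name and the `needed` threshold are hypothetical; pick a
value that suits your own setup.

```python
import resource


def has_open_file_headroom(needed: int) -> bool:
    """Return True if the soft open-file limit is at least `needed`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", which always satisfies the check.
    return soft == resource.RLIM_INFINITY or soft >= needed


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft limit: {soft}, hard limit: {hard}")
```

Running this from the same shell (and as the same user) that starts the node
gives a quick sanity check that the raised limit actually took effect there.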

On Fri, Aug 5, 2011 at 7:07 AM, David Smith <dizzyd at basho.com> wrote:

> On Fri, Aug 5, 2011 at 6:49 AM, Jeff Pollard <jeff.pollard at gmail.com> wrote:
> > Update: now the node has crashed, due to the following lines in the
> > sasl-error.log (see below).  I've also attached the crash dump to this
> > email.
> > Quickly though, just to confirm: if we wanted to restore the node from
> > a recent backup, the procedure is as simple as:
> >
> > 1. Stop the node.
> > 2. Restore the bitcask and ring directories from a recent backup (~12 hours
> >    old) to the node.
> > 3. Start the node.
> >
> > Is that correct?  Any gotchas or anything else I should know about that
> > process?
>
> You really only need to restore the bitcask dirs, not the ring; also make
> sure you move aside the bitcask files created in the interim.
>
> The {error, emfile} problem shows that you're hitting the ulimit -n
> (open file limit) for your process. Make sure your ulimit is set
> appropriately. It can also be helpful to remove any 0-length
> *.bitcask.data files.
>
> D.
>
> --
> Dave Smith
> Director, Engineering
> Basho Technologies, Inc.
> dizzyd at basho.com
>
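
As a follow-up to the tip above about 0-length *.bitcask.data files, here is
a minimal sketch that lists them under a bitcask directory. The function name
is hypothetical and the directory you pass in depends on your install. It
deliberately only reports the files and does not delete anything.

```python
from pathlib import Path
from typing import List


def find_empty_data_files(bitcask_root: Path) -> List[Path]:
    """Return all zero-length *.bitcask.data files under bitcask_root."""
    return sorted(
        p
        for p in bitcask_root.rglob("*.bitcask.data")
        if p.is_file() and p.stat().st_size == 0
    )
```

Once the list looks right, and with the node stopped, each entry can be
moved aside or removed (for example with `Path.unlink()`).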

