Error folding keys - incomplete_hint

Toby Corkindale toby at dryft.net
Wed Mar 29 19:49:50 EDT 2017


I thought I'd follow up on this again, a long time later.
We gave up on the Dockerised version of Riak.
But I notice we're getting an awful lot of these incomplete_hint errors on
the regular, non-Docker cluster now.

We had a sudden power failure in that server room recently, so there would
have been unclean Riak shutdowns. I guessed those were the cause of the
issues with the Docker version, years ago, so I'm wondering if the same
thing happened here?

Should Riak be better at recovering from rough shutdowns? Or is this
another issue altogether?

-Toby

On Fri, 23 Oct 2015 at 11:10 Toby Corkindale <toby at dryft.net> wrote:

> Quick follow-up: As a bit of a hack, deleting all the .hint files prior to
> each start-up does resolve the errors, and immediately results in a whole
> lot of Bitcask merges happening.
> But that doesn't strike me as a good long-term fix.
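
For reference, the workaround quoted above could be scripted roughly as follows. This is only a sketch: remove_hint_files and DRY_RUN are names made up for illustration, and the '*.hint' pattern is taken from the description above rather than from any Riak documentation. It should only be run while Riak is stopped.

```shell
#!/bin/sh
# Sketch of the ".hint" clean-up workaround described above.
# remove_hint_files and DRY_RUN are illustrative names, not part of Riak.

remove_hint_files() {
    data_dir="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only list the hint files that would be removed.
        find "$data_dir" -type f -name '*.hint' -print
    else
        # Print each hint file, then delete it; .data files are left alone.
        find "$data_dir" -type f -name '*.hint' -print -delete
    fi
}
```

Something like `DRY_RUN=1 remove_hint_files /var/lib/riak` shows what would go before anything is deleted. As noted above, expect a burst of Bitcask merges after the next start-up.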
>
> On Fri, 23 Oct 2015 at 10:52 Toby Corkindale <toby at dryft.net> wrote:
>
> Hi Hector,
> You can see the Dockerfile here:
> https://gist.github.com/TJC/cb3184705bc0eacde885
>
> It's a work in progress, but also, not that involved.
>
> Ubuntu 14.04 is used for both the Docker host and the Docker container.
> It's on the btrfs storage driver. (I've had too many issues with the other
> two.)
> The Riak data directory is a volume, and is mounted to an external,
> persistent location. (Which is also btrfs)
>
> I suspect there's an issue around Riak shutting down uncleanly when the
> Docker container is stopped.
> I have already had to add this to the start-up each time:
> find /var/lib/riak -name "bitcask.*.lock" -delete
>
> So it's clear that Riak is getting killed rather than shutting down
> cleanly; but even so, I'd hope that Riak would cope with that, rather than
> getting into a permanent state of throwing errors.
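
One way to give Riak a chance to stop cleanly is to have the container's entrypoint catch the SIGTERM that `docker stop` sends and run `riak stop` before exiting. A rough sketch follows; it is not taken from the Dockerfile above. RIAK_BIN, stop_riak and run_riak are illustrative names, and the ping loop is just one assumed way of keeping the script alive while the node is up.

```shell
#!/bin/sh
# Hypothetical container entrypoint sketch (not from the linked Dockerfile).
# RIAK_BIN, stop_riak and run_riak are illustrative names.
RIAK_BIN="${RIAK_BIN:-riak}"

stop_riak() {
    # Ask Riak to shut down cleanly so Bitcask can close its files.
    "$RIAK_BIN" stop
}

run_riak() {
    # Catch the stop signal and shut Riak down before the script exits.
    trap 'stop_riak; exit 0' TERM INT
    "$RIAK_BIN" start || return 1
    # Stay in the foreground for as long as the node answers ping.
    while "$RIAK_BIN" ping >/dev/null 2>&1; do
        # Sleep in the background so a trapped signal is handled promptly.
        sleep 5 &
        wait $!
    done
}
```

The entrypoint would end with a call to `run_riak`. Docker escalates to SIGKILL once its stop timeout expires, so a larger `docker stop -t` value may be needed if Riak takes a while to shut down.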
>
> Toby
>
>
> On Fri, 23 Oct 2015 at 00:01 Hector Castro <hectcastro at gmail.com> wrote:
>
> Can't say I've paid enough attention to the logs in my single-machine
> Riak within Docker setups to confirm.
>
> Do you have the container image definitions somewhere public? That may
> help someone reproduce the issue. Also, did you ensure that the Riak
> data directory is set up as a Docker volume?
>
> Other things that come to mind:
>
> - What OS is the Docker host running?
> - What storage driver are you using for Docker?
> - What file system is the Docker data directory using?
>
> --
> Hector
>
>
> On Thu, Oct 22, 2015 at 2:27 AM, Toby Corkindale <toby at dryft.net> wrote:
> > Anyone?
> >
> > I note that after 24 hours (on a very lightly loaded test cluster) I'm still
> > seeing these scroll by a lot - 600 an hour per node.
> > Really curious to know if this is expected behaviour or if this is resulting
> > from some kind of node corruption.
> >
> > Cheers
> > Toby
> >
> >
> >
> > On Wed, 21 Oct 2015 at 12:23 Toby Corkindale <toby at dryft.net> wrote:
> >>
> >> Hi,
> >> I've been working on getting Riak to run inside Docker containers - in a
> >> multi-machine cluster. (Previous work I've seen has only run Riak as a
> >> cluster all on the same machine.)
> >> I thought I had it cracked, although I tripped up on the existing issue
> >> with Riak and lockfiles[1]. But the nodes have been generating an awful lot
> >> of errors like the below, and I wondered if anyone here can give me an
> >> explanation? (And, is it a problem?)
> >>
> >> 2015-10-21 01:19:23.567 [error] <0.24495.0> Error folding keys for "/var/lib/riak/bitcask.1h/228359630832953580969325755111919221821239459840/2.bitcask.data": {incomplete_hint,4}
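
To put a number on how often this error shows up, something like the following could tally the entries per hour. It is a sketch only: count_incomplete_hint is a made-up helper name, and it assumes each entry sits on a single line in the log file itself, with the date and time as the first two fields as in the excerpt above.

```shell
#!/bin/sh
# count_incomplete_hint is an illustrative helper, not a Riak tool.
# Assumes lines start with "YYYY-MM-DD HH:MM:SS.mmm" as in the excerpt above.

count_incomplete_hint() {
    log_file="$1"
    awk '/incomplete_hint/ {
             # Bucket by date plus the hour part of the timestamp.
             hour = $1 " " substr($2, 1, 2) ":00"
             count[hour]++
         }
         END { for (h in count) print h, count[h] }' "$log_file" | sort
}
```

It would be called with the path to the node's error log, for example `count_incomplete_hint /var/log/riak/error.log` (that path is an assumption and depends on how Riak was installed).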
> >>
> >> [1] Issues related to the lockfiles:
> >> I note that many are closed, but the problem still exists, and is
> >> particularly triggered by using Docker and stopping/killing Riak more
> >> violently than it likes.
> >> https://github.com/basho/bitcask/issues/163 (closed)
> >> https://github.com/basho/riak/issues/535 (open)
> >> https://github.com/basho/bitcask/issues/167 (closed)
> >> https://github.com/basho/bitcask/issues/99 (closed)