Backing up riak
mark at basho.com
Wed Apr 25 13:31:13 EDT 2012
On Tue, Apr 24, 2012 at 9:31 AM, Swinney, Austin <Austin at vimeo.com> wrote:
> I'm using leveldb on a 5 node ec2 cluster. I've read about the basic tar
> zipping of leveldb and ring files per node. Also, I read about riak-admin
> backup all.
> I'm curious about a couple things.
> 1) is the `riak-admin backup all` command non-blocking?
Assuming you mean, "Will the cluster continue to work when I use this
command?", then "Yes". :)
> 2) is it a consistent backup or is consistency old fashioned thinking?
The backup will be complete up to the point at which it was taken. You'll
get a dump of all the keys in the order in which they were listed. Updates
that happened during the backup may or may not be captured. (I would have
to verify exactly how you would know which made it and which didn't.)
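For anyone scripting this, here is a minimal sketch of wrapping the backup command. It assumes the Riak 1.x usage string `riak-admin backup <node> <cookie> <filename> [node|all]`; the node name, cookie, and path in the usage note are placeholders, and the helper names are hypothetical, so check them against `riak-admin` on your own install.

```python
import subprocess


def backup_command(node, cookie, filename, scope="all"):
    """Build the argv for `riak-admin backup`.

    Assumed usage: riak-admin backup <node> <cookie> <filename> [node|all]
    """
    if scope not in ("node", "all"):
        raise ValueError("scope must be 'node' or 'all'")
    return ["riak-admin", "backup", node, cookie, filename, scope]


def run_backup(node, cookie, filename, scope="all", runner=subprocess.check_call):
    """Run the backup. `runner` is injectable so the call can be dry-run."""
    cmd = backup_command(node, cookie, filename, scope)
    runner(cmd)
    return cmd
```

Something like `run_backup("riak@127.0.0.1", "riak", "/backups/riak.dat")` would then dump the whole cluster to one file, while passing `runner=print` only shows the command that would be executed.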
> 3) if you use the tar'ing up of leveldb + ring files per node, you lose
> one node, then you restore it from this tar file that is hours or days old,
> how does riak deal with bringing its data up to date?
After you restore the node, it will gradually sync its replicas with
those on the other nodes via read repair. That said, doing a complete
restore of the node would probably not be needed. When the node
disappears, Riak will compensate for it by sending its writes/reads to
fallback nodes. When it comes back online, hinted handoff and read
repair will make sure it gets all the replicas it was supposed to have
and that those replicas are up to date. (You will have to force read
repair on the replicas on that node, which can be done via a list-keys
operation or with an existing snippet of code for doing this, but be warned
that it'll put some load on that node. We're working on making the
read repair process less reactive in future releases, but this is the
best way to do it right now.) To be clear, I'm in no way advocating
not backing up your data. You just might not need to use the backups in this
case.
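As a rough illustration of forcing read repair through a key listing, the sketch below lists every key in a bucket over Riak's HTTP interface and then reads each one, since a read is what triggers the repair. It is not the snippet referred to above. The `/buckets/<bucket>/keys` URLs are the standard HTTP API paths; the function names, base URL, and bucket name are placeholders of mine.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def http_get(url):
    """Fetch a URL and return the response body as bytes."""
    with urlopen(url) as resp:
        return resp.read()


def force_read_repair(base_url, bucket, fetch=http_get):
    """List every key in `bucket`, then GET each one to trigger read repair.

    Returns (number of keys listed, list of keys whose GET failed).
    """
    bucket_url = "%s/buckets/%s/keys" % (base_url, quote(bucket, safe=""))
    # Listing keys walks the whole bucket, so this alone puts load on the cluster.
    keys = json.loads(fetch(bucket_url + "?keys=true").decode("utf-8"))["keys"]
    failed = []
    for key in keys:
        try:
            # The body is discarded; the read itself is the point.
            fetch("%s/%s" % (bucket_url, quote(key, safe="")))
        except OSError:
            # urllib's HTTP and connection errors are OSError subclasses.
            failed.append(key)
    return len(keys), failed
```

A call such as `force_read_repair("http://127.0.0.1:8098", "mybucket")` assumes the HTTP listener is on port 8098 (the usual default) and would need repeating for each bucket. Expect it to be slow on a large data set.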
Another thing worth noting - the 'riak-admin backup' command is not
known to be the speediest. If you have any non-trivial amount of data
that needs backing up, you're probably best off taking a filesystem snapshot of
the LevelDB data on each node. Unfortunately, doing a live snapshot of LevelDB is
less than bulletproof at the moment, so you're advised to stop the
node, snapshot LevelDB, and restart. You'll have to take the node
offline for this, but with five Riak nodes, your cluster should Just
Work.
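The stop, snapshot, restart sequence could be scripted roughly as follows, run locally on one node at a time. This is only a sketch: `riak stop` and `riak start` are the standard control commands, but the function names are mine, and the data directories in the usage note are typical package-install locations that should be checked against your own configuration.

```python
import subprocess


def snapshot_steps(data_dirs, archive_path):
    """Return the commands for an offline snapshot of one node, in order."""
    return [
        ["riak", "stop"],
        ["tar", "-czf", archive_path] + list(data_dirs),
        ["riak", "start"],
    ]


def snapshot_node(data_dirs, archive_path, runner=subprocess.check_call):
    """Stop the local node, archive its data directories, and restart it."""
    stop, archive, start = snapshot_steps(data_dirs, archive_path)
    runner(stop)
    try:
        runner(archive)
    finally:
        # Bring the node back even if the archive step fails.
        runner(start)
```

For example, `snapshot_node(["/var/lib/riak/leveldb", "/var/lib/riak/ring"], "/backups/riak-node1.tar.gz")` would archive both the LevelDB data and the ring files, assuming those are the paths your install uses.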
Hope that helps.
Fair warning: I'm not sure when the code snippet mentioned above was last tested.