Help with handling Riak disk failure

Bryan Hunt bryan.hunt at
Tue Sep 19 13:41:50 EDT 2017

(0) Three nodes are insufficient; you should have 5 nodes.
(1) You could iterate and read every object in the cluster - this would also trigger read repair for every object
(2) (Copied from Engel Sanchez's response to a similar question, April 10th 2014.)
* If AAE is disabled, you don't have to stop the node to delete the data in
the anti_entropy directories
* If AAE is enabled, deleting the AAE data in a rolling manner may trigger
an avalanche of read repairs between nodes with the bad trees and nodes
with good trees as the data seems to diverge.

If your nodes are already up, with AAE enabled and with old incorrect trees
in the mix, there is a better way.  You can dynamically disable AAE with
some console commands. At that point, without stopping the nodes, you can
delete all AAE data across the cluster.  At a convenient time, re-enable
AAE.  I say convenient because all trees will start to rebuild, and that
can be problematic in an overloaded cluster.  Doing this over the weekend
might be a good idea unless your cluster can take the extra load.

To dynamically disable AAE from the Riak console, you can run this command:

> riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, disable, [],

and enable with the similar:

> riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, enable, [],

That last number is just a timeout for the RPC operation.  I hope this
saves you some extra load on your clusters.
(3) That’s going to be:
(3a) List all keys using the client of your choice
(3b) Fetch each object
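
For illustration, steps (3a) and (3b) could be sketched in Python against Riak's HTTP interface roughly as below. This is a minimal sketch, not a tested recovery tool: the helper names (http_get, list_keys, read_every_object) are made up for this example, and the base URL and bucket name are assumptions you would replace with your own. Listing keys walks the whole keyspace, so expect it to be expensive on a 1 TB node.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def http_get(url):
    """GET a URL and return (status, body). Error statuses such as 404
    (not found) or 300 (siblings) are returned rather than raised."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def list_keys(base_url, bucket, get=http_get):
    """Step (3a): list every key in one bucket via the HTTP API."""
    url = "%s/buckets/%s/keys?keys=true" % (
        base_url, urllib.parse.quote(bucket, safe=""))
    status, body = get(url)
    if status != 200:
        raise RuntimeError("key listing failed with HTTP %d" % status)
    return json.loads(body)["keys"]


def read_every_object(base_url, bucket, get=http_get):
    """Step (3b): fetch each object once. The read itself is what gives
    Riak the chance to read-repair a missing or stale replica.
    Returns a count of responses per HTTP status code."""
    counts = {}
    for key in list_keys(base_url, bucket, get):
        url = "%s/buckets/%s/keys/%s" % (
            base_url,
            urllib.parse.quote(bucket, safe=""),
            urllib.parse.quote(key, safe=""))
        status, _body = get(url)  # the body is discarded; only the read matters
        counts[status] = counts.get(status, 0) + 1
    return counts
```

A call would look like read_every_object("http://127.0.0.1:8098", "my_bucket"), where the host, port and bucket name are placeholders, repeated for each bucket you have. The returned per-status counts are only a rough progress check; they do not prove that three replicas exist on disk.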


> On 19 Sep 2017, at 18:31, Leo <scicomplete at> wrote:
> Dear Riak users and experts,
> I really appreciate any help with my questions below.
> I have a 3 node Riak cluster with each having approx. 1 TB disk usage.
> All of a sudden, one node's hard disk failed unrecoverably. So, I
> added a new node using the following steps:
> 1) riak-admin cluster join 2) down the failed node 3) riak-admin
> force-replace failed-node new-node 4) riak-admin cluster plan 5)
> riak-admin cluster commit.
> This almost fixed the problem except that after lots of data transfers
> and handoffs, now not all three nodes have 1 TB disk usage. Only two
> of them have 1 TB disk usage. The other one is almost empty (few 10s
> of GBs). This means there are no longer 3 copies on disk. My
> data is completely random (no two keys have the same data associated with
> them, so compression of data cannot be the reason for less data on
> disk).
> I also tried using the "riak-admin cluster replace failednode newnode"
> command so that the leaving node hands off data to the joining node.
> This however is not helpful if the leaving node has a failed hard
> disk. I want the remaining live vnodes to help the new node recreate
> the lost data using their replica copies.
> I have three questions:
> 1) What commands should I run to forcefully make sure there are three
> replicas on disk overall without waiting for read-repair or
> anti-entropy to make three copies ? Bandwidth usage or CPU usage is
> not a huge concern for me.
> 2) Also, I will be very grateful if someone lists the commands that I
> can run using "riak attach" so that I can clear the AAE trees and
> forcefully make sure all data has 3 copies.
> 3) I will be very thankful if someone helps me with the commands that
> I should run to ensure that all data has 3 replicas on disk after the
> disk failure (instead of just looking at the disk space usage in all
> the nodes as hints)?
> Thanks,
> Leo
