Adding nodes to cluster

Sargun Dhillon sargun at sargun.me
Sat Jan 24 16:09:49 EST 2015


Several things:
1) If you have data at rest that doesn't change, make sure you have
AAE (active anti-entropy) enabled, and that it has run before you
manipulate your cluster. Given that you're running at 85% space, I
would be a little worried about turning it on, because you might run
out of disk space. You can pretty reasonably put the AAE trees on
magnetic storage, though. AAE is nice in the sense that you _know_
your cluster is consistent at a point in time.

2) Make sure you're getting SSDs of roughly the same quality. I've
seen enterprise SSDs get higher and higher latency as time goes on,
due to greater data protection features. We don't need any of that.
Basho_bench is your friend if you have the time.

3) Add all the new nodes in one go. That way there is a single plan,
and the handoffs are set up more cleanly, all at once.

4) Do not add the new nodes to the load balancer until handoff is
done. At least in what I've observed, latency increases only slightly
on the original cluster during handoff, but the target nodes have
pretty awful latency.

5) Start with a handoff limit of 1 (riak-admin transfer-limit 1). If
things look good, you can easily raise it. We're not optimizing for
the total time to hand off; we really should be optimizing for
individual vnode handoff time.

6) If you're using LevelDB, upgrade to the most recent version of Riak
1.4. There have been some improvements; 1.4.9 made me happier. I think
it's reasonable for the new nodes to start on 1.4.12, and the old
nodes to be switched over later.

7) Watch your network utilization and keep your disk latency flat.
Stop the handoff if latency spikes. Start by enabling handoff on the
one node with the lowest usage and see if it works.
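
As a rough illustration of points 5 and 7, here is a minimal Python
sketch of the "start at 1, raise if things look good, stop on a spike"
rule. Everything in it is an assumption made for the example: the
function name, the spike_factor and max_limit defaults, and the use of
the median of recent disk latency samples. Returning 0 simply means
"stop handoff" within this sketch; how the number is applied to the
cluster is left out.

```python
from statistics import median


def next_handoff_limit(current_limit, latencies_ms, baseline_ms,
                       spike_factor=2.0, max_limit=4):
    """Suggest the next handoff limit from recent disk latency samples.

    Hypothetical helper: thresholds and names are illustrative only.
    """
    if not latencies_ms:
        # No samples yet: keep the current setting rather than guess.
        return current_limit
    if median(latencies_ms) > spike_factor * baseline_ms:
        # Latency is no longer flat: stop handoff.
        return 0
    if current_limit < 1:
        # Resuming after a stop: go back to the cautious starting point.
        return 1
    # Things look good: raise by one step, up to the chosen ceiling.
    return min(current_limit + 1, max_limit)


if __name__ == "__main__":
    print(next_handoff_limit(1, [2.0, 2.1, 1.9], baseline_ms=2.0))
    print(next_handoff_limit(2, [9.0, 10.0, 8.5], baseline_ms=2.0))
```

Raising by one step at a time keeps the focus on individual vnode
handoff time rather than on finishing the whole migration quickly.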


These are the things I can think of immediately.
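
Put together as a sequence, points 3 to 5 might look like the sketch
below. It only builds a list of (where to run it, command) pairs and
does not execute anything. The node names are made up for the example,
and the function name is hypothetical; the riak-admin subcommands used
are cluster join, cluster plan, cluster commit, transfer-limit and
transfers.

```python
def expansion_steps(existing_node, new_nodes, handoff_limit=1):
    """Return (where, command) pairs for joining all new nodes in one plan.

    Illustrative only: nothing is executed, and node names are supplied
    by the caller.
    """
    # Set a conservative handoff limit before any data starts moving.
    steps = [(existing_node, f"riak-admin transfer-limit {handoff_limit}")]
    # Stage every join first so a single plan covers all the new nodes.
    for node in new_nodes:
        steps.append((node, f"riak-admin cluster join {existing_node}"))
    # Review the staged changes once, then commit them once.
    steps.append((existing_node, "riak-admin cluster plan"))
    steps.append((existing_node, "riak-admin cluster commit"))
    # Check on handoff progress.
    steps.append((existing_node, "riak-admin transfers"))
    return steps


if __name__ == "__main__":
    # Hypothetical node names.
    for where, command in expansion_steps(
            "riak@10.0.0.1", ["riak@10.0.0.7", "riak@10.0.0.8"]):
        print(f"{where}: {command}")
```

The load balancer change from point 4 is deliberately not in the list;
it would come only after handoff has finished.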

On Sat, Jan 24, 2015 at 12:42 PM, Alexander Sicular <siculars at gmail.com> wrote:
> I would probably add them all in one go so you have one vnode migration plan that gets executed. What is your ring size? How much data are we talking about? It's not necessarily the number of keys but rather the total amount of data and how quickly that data can move en masse between machines.
>
> -Alexander
>
>
> @siculars
> http://siculars.posthaven.com
>
> Sent from my iRotaryPhone
>
>> On Jan 24, 2015, at 15:37, Ed <edgarmveiga at gmail.com> wrote:
>>
>> Hi everyone!
>>
>> I have a riak cluster, working in production for about one year, with the following characteristics:
>> - Version 1.4.8
>> - 6 nodes
>> - leveldb backend
>> - replication (n) = 3
>> ~ 3 billion keys
>>
>> My ssd's are reaching 85% of capacity and we have decided to buy 6 more nodes to expand the cluster.
>>
>> Have you got any kind of advice on executing this operation or should I just follow the documentation on adding new nodes to a cluster?
>>
>> Best regards!
>> Edgar
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
