Adding nodes to cluster

Edgar Veiga edgarmveiga at gmail.com
Fri Feb 6 01:53:11 EST 2015


Is it expected that the amount of data per node will drop considerably? I'm doubling the size of the cluster (adding 6 more nodes).

I ask because the current 6 machines have 1.5 TB disks, but the new ones (for now) have only 1 TB.
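
A rough way to sanity-check this is to spread the current data evenly over all 12 nodes. The sketch below is only back-of-the-envelope arithmetic, under assumptions that are not confirmed in this thread: the 85% disk usage reported earlier still holds, data ends up evenly distributed across nodes regardless of disk size, and no extra headroom is set aside for handoff or LevelDB compaction. The function and variable names are made up for illustration.

```python
def per_node_after_expansion(old_nodes, old_disk_tb, used_fraction, new_nodes):
    """Estimate TB stored per node once data is spread evenly over all nodes.

    Assumes an even distribution and ignores handoff/compaction overhead.
    """
    total_tb = old_nodes * old_disk_tb * used_fraction  # data held today
    return total_tb / (old_nodes + new_nodes)           # even share per node


# Figures from the thread: 6 nodes, 1.5 TB disks, ~85% full, 6 nodes added.
per_node_tb = per_node_after_expansion(6, 1.5, 0.85, 6)

# Compare the estimate against both disk sizes in the expanded cluster.
for disk_tb in (1.5, 1.0):
    fill = per_node_tb / disk_tb
    print(f"{disk_tb} TB disk: {per_node_tb:.4f} TB used, {fill:.1%} full")
```

Under those assumptions each node would hold about 0.64 TB, which is roughly 64% of a 1 TB disk and about 42.5% of a 1.5 TB disk. Real usage will differ, so treat this as an estimate only.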

Best regards


On Sat, Jan 24, 2015 at 9:49 PM, Edgar Veiga <edgarmveiga at gmail.com>
wrote:

> Yeah, after sending the email I realized both! :)
> Thanks! Have a nice weekend
> On 24 January 2015 at 21:46, Sargun Dhillon <sargun at sargun.me> wrote:
>> 1) Potentially re-enable AAE after migration. As your cluster gets
>> bigger, the likelihood of any node failing in the cluster goes up.
>> Replica divergence only becomes scarier in light of this. Losing data
>> != awesome.
>>
>> 6) There shouldn't be any problems, but for good measure you should
>> probably upgrade the old ones before the migration.
>>
>>
>>
>> On Sat, Jan 24, 2015 at 1:31 PM, Edgar Veiga <edgarmveiga at gmail.com>
>> wrote:
>> > Sargun,
>> >
>> > Regarding 1) - AAE is disabled. We had problems with it, and there are a
>> > lot of threads here on the mailing list about this. AAE wouldn't stop
>> > using more and more disk space, and the only solution was disabling it!
>> > Since then the cluster has been pretty stable...
>> >
>> > Regarding 6) Can you or anyone at Basho confirm that there won't be any
>> > problems using the latest (1.4.12) version of Riak on the new nodes and
>> > only upgrading the old ones after this process is completed?
>> >
>> > Thanks a lot for the other tips, you've been very helpful!
>> >
>> > Best regards,
>> > Edgar
>> >
>> > On 24 January 2015 at 21:09, Sargun Dhillon <sargun at sargun.me> wrote:
>> >>
>> >> Several things:
>> >> 1) If you have data at rest that doesn't change, make sure you have
>> >> AAE enabled, and that it has run before your cluster is manipulated. Given that
>> >> you're running at 85% space, I would be a little worried to turn it
>> >> on, because you might run out of disk space. You can also pretty
>> >> reasonably put the AAE trees on magnetic storage. AAE is nice in the
>> >> sense that you _know_ your cluster is consistent at a point in time.
>> >>
>> >> 2) Make sure you're getting SSDs of roughly the same quality. I've
>> >> seen enterprise SSDs get higher and higher latency as time goes on,
>> >> due to greater data protection features. We don't need any of that.
>> >> Basho_bench is your friend if you have the time.
>> >>
>> >> 3) Do it all in one go. This will enable handoffs more cleanly, and all
>> >> at once.
>> >>
>> >> 4) Do not add the new nodes to the load balancer until handoff is
>> >> done. At least experimentally, latency increases slightly on the
>> >> original cluster, but the target nodes have pretty awful latency.
>> >>
>> >> 5) Start with a handoff_limit of 1. You can easily raise this. If
>> >> things look good, you can increase it. We're not optimizing for the
>> >> total time to handoff, we really should be optimizing for individual
>> >> vnode handoff time.
>> >>
>> >> 6) If you're using Leveldb, upgrade to the most recent version of Riak
>> >> 1.4. There have been some improvements. 1.4.9 made me happier. I think
>> >> it's reasonable for the new nodes to start on 1.4.12, and the old
>> >> nodes to be switched over later.
>> >>
>> >> 7) Watch your network utilization. Keep your disk latency flat. Stop
>> >> it if it spikes. Start from enabling one node with the lowest usage
>> >> and see if it works.
>> >>
>> >>
>> >> These are the things I can think of immediately.
>> >>
>> >> On Sat, Jan 24, 2015 at 12:42 PM, Alexander Sicular <siculars at gmail.com>
>> >> wrote:
>> >> > I would probably add them all in one go so you have one vnode
>> >> > migration plan that gets executed. What is your ring size? How much
>> >> > data are we talking about? It's not necessarily the number of keys but
>> >> > rather the total amount of data and how quickly that data can move en
>> >> > masse between machines.
>> >> >
>> >> > -Alexander
>> >> >
>> >> >
>> >> > @siculars
>> >> > http://siculars.posthaven.com
>> >> >
>> >> > Sent from my iRotaryPhone
>> >> >
>> >> >> On Jan 24, 2015, at 15:37, Ed <edgarmveiga at gmail.com> wrote:
>> >> >>
>> >> >> Hi everyone!
>> >> >>
>> >> >> I have a riak cluster, working in production for about one year, with
>> >> >> the following characteristics:
>> >> >> - Version 1.4.8
>> >> >> - 6 nodes
>> >> >> - leveldb backend
>> >> >> - replication (n) = 3
>> >> >> - ~3 billion keys
>> >> >>
>> >> >> My SSDs are reaching 85% of capacity and we have decided to buy 6 more
>> >> >> nodes to expand the cluster.
>> >> >>
>> >> >> Have you got any kind of advice on executing this operation, or should
>> >> >> I just follow the documentation on adding new nodes to a cluster?
>> >> >>
>> >> >> Best regards!
>> >> >> Edgar
>> >> >>
>> >> >> _______________________________________________
>> >> >> riak-users mailing list
>> >> >> riak-users at lists.basho.com
>> >> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >> >
>> >
>> >
>>