Riak Enterprise: can it be used to migrate to a new configuration?
Rune Skou Larsen
rsl at trifork.com
Sat Oct 20 03:55:12 EDT 2012
Yet another good reason to keep the ring size small is the IO cost of 2i lookups, which is roughly proportional to the number of partitions. This is because a fixed fraction of all partitions is queried when doing a 2i lookup.
Yes, you are correct, Evan. We started out with a ring size of 256 on both clusters. After wiping and reconfiguring the first cluster to 128 partitions, we used our own Trifork in-house replication to load data.
Then we wiped and reconfigured the second cluster to 128 and used Basho's Riak Enterprise full sync to replicate all data from the first cluster to the second. All on a live, critical system.
I believe Basho has been working on how to grow/shrink ring size without wiping data. Perhaps Basho can shed some light on the status of this.
Until Riak can grow/shrink ring size or Riak Enterprise supports replication between clusters of different ring sizes, you need another mechanism for moving data when doing ring size reconfiguration. Trifork can help with this.
Evan Vigil-McClanahan <emcclanahan at basho.com> wrote:
64 is fine for a 6 node cluster. Rune gives a great rundown of the
downsides of large rings on small numbers of machines in his post.
Usually our recommendation is for ~10 ring partitions per physical
machine, rounded up to the next power of two. Where did you see the
recommendation for 512 from us?
Basho's replication won't work in the situation that you've described.
Are you talking about an in-house replication product? Our full-sync
doesn't work between clusters of different ring sizes.
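
The sizing rule of thumb above (about 10 ring partitions per physical machine, rounded up to the next power of two) can be sketched in a few lines of Python. This is only an illustration, not a Basho tool: the function names are hypothetical, and the per-node backend estimate assumes that partitions are spread evenly across nodes and that every vnode runs one process per configured backend.

```python
import math


def recommended_ring_size(nodes, partitions_per_node=10):
    """Rule of thumb from this thread: ~10 partitions per machine,
    rounded up to the next power of two. (Hypothetical helper.)"""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    target = nodes * partitions_per_node
    size = 1
    while size < target:
        size *= 2
    return size


def backends_per_node(ring_size, nodes, backends_per_vnode=1):
    """Estimate of backend processes in active use on one node.
    Assumes an even spread of vnodes and one backend process per
    configured backend per vnode. (Hypothetical helper.)"""
    vnodes_on_node = math.ceil(ring_size / nodes)
    return vnodes_on_node * backends_per_vnode


if __name__ == "__main__":
    for n in (5, 6, 10):
        print(n, "nodes ->", recommended_ring_size(n), "partitions")
    # The example from the quoted post: 512 partitions, 5 nodes, 4 backends.
    print(backends_per_node(512, 5, 4), "backends in use per node")
```

Under these assumptions the rule gives 64 partitions for both a 5-node and a 6-node cluster, and the 512-partition, 4-backend example works out to the 412 actively used backends per node mentioned in the quoted post.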
On Fri, Oct 19, 2012 at 4:50 AM, Rune Skou Larsen <rsl at trifork.com> wrote:
> Yes, we have done exactly that. When we migrated from 256 to 128 partitions
> in a live dual-cluster system, we took one cluster down. Wiped the data,
> changed number of partitions, brought it back up and synced all data back
> with a full sync. Then we did the same with the other cluster.
> However, I must disagree with the recommendation of 512 partitions for 5
> nodes. You should go for 128 or 256 unless you plan on scaling out to 10+
> nodes per cluster.
> There are downsides to having many partitions. The price of the higher
> granularity is that the larger number of storage backend processes uses more
> resources for housekeeping. If you use multibackend, the resources used are
> multiplied yet again by the number of backends, because each vnode will have
> a number of running backend processes.
> Say you go with the 512 partitions and have a multibackend config with 4
> backends, because you need to back up 4 different types of data
> independently. That gives you 2k running backends on each node, of which 412
> will be actively in use in a normal running scenario, and more when you're
> doing handoff. That's a lot of resources just to run these, which you might
> otherwise have used for doing business.
> When you increase the number of partitions you should consider:
> - Number of open files. Especially when using eleveldb.
> - Late triggering of bitcask compaction. The default is no compaction of any
> file before it hits 2GB. That means up to 2GB of dead space per vnode. This
> can, however, be configured down to a smaller number than the 2GB, which is
> crazy high in almost any use case involving delete, expiry or update of
> - The LevelDB cache is per vnode, so you need to lower the per-vnode cache
> size in order not to use all memory, which will lead to death by swapping.
> - With a high number of vnodes per node, each vnode's LevelDB cache will be
> comparatively small, leading to (slightly) less efficient cache usage.
> Please be in touch if you need onsite or offsite professional assistance
> configuring, testing or running your Riak clusters.
> BR Rune Skou Larsen
> - We do Riak PS.
> Best regards / Venlig hilsen
> Rune Skou Larsen
> Trifork Public A/S / Team Riak
> Margrethepladsen 4, 8000 Århus C, Denmark
> Phone: +45 3160 2497 Skype: runeskoularsen twitter: @RuneSkouLarsen
> On 19-10-2012 12:38, Dave Brady wrote:
> Can Riak Enterprise replicate between rings where each ring has a different
> number of partitions?
> Our five-node ring was originally configured with 64 partitions, and I saw
> that Basho is recommending 512 for that number of machines.
> Any ideas on how to make as-painless-a-migration-as-possible are welcome, of
> Dave Brady