Cluster rebalancing

John Daily jdaily at basho.com
Mon Jul 7 08:53:21 EDT 2014


Weighted claim is indeed on the wishlist, but I don't believe it has yet
been assigned to any particular planned release.

-John


On Sun, Jul 6, 2014 at 9:09 AM, Thomas Santero <tsantero at gmail.com> wrote:

> Hi Chaim,
>
> Inline
>
> On Jul 6, 2014, at 1:13 AM, Chaim Solomon <chaim at itcentralstation.com>
> wrote:
>
> I don't think I was quite clear in what I asked for.
>
>
> My apologies for misunderstanding your previous query.
>
>
> I am not asking for the ability to influence the hashing algorithm. That
> would be a mess.
> But I would like to be able to have more influence on the distribution of
> vnodes on the nodes - and that is something that Riak already does.
>
> So a command to bump a vnode off a particular node or reduce the number of
> vnodes on a node or set the target percentage on a node would be nice. It
> seems like the current algorithm already does something similar - but I
> didn't see how one can influence that.
>
> The other issue was that I would suggest taking the disk space into
> consideration.
> If you have nodes that have different storage then balancing the data
> equally between nodes may not be the best option.
> It may be better to take the available disk space into consideration and
> move vnodes to nodes that have free space if a node runs low on space.
>
>
> What you refer to here would be nice, and is something referred to as
> "weighted claim." I know it's been discussed a bit in the past. Perhaps
> someone from Basho can chime in and let us know if it's on the roadmap for
> a future release?
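
As a rough illustration of what "weighted claim" could mean, here is a minimal Python sketch that divides a fixed number of partitions among nodes in proportion to their disk capacity, using the largest-remainder method. This is not how Riak claims partitions today; the function name, the node names, the capacities and the ring size of 64 are all assumptions made for the example.

```python
def weighted_claim(capacities, ring_size=64):
    """Split ring_size partitions among nodes in proportion to capacity.

    Hypothetical helper, not part of Riak. capacities maps a node name
    to any positive weight, e.g. disk size in GB.
    """
    total = sum(capacities.values())
    # Ideal (fractional) number of partitions per node.
    quotas = {node: ring_size * cap / total for node, cap in capacities.items()}
    # Start from the floor of each quota.
    alloc = {node: int(quota) for node, quota in quotas.items()}
    # Hand the leftover partitions to the nodes with the largest remainders.
    leftover = ring_size - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


if __name__ == "__main__":
    # Four older nodes plus one newer node with twice the storage (made-up sizes).
    plan = weighted_claim(
        {"old1": 500, "old2": 500, "old3": 500, "old4": 500, "new1": 1000}
    )
    for node, partitions in plan.items():
        print(node, partitions)
```

A real implementation would also have to respect replica placement (keeping the replicas of a key on distinct nodes), which this sketch ignores entirely.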
>
>
> One simple use case would be expanding a cluster with newer nodes (that
> have more storage) and being able to utilise that storage.
>
> Another would be to be able to distribute larger partitions more evenly -
> in particular if the size per partition is not evenly distributed.
>
> Chaim Solomon
>
>
>
> On Thu, Jul 3, 2014 at 8:51 PM, Tom Santero <tsantero at gmail.com> wrote:
>
>> responses inline
>>
>>
>> On Thu, Jul 3, 2014 at 2:45 AM, Chaim Solomon <chaim at itcentralstation.com> wrote:
>>
>>> Hi,
>>>
>>> I'm running a 2.0.0b cluster (small) and have been running out of space
>>> on one node.
>>> I had expected that adding a node would lead to freeing up of space on
>>> other nodes - but it's not working too fast.
>>>
>>
>> Keep in mind that the speed of transfers is bound by the bandwidth
>> available on the network as well as the speed at which you can actually
>> read the data off disk. Once the transfers complete you should see the disk
>> freed.
>>
>>
>>>
>>> I would suggest adding to Riak a way to have the distribution algorithm
>>> take free space into consideration and to move data to empty nodes fast.
>>> Another issue is that adding the node moved most nodes from 25% to 18.8% -
>>> but one stayed on 25% in the planner.
>>>
>>
>> The algorithm Riak uses to determine vnode placement is
>> non-deterministic; if you don't like any given staged vnode distribution I
>> might suggest you run riak-admin cluster clear to undo any staged changes
>> and attempt to add the node again, until you're content with the new plan.
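
The 25% and 18.8% figures in the planner output are consistent with simple partition arithmetic. The sketch below assumes the default ring size of 64 partitions (the thread does not state the actual ring size) and a hypothetical split in which one node keeps 16 partitions while the other four hold 12 each.

```python
RING_SIZE = 64  # assumed: Riak's default ring_creation_size


def share(partitions, ring_size=RING_SIZE):
    """Percentage of the ring owned by a node holding `partitions` vnodes."""
    return 100 * partitions / ring_size


# Four nodes before the join: an even 16 partitions each.
before = [16, 16, 16, 16]
# One possible plan after adding a fifth node (assumed split): 64 does not
# divide evenly by 5, so the plan cannot be perfectly balanced.
after = [16, 12, 12, 12, 12]

if __name__ == "__main__":
    assert sum(before) == sum(after) == RING_SIZE
    print("before:", [share(p) for p in before])
    print("after: ", [share(p) for p in after])
```

Under these assumptions 12/64 is 18.75%, which the planner would display rounded to one decimal place, and 16/64 is 25%. A perfectly even split would need 12.8 partitions per node, so some imbalance is unavoidable with five nodes on a 64-partition ring.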
>>
>>
>>>
>>> And I would also suggest adding some way to force a rebalancing of the
>>> cluster to force nodes to take up more load if they don't have enough or
>>> hand off load to others.
>>>
>>
>> The hashing algorithm used by Riak to determine object placement in the
>> ring is uniform--over time and with a greater number of total keys you'll
>> start to see a smoother distribution across all partitions.
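
The smoothing effect of a uniform hash can be seen in a small simulation. The Python sketch below hashes synthetic keys onto a 64-partition ring and reports how far the fullest and emptiest partitions are from each other, relative to the mean. SHA-1 over a "bucket/key" string is used as a stand-in here; the key encoding, the bucket name and the ring size are assumptions for the example, not Riak's exact implementation.

```python
import hashlib

RING_SIZE = 64        # assumed ring size
KEYSPACE = 2 ** 160   # size of the SHA-1 output space


def partition_for(bucket, key, ring_size=RING_SIZE):
    """Map a bucket/key pair to a partition index in [0, ring_size)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    # Scale the 160-bit hash down to one of ring_size equal slices.
    return int.from_bytes(digest, "big") * ring_size // KEYSPACE


def relative_spread(num_keys, ring_size=RING_SIZE):
    """(max - min) keys per partition, divided by the mean per partition."""
    counts = [0] * ring_size
    for i in range(num_keys):
        counts[partition_for("demo", f"key-{i}", ring_size)] += 1
    mean = num_keys / ring_size
    return (max(counts) - min(counts)) / mean


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, round(relative_spread(n), 3))
```

The relative spread should shrink as the key count grows. Note that this only evens out the number of keys per partition; it says nothing about object sizes, so partitions can still differ in bytes on disk.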
>>
>> On the fly rebalancing would be incredibly expensive, especially for
>> users who have lots of nodes and petabytes of data stored in Riak. Ad-hoc
>> partition handoff would most likely be brittle and error-prone, given the
>> unreliability of the network.
>>
>> In my humble opinion the engineers at Basho work harder than most other
>> distributed systems developers, considering all the edge cases where
>> systems can fail unexpectedly; I say this not to boost their egos, but
>> rather to point out that their approach has the effect of making Riak more
>> robust and resilient than most other distributed datastores. But such
>> resiliency isn't free, and for these guarantees every user must pay the
>> price. Riak might not be the fastest database, and it may even underutilize
>> that really expensive hardware you might throw at it... but I'll be damned
>> if it lies to me, loses my data or pretends that failures like network
>> partitions don't happen.
>>
>>
>>>
>>> Chaim Solomon
>>>
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>