Very (very) slow handoff, how to investigate?

Matthew Tovbin matthew at tovbin.com
Thu May 24 15:22:44 EDT 2012


Guys,

Thanks for the tips!! Helpful indeed.

-Matthew



On Tue, Jan 31, 2012 at 2:32 AM, Gal Barnea <gal at eyeviewdigital.com> wrote:

> Guys
> Thanks a lot for the helpful pointers
>
> I decided to focus on speeding up the process of joining servers to the
> cluster, where it is easier to monitor disk space during the handoff
> ("watch df -B M" and "dstat -dn -D") and deduce the actual handoff
> progress. (When partitions are big enough, there is very little indication
> of their handoff progress in Riak's logs.)
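A progress estimate along those lines can be sketched in shell; the data path and the MB figures below are placeholders for illustration, not values from this thread:

```shell
# Rough handoff progress on the joining node: compare the data
# directory's current size against its expected final size.
# /var/lib/riak/bitcask is an assumed data path; adjust for your backend.
# The numbers stand in for `du -sm /var/lib/riak/bitcask` output.
received_mb=35000    # current size of the data dir, in MB
expected_mb=120000   # expected size once handoff completes, in MB
awk -v r="$received_mb" -v e="$expected_mb" \
    'BEGIN { printf "handoff roughly %.0f%% complete\n", 100 * r / e }'
```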
>
> Increasing the handoff_concurrency did indeed help push the handoff rate
> much higher.
> Also, running on EC2, I was able to set up RAID0 on multiple ephemeral
> drives, which helped me reach rates of around 35-40 MB/s - practically the
> I/O limit set by Amazon.
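For reference, the RAID0 setup Gal describes might look like the following. The device names and mount point are assumptions about a typical instance layout, and the commands are printed rather than executed so they can be reviewed first:

```shell
# Sketch: stripe two EC2 ephemeral disks into RAID0 for higher I/O
# throughput. /dev/xvdb, /dev/xvdc and the mount point are assumed;
# check your instance's actual device mapping before running anything.
cat <<'EOF'
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
mkfs.ext4 /dev/md0
mount -o noatime /dev/md0 /var/lib/riak
EOF
```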
>
> Thanks guys!
>
>
> On Fri, Jan 27, 2012 at 9:26 PM, Joseph Blomstedt <joe at basho.com> wrote:
>
>> Gal,
>>
>> 0.5 to 1 MB/s is indeed painfully slow.
>>
>> A few questions:
>> What backend are you running: Bitcask, LevelDB, etc.?
>> Are you using the local ephemeral storage, or running off EBS?
>> Are you running any software RAID?
>> What filesystem are you running?
>> Which OS are you using?
>> Have you changed any of Riak's default settings?
>>
>> Also, any chance you could provide the output of "iostat -x" during
>> one of these long handoff sessions? Preferably on both the sending and
>> receiving nodes.
>>
>> The more information we have, the better we can try to help out here.
>>
>> Regards,
>> Joe
>>
>> On Fri, Jan 27, 2012 at 7:44 AM, Ian Plosker <ian at basho.com> wrote:
>> > Gal,
>> >
>> > You could try using `riak attach` and running the following to increase
>> > the handoff_concurrency from 1 to 4:
>> >
>> > application:set_env(riak_core, handoff_concurrency, 4).
>> >
>> > You will need to do this on all nodes. This will only remain in effect
>> > as long as the nodes remain running. If you wish to permanently increase
>> > the handoff concurrency, you will have to do so in the app.config.
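For the permanent change Ian mentions, the relevant app.config section might look like this; only the one setting is shown, and the surrounding entries are elided:

```erlang
%% app.config (fragment) -- everything but the one setting is elided.
%% A node restart is needed for the file-based change to take effect.
[
 {riak_core, [
   %% Outgoing handoffs per node; 1 is the 1.0.3 default.
   {handoff_concurrency, 4}
 ]}
].
```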
>> >
>> > --
>> > Ian Plosker <ian at basho.com>
>> > Developer Advocate
>> > Basho Technologies
>> >
>> > On Friday, January 27, 2012 at 7:45 AM, Gal Barnea wrote:
>> >
>> > Hi Ian
>> >
>> > Thanks for the informative answer, I am using 1.0.3 indeed.
>> >
>> > A day later, the cluster is making progress, but then I saw this in the
>> > console.log:
>> > 2012-01-27 08:51:32.643 [info]
>> > <0.30733.2881>@riak_core_handoff_sender:start_fold:87 Handoff of partition
>> > riak_kv_vnode 502391187832497878132516661246222288006726811648 from
>> > 'riak at ec2-107-21-156-59.compute-1.amazonaws.com' to
>> > 'riak at ec2-leaving.compute-1.amazonaws.com' completed: sent 5100479
>> > objects in 10596.49 seconds
>> >
>> > So we've dropped 50% in rate and are now at less than 500 records/second!
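As a sanity check, the rate implied by the quoted log line can be computed directly:

```shell
# Rate implied by the completed-handoff log line:
# 5100479 objects sent in 10596.49 seconds.
awk 'BEGIN { printf "%.0f objects/s\n", 5100479 / 10596.49 }'
```

which does come out under 500 objects per second.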
>> >
>> > Frankly, I think this is problematic any way you look at it... If I need
>> > to wait days every time I manually remove a server from the cluster, it
>> > isn't really a valid solution from my perspective.
>> >
>> > Any thoughts?
>> >
>> > Regards
>> > Gal
>> >
>> >
>> > On Thu, Jan 26, 2012 at 11:35 PM, Ian Plosker <ian at basho.com> wrote:
>> >
>> > Gal,
>> >
>> > The limiting factor on EC2 will likely be IOPS (i.e., disk throughput).
>> > EC2 is an IOPS-constrained environment, especially if you're using EBS.
>> > Further, doing a leave can induce a large number of ownership changes to
>> > ensure that preflists maintain the appropriate n_vals. The number of
>> > partitions that need to be shuffled can exceed 80% of all partitions. In
>> > short, it can take a while for the rebalance to complete. Assuming you're
>> > using a >=1.0 release, your cluster should still correctly respond to all
>> > incoming requests.
>> >
>> > Which version of Riak are you using? As of Riak 1.0.3,
>> > `handoff_concurrency`, the number of outgoing handoffs per node, is set
>> > to 1. This will reduce the rate at which the rebalance occurs, but it
>> > reduces the impact of the rebalance on your cluster.
>> >
>> > --
>> > Ian Plosker <ian at basho.com>
>> > Developer Advocate
>> > Basho Technologies, Inc.
>> >
>> > On Thursday, January 26, 2012 at 3:43 PM, Gal Barnea wrote:
>> >
>> > Ok, so now I can see in the "leaving" node logs:
>> > 2012-01-26 19:18:23.015 [info]
>> > <0.32148.2873>@riak_core_handoff_sender:start_fold:39 Starting handoff of
>> > partition riak_kv_vnode 685078892498860742907977265335757665463718379520
>> > from 'riak at ec2-leaving.compute-1.amazonaws.com' to
>> > 'riak at ec2-othernode.compute-1.amazonaws.com'
>> > 2012-01-26 19:24:17.798 [info] <0.31620.2873> alarm_handler:
>> > {set,{system_memory_high_watermark,[]}}
>> > 2012-01-26 20:23:28.991 [info]
>> > <0.32148.2873>@riak_core_handoff_sender:start_fold:87 Handoff of partition
>> > riak_kv_vnode 685078892498860742907977265335757665463718379520 from
>> > 'riak at leaving.compute-1.amazonaws.com' to
>> > 'riak at ec2-othernode.compute-1.amazonaws.com' completed: sent 5110665
>> > objects in 3905.97 seconds
>> >
>> > So things *are* moving, but at a rate of 1308 records per second.
>> > This sounds very slow to me, considering the small record size, the high
>> > bandwidth inside EC2, and the practically 0% load on the servers.
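The 1308 figure follows from the quoted log line: 5110665 objects over 3905.97 seconds:

```shell
# Rate implied by the completed-handoff log line above.
awk 'BEGIN { printf "%.0f objects/s\n", 5110665 / 3905.97 }'
```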
>> >
>> > any thoughts?
>> >
>> >
>> >
>> > On Thu, Jan 26, 2012 at 10:12 PM, Gal Barnea <gal at eyeviewdigital.com>
>> > wrote:
>> >
>> > Hi all
>> >
>> > I have a 6-server cluster running on EC2 (m1.large) - this is an
>> > evaluation environment, so there's practically no load besides the
>> > existing data (~200 million records, ~1k each).
>> >
>> > After running "riak-admin leave" on one of the nodes, I noticed that for
>> > more than 3 hours:
>> > 1 - member_status showed that there is one "leaving" node and pending
>> > data to hand off on the rest, but the numbers never changed
>> > 2 - riak-admin transfers - showed handoffs waiting, but nothing changed
>> >
>> > At this point, I restarted the "leaving" node, so now the status is:
>> > 1 - member_status - still stuck with the same numbers
>> > 2 - transfers - slowly changing
>> >
>> > The leaving server's logs show that a single handoff started after the
>> > restart, but nothing since (roughly an hour ago).
>> >
>> > Interestingly, the leaving server is pretty idle while the remaining
>> > servers are working hard at 50%-60% CPU.
>> >
>> > So, the question now is: where should I dig around to try to understand
>> > what's going on? Any thoughts?
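A simple way to keep an eye on both views mentioned above is a watch loop; the command is printed here rather than executed, since it assumes a live Riak node on the box:

```shell
# Periodically check ring membership and pending handoffs during a leave.
# Printed instead of executed: riak-admin requires a running node.
echo "watch -n 30 'riak-admin member_status; echo; riak-admin transfers'"
```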
>> >
>> > Thanks
>> > Gal
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users at lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>> --
>> Joseph Blomstedt <joe at basho.com>
>> Software Engineer
>> Basho Technologies, Inc.
>> http://www.basho.com/
>>
>
>
>
>