node_put_fsm_active maxing out

Ted Burghart ted.ml.basho at tedb.net
Fri Nov 6 08:59:25 EST 2015


I’ve added a comment to https://github.com/basho/riak/issues/789, but will follow up here as well.

On 64-bit platforms, HiPE is generally a shaky proposition.  On Linux, specifically, there are workarounds in the VM to allow it to (hopefully) run safely, but it’s somewhat constrained on beefy 64-bit hardware, and unlikely to yield significant performance increases in many cases.  Turning it off should never destabilize the system - it’s turning it on you have to be careful about.  I’d be surprised if turning off HiPE was a factor here.

As I noted in the GH issue, there are version inconsistencies in the information you included there, and I’m more inclined to focus on that than on the VM.
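
To make the differences between the two builds explicit, the bracketed flags in the two sys_system_version strings quoted further down this thread can be compared mechanically. The following is only a minimal sketch in Python, assuming the strings are copied verbatim from Chris's message; the helper name parse_flags is hypothetical and not part of any Riak tooling.

```python
import re

def parse_flags(version):
    """Split a sys_system_version string into (release, set of bracketed flags).

    parse_flags is a hypothetical helper written for this illustration.
    """
    release = re.match(r"Erlang (\S+)", version).group(1)
    flags = set(re.findall(r"\[([^\]]+)\]", version))
    return release, flags

# Strings copied from the two nodes described in this thread.
working = ("Erlang R16B02-basho5 (erts-5.10.3) [source] [64-bit] [smp:8:8] "
           "[async-threads:64] [hipe] [kernel-poll:true]")
failing = ("Erlang R16B02_basho6 (erts-5.10.3) [source-bcd8abb] [64-bit] "
           "[smp:24:24] [async-threads:64] [kernel-poll:true] [frame-pointer]")

rel_ok, flags_ok = parse_flags(working)
rel_bad, flags_bad = parse_flags(failing)

# Flags present on one node but not the other.
only_working = sorted(flags_ok - flags_bad)
only_failing = sorted(flags_bad - flags_ok)

print(rel_ok, "only:", only_working)
print(rel_bad, "only:", only_failing)
```

Run against the two strings above, this lists hipe, smp:8:8 and source on the working node only, and frame-pointer, smp:24:24 and source-bcd8abb on the problem node only. So besides the basho5 to basho6 change, the two hosts also differ in scheduler count (8 versus 24) and in how the VM was built.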

– Ted

> On 5-Nov 2015, at 8:54 PM, Chris Read wrote:
> 
> I had a look at that again this evening after I sent that last email, then looked again at the VMs. The other thing that jumped out at me then was that the system that's been working for the last year has HiPE enabled, the problem one does not. My limited understanding of Erlang runtimes feels like that could have something to do with it...
> 
> Chris
> 
> On Thu, Nov 5, 2015 at 6:55 PM, Ted Burghart wrote:
> Hi Chris,
> 
> This is pretty unusual - basho6 just turns on frame pointers, which should have a (likely immeasurable) impact on performance, but otherwise should be really innocuous.
> 
> Along with Doug’s question about what specific OS you’re on, are you sure the change from basho5 to basho6 is the only change in your execution environment?
> 
> – Ted
> 
> Ted Burghart
> Senior Engineer
> Basho Technologies  http://www.basho.com
> 
>> On 5-Nov 2015, at 6:21 PM, Chris Read wrote:
>> 
>> Anyone out there?
>> 
>> Here's some more detail on the Erlang builds:
>> 
>> This one works as expected:
>> 
>> sys_system_architecture : <<"x86_64-unknown-linux-gnu">>
>> sys_system_version : <<"Erlang R16B02-basho5 (erts-5.10.3) [source] [64-bit] [smp:8:8] [async-threads:64] [hipe] [kernel-poll:true]">>
>> 
>> This one has the problem:
>> 
>> sys_system_architecture : <<"x86_64-unknown-linux-gnu">>
>> sys_system_version : <<"Erlang R16B02_basho6 (erts-5.10.3) [source-bcd8abb] [64-bit] [smp:24:24] [async-threads:64] [kernel-poll:true] [frame-pointer]">>
>> 
>> Chris
>> 
>> 
>> On Tue, Nov 3, 2015 at 12:47 PM, Chris Read wrote:
>> Greetings all...
>> 
>> We've been building riak from source for a while, but I've had trouble getting the 2.1 lines built reliably and so would like to revert back to using the .deb package. The problem I have is that in our test environment we always manage to max out node_put_fsm_active under sustained write loads, and it never drops back down.
>> 
>> When running riak 2.0.4 on R16B02-basho5 (our current prod version) everything is working as expected. 
>> 
>> Using the .deb package of 2.0.4 pushes us to R16B02_basho6, which is where we see the problem arrive of node_put_fsm_active going up and never dropping back down again, even after the write load stops.
>> 
>> Further testing with the riak 2.0.6 and 2.1.1 .deb packages (both contain R16B02_basho8) shows the same problem.
>> 
>> Questions I have are:
>> 
>> 1) Anyone else seen this?
>> 2) Is there any way I can see why these FSMs appear to be deadlocked?
>> 
>> Thanks,
>> 
>> Chris
>> 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
