Having to raise VM number-of-processes limit

Dave Brady dbrady at weborama.com
Tue Apr 2 10:31:04 EDT 2013


It happened again today, though I was not available to watch it at the time.

Three nodes each showed riak_kv being stopped for one minute:

2013-04-02 11:10:57.923 [info] <0.2833.1447>@riak_kv_app:check_kv_health:239 Disabling riak_kv due to large message queues. Offending vnodes: [{319703483166135013357056057156686910549735243776,5798}]
2013-04-02 11:11:57.924 [info] <0.3589.1447>@riak_kv_app:check_kv_health:242 Re-enabling riak_kv after successful health check
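
For anyone who wants to pull these events out of the logs on each node, here is a minimal Python sketch. It assumes the log lines have exactly the shape of the two quoted above, and that the second number in each vnode tuple is the message queue length (my reading of the message, not something the log states). The function names are hypothetical, not part of Riak.

```python
import re
from datetime import datetime

# Timestamp, level, then "<pid>@module:function:line", then the message text.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(\w+)\] \S+ (.*)$"
)
# Assumption: each offending vnode is logged as {PartitionIndex,QueueLength}.
VNODE_RE = re.compile(r"\{(\d+),(\d+)\}")

# The two log lines from this message, used as sample input.
SAMPLE = [
    "2013-04-02 11:10:57.923 [info] <0.2833.1447>@riak_kv_app:check_kv_health:239 "
    "Disabling riak_kv due to large message queues. Offending vnodes: "
    "[{319703483166135013357056057156686910549735243776,5798}]",
    "2013-04-02 11:11:57.924 [info] <0.3589.1447>@riak_kv_app:check_kv_health:242 "
    "Re-enabling riak_kv after successful health check",
]


def parse_health_events(lines):
    """Return (kind, timestamp, vnodes) for each riak_kv health-check line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a log line in the expected shape
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        msg = m.group(3)
        if msg.startswith("Disabling riak_kv"):
            vnodes = [(int(idx), int(qlen)) for idx, qlen in VNODE_RE.findall(msg)]
            events.append(("disabled", ts, vnodes))
        elif msg.startswith("Re-enabling riak_kv"):
            events.append(("enabled", ts, []))
    return events


def disabled_intervals(events):
    """Pair each 'disabled' event with the next 'enabled' event."""
    intervals = []
    start = None
    for kind, ts, vnodes in events:
        if kind == "disabled" and start is None:
            start = (ts, vnodes)
        elif kind == "enabled" and start is not None:
            intervals.append((start[0], ts, start[1]))
            start = None
    return intervals


if __name__ == "__main__":
    for began, ended, vnodes in disabled_intervals(parse_health_events(SAMPLE)):
        secs = (ended - began).total_seconds()
        print(f"riak_kv disabled for {secs:.3f}s; offending vnodes: {vnodes}")
```

Running the same two functions over the log lines from all three nodes would show whether the same partition index keeps turning up as the offender.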

--
Dave Brady

----- Original Message -----
From: "Dave Brady" <dbrady at weborama.com>
To: "Evan Vigil-McClanahan" <emcclanahan at basho.com>
Cc: riak-users at lists.basho.com
Sent: Monday, April 1, 2013 11:15:47 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: Having to raise VM number-of-processes limit

Hi Evan,

Thanks for the suggestions!

I did not think that raising that limit was normal.  Glad to have confirmation.

I'll go through the logs again, and run 'riak-admin top ...' the next time it happens.

--
Dave Brady

----- Original Message -----
From: "Evan Vigil-McClanahan" <emcclanahan at basho.com>
To: "Dave Brady" <dbrady at weborama.com>
Cc: riak-users at lists.basho.com
Sent: Saturday, March 30, 2013 11:03:30 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: Having to raise VM number-of-processes limit

Dave,

If you're seeing the process count go that high, it suggests to me
that something else is wrong.  Typically, even for heavily loaded
clusters, hundreds of thousands of processes isn't normal.  Is there
anything else in the logs?

When a node sees this sort of behavior start, what does riak-admin top
-sort msg_q look like?

On Sat, Mar 30, 2013 at 2:07 PM, Dave Brady <dbrady at weborama.com> wrote:
> Hello,
>
> I have run into a situation whereby I started seeing:
>
> [error] emulator Too many processes
>
> when some of our new jobs ran.  These jobs are in Perl using Net::Riak,
> communicating with the cluster via PBC.  They fire tens of thousands of fetches
> and stores over the course of about 20 minutes.
>
> Our cluster has five nodes running Riak 1.3, using eLevelDB.
>
> I have been raising the limit (+P in vm.args) in increments from the default
> of 32768.  Currently at 524288, and that is still not high enough.
>
> Have any of you had to increase this limit?
>
> Thanks!
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>


