Warning "Can not start proc_lib:init_p"

Evan Vigil-McClanahan emcclanahan at basho.com
Wed Apr 3 11:12:20 EDT 2013


Ingo,

riak-admin status | grep sys_process_count

will tell you how many processes are running.  The default process
limit in Erlang (the +P flag in vm.args) is a little low, and we'd
suggest raising it (especially with your extra-large ring_size).
Erlang processes are cheap, so 65535 or even double that will be fine.
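
A rough sketch of how that check could be scripted (not an official tool).
It reads `riak-admin status` output on stdin and reports how close
sys_process_count is to a limit.  The function name process_headroom, the
"name : value" line format, and the fallback limit of 32768 are assumptions
for illustration; pass whatever your +P is actually set to.

```shell
# Hypothetical helper: compare sys_process_count against a process limit.
# Reads `riak-admin status` output on stdin; assumes "name : value" lines.
process_headroom() {
    limit="${1:-32768}"   # assumed fallback; pass your real +P value
    awk -v limit="$limit" '
        $1 == "sys_process_count" {
            count = $NF   # the value is the last field on the line
            printf "%d of %d processes in use (%d%%)\n", count, limit, count * 100 / limit
            found = 1
        }
        END { if (!found) exit 1 }   # non-zero exit if the stat was missing
    '
}
```

Usage would be along the lines of `riak-admin status | process_headroom 65535`.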

Busy dist ports are still worrying.  Are you monitoring object sizes?
Are there any spikes there associated with performance drops?
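
If you are not tracking object sizes yet, something like the following
sketch may be a starting point.  It filters the object-size stats out of
`riak-admin status` output on stdin.  The function name objsize_stats is
made up for illustration, and the node_get_fsm_objsize_* stat names are an
assumption about what your version reports, so check them against your
own status output.

```shell
# Hypothetical helper: keep only the object-size stats from
# `riak-admin status` output read on stdin.
objsize_stats() {
    # Assumed stat names: node_get_fsm_objsize_{mean,median,95,99,100}
    grep -E '^node_get_fsm_objsize_(mean|median|95|99|100)'
}
```

Running `riak-admin status | objsize_stats` periodically and logging the
result with a timestamp would let you line up size spikes with the
performance drops.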

On Wed, Apr 3, 2013 at 8:03 AM, Ingo Rockel
<ingo.rockel at bluelionmobile.com> wrote:
> Hi Evan,
>
> I set +swt very_low and +zdbbl to 64MB. These settings greatly reduced the
> busy_dist_port and "Monitor got {suppressed,..." messages, but when the
> performance of the cluster suddenly drops we still see them.
>
> The cluster was updated to 1.3 in the meantime.
>
> The eleveldb section:
>
>  %% eLevelDB Config
>  {eleveldb, [
>              {data_root, "/var/lib/riak/leveldb"},
>              {cache_size, 33554432},
>              {write_buffer_size_min, 67108864}, %% 64 MB in bytes
>              {write_buffer_size_max, 134217728}, %% 128 MB in bytes
>              {max_open_files, 4000}
>             ]},
>
> The ring size is 1024 and the machines have 48GB of memory. Concerning the
> params from vm.args:
>
> -env ERL_MAX_PORTS 4096
> -env ERL_MAX_ETS_TABLES 8192
>
> +P isn't set
>
> Ingo
>
> On 03.04.2013 16:53, Evan Vigil-McClanahan wrote:
>
>> For your prior mail, I thought that someone had answered.  Our initial
>> suggestion was to add +swt very_low to your vm.args, as well as
>> setting the +zdbbl setting that Jon recommended in the list post you
>> pointed to.  If those help, moving to 1.3 should help more.
>>
>> Other limits in vm.args that can cause problems are +P, ERL_MAX_PORTS,
>> and  ERL_MAX_ETS_TABLES.  Are any of these set?  If so, to what?
>>
>> Can you also paste the eleveldb section of your app.config?
>>
>> On Wed, Apr 3, 2013 at 7:41 AM, Ingo Rockel
>> <ingo.rockel at bluelionmobile.com> wrote:
>>>
>>> Hi Evan,
>>>
>>> I'm not sure, I find a lot of these:
>>>
>>> 2013-03-30 23:27:52.992 [error]
>>> <0.8036.323>@riak_api_pb_server:handle_info:141 Unrecognized message
>>> {22243034,{error,timeout}}
>>>
>>> and some of these, at the same time that one of the messages quoted below
>>> gets logged (although this one has a different timestamp):
>>>
>>> 2013-03-30 23:27:53.056 [error] <0.9457.323>@riak_kv_console:status:178
>>> Status failed error:terminated
>>>
>>> Ingo
>>>
>>> On 03.04.2013 16:24, Evan Vigil-McClanahan wrote:
>>>
>>>> Resending to the list:
>>>>
>>>> Ingo,
>>>>
>>>> That is an indication that the protocol buffers server can't spawn a
>>>> put fsm, which means that a put cannot be done for some reason or
>>>> another.  Are there any other messages that appear around this time
>>>> that might indicate why?
>>>>
>>>> On Wed, Apr 3, 2013 at 12:09 AM, Ingo Rockel
>>>> <ingo.rockel at bluelionmobile.com> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> We have some performance issues with our Riak cluster: from time to
>>>>> time we have a sudden drop in performance (I already asked the list
>>>>> about this, but no one had an idea). Not at the same time, but on the
>>>>> problematic nodes, we get a lot of these messages from time to time:
>>>>>
>>>>> 2013-04-02 21:41:11.173 [warning] <0.25646.475> ** Can not start proc_lib:init_p
>>>>> ,[<0.14556.474>,[<0.9519.474>,riak_api_pb_sup,riak_api_sup,<0.1291.0>],riak_kv_p
>>>>> ut_fsm,start_link,[{raw,65032165,<0.9519.474>},{r_object,<<109>>,<<77,115,124,49
>>>>> ,53,55,57,56,57,56,50,124,49,51,54,52,57,51,49,54,49,49,53,49,50,52,53,54>>,[{r_
>>>>> content,{dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
>>>>> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},<<>>}],[],{dict,2,16,16,8,8
>>>>> 0,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[]
>>>>> ,[],[],[[<<99,111,110,116,101,110,116,45,116,121,112,101>>,97,112,112,108,105,99
>>>>> ,97,116,105,111,110,47,106,115,111,110]],[],[],[],[],[[<<99,104,97,114,115,101,1
>>>>> 16>>,85,84,70,45,56]]}}},<<123,34,115,116,34,58,50,44,34,116,34,58,49,44,34,99,3
>>>>> 4,58,34,66,117,116,32,115,104,101,32,105,115,32,103,111,110,101,44,32,110,32,101
>>>>> ,118,101,110,32,116,104,111,117,103,104,32,105,109,32,110,111,116,32,105,110,32,
>>>>> 117,114,32,99,105,116,121,32,105,32,108,111,118,101,32,117,32,110,100,32,105,32,
>>>>> 109,101,97,110,32,105,116,32,58,39,40,34,44,34,114,34,58,49,52,51,52,54,52,51,57
>>>>> ,44,34,115,34,58,49,53,55,57,56,57,56,50,44,34,99,116,34,58,49,51,54,52,57,51,49
>>>>> ,54,49,49,53,49,50,44,34,97,110,34,58,102,97,108,115,101,44,34,115,107,34,58,49,
>>>>> 51,54,52,57,51,49,54,49,49,53,49,50,52,53,54,44,34,115,117,34,58,48,125>>},[{tim
>>>>> eout,60000}]]] on 'riak at 172.22.3.12' **
>>>>>
>>>>> Can anyone explain to me what these messages mean and if I need to do
>>>>> something about it? Could these messages be in any way related to the
>>>>> performance issues?
>>>>>
>>>>> Ingo
>>>>>
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> riak-users at lists.basho.com
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>>
>>>
>>> --
>>> Software Architect
>>>
>>> Blue Lion mobile GmbH
>>> Tel. +49 (0) 221 788 797 14
>>> Fax. +49 (0) 221 788 797 19
>>> Mob. +49 (0) 176 24 87 30 89
>>>
>>> ingo.rockel at bluelionmobile.com
>>>
>>> qeep: Hefferwolf
>>>
>>>
>>> www.bluelionmobile.com
>>> www.qeep.net
>
>
>
> --
> Software Architect
>
> Blue Lion mobile GmbH
> Tel. +49 (0) 221 788 797 14
> Fax. +49 (0) 221 788 797 19
> Mob. +49 (0) 176 24 87 30 89
>
> ingo.rockel at bluelionmobile.com
> qeep: Hefferwolf
>
> www.bluelionmobile.com
> www.qeep.net



