[error] Supervisor riak_pipe_vnode_worker_sup had child undefined started with ...

Mark Phillips mark at basho.com
Sun Jun 10 17:20:41 EDT 2012


Hi Ivaylo,

Take a look at this thread:

http://riak.markmail.org/search/?q=exit%20with%20reason%20fitting_died%20in%20context%20child_terminated#query:exit%20with%20reason%20fitting_died%20in%20context%20child_terminated+page:1+mid:n4gfl43hcvzthjl7+state:results

I think this is what you're seeing. You should read the entire message I
linked to, but the important point is that the "fitting_died in context
child_terminated" log entries are caused by a timeout in a Riak Pipe-based
M/R process. To paraphrase Bryan Fink, those messages are normal and
intended to help debug issues. Are you still seeing them?
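
If it helps to check whether the messages line up with your MapReduce
jobs, something like the following could tally them per minute. This is only
a rough sketch: it assumes each log entry sits on a single line with an
HH:MM:SS.mmm timestamp right before the "[error]" tag, as in your excerpt,
and the function name is made up for illustration.

```python
import re
from collections import Counter

# Assumption: one log entry per line, with an HH:MM:SS.mmm timestamp
# directly before the "[error]" level tag, as in the quoted excerpt.
ENTRY = re.compile(
    r"(\d{2}:\d{2}):\d{2}\.\d+ \[error\] .*"
    r"exit with reason fitting_died in context child_terminated"
)

def count_fitting_died(lines):
    """Return a Counter mapping 'HH:MM' to the number of fitting_died entries."""
    counts = Counter()
    for line in lines:
        match = ENTRY.search(line)
        if match:
            # group(1) is the hour and minute of the matching entry
            counts[match.group(1)] += 1
    return counts
```

You would pass it the open error log of a node (wherever your install
writes it) and compare the busy minutes against the times your jobs run.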

I would be interested to know what type of MapReduce load you're putting on
your cluster. "4 machines x 1GB RAM" isn't a very powerful cluster, and
MapReduce jobs (especially those written in JavaScript) can tax Riak nodes
significantly. Any details you can share?
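
For comparison, here is a rough sketch of a job description that uses Erlang
phases instead of JavaScript and sets an explicit timeout. I'm assuming the
JSON form of the MapReduce interface (the body you POST to /mapred), where
"timeout" is given in milliseconds. The bucket name, the timeout value and
the helper name are placeholders, and listing a whole bucket as input is
itself expensive, so treat this as an illustration rather than a
recommendation.

```python
import json

def build_mapred_job(bucket, timeout_ms):
    """Build a JSON MapReduce job body that counts the objects in a bucket."""
    job = {
        # Assumption: the whole bucket is the input (hypothetical bucket name).
        "inputs": bucket,
        "query": [
            # Erlang phases from riak_kv_mapreduce run natively,
            # without going through the JavaScript VMs.
            {"map": {"language": "erlang",
                     "module": "riak_kv_mapreduce",
                     "function": "map_object_value"}},
            {"reduce": {"language": "erlang",
                        "module": "riak_kv_mapreduce",
                        "function": "reduce_count_inputs"}},
        ],
        # Job timeout in milliseconds
        "timeout": timeout_ms,
    }
    return json.dumps(job)

body = build_mapred_job("example_bucket", 120000)
```

The resulting string would be sent with Content-Type application/json.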

Mark



On Wed, Jun 6, 2012 at 4:38 PM, Ivaylo Panitchkov
<ipanitchkov at hibernum.com> wrote:

>
> Hello everyone,
>
> We started getting the following errors on all servers in the cluster (4
> machines x 1GB RAM, riak_1.0.2-1_amd64.deb):
>
> 20:12:36.753 [error] Supervisor riak_pipe_vnode_worker_sup had child
> undefined started with {riak_pipe_vnode_worker,start_link,undefined} at
> <0.8855.0> exit with reason fitting_died in context child_terminated
> 20:12:36.754 [error] Supervisor riak_pipe_vnode_worker_sup had child
> undefined started with {riak_pipe_vnode_worker,start_link,undefined} at
> <0.8856.0> exit with reason fitting_died in context child_terminated
> 20:12:36.965 [error] Supervisor riak_pipe_vnode_worker_sup had child
> undefined started with {riak_pipe_vnode_worker,start_link,undefined} at
> <0.8860.0> exit with reason fitting_died in context child_terminated
> 20:12:36.967 [error] Supervisor riak_pipe_vnode_worker_sup had child
> undefined started with {riak_pipe_vnode_worker,start_link,undefined} at
> <0.8861.0> exit with reason fitting_died in context child_terminated
>
>
> If we restart the riak service on all machines one by one the error
> message disappears for a while.
> Any ideas to solve the issue will be much appreciated.
>
> Thanks in advance,
> Ivaylo
>
> REMARK: Replaced the IP addresses for security's sake
>
> *root at riak01:~# riak-admin member_status*
> Attempting to restart script through sudo -u riak
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> valid      25.0%      --      'riak at IP1'
> valid      25.0%      --      'riak at IP2'
> valid      25.0%      --      'riak at IP3'
> valid      25.0%      --      'riak at IP4'
> -------------------------------------------------------------------------------
> Valid:4 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>
> *root at riak01:~# riak-admin ring_status*
> Attempting to restart script through sudo -u riak
> ================================== Claimant ===================================
> Claimant:  'riak at IP1'
> Status:     up
> Ring Ready: true
>
> ============================== Ownership Handoff ==============================
> No pending changes.
>
> ============================== Unreachable Nodes ==============================
> All nodes are up and reachable
>
> *root at riak01:~# riak-admin ringready*
> Attempting to restart script through sudo -u riak
> TRUE All nodes agree on the ring ['riak at IP1','riak at IP2','riak at IP3','riak at IP4']
>
> *root at riak01:~# riak-admin transfers*
> Attempting to restart script through sudo -u riak
> No transfers active
>
> --
> Ivaylo Panitchkov
> Software developer
> Hibernum Creations Inc.
>