riak-0.14.2 memory crash

Dan Reverri dan at basho.com
Fri Jul 8 20:20:45 EDT 2011


The Stack+heap for that process is only ~0.17 GB which isn't that much. Are
you sorting the process list by Reductions or Stack+heap?
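
As a rough check on those numbers, here is a short Python sketch of the arithmetic. It only uses the two figures quoted in this thread. The 8-byte word size is an assumption (a 64-bit VM), and whether the viewer reports Stack+heap in bytes or in machine words is not something this thread confirms, so both readings are printed.

```python
GIB = 1024 ** 3

stack_heap = 182452560      # Stack+heap column for <0.163.0>, as shown by the viewer
failed_alloc = 1824525600   # bytes, from the eheap_alloc error message
WORD_SIZE = 8               # assumption: 64-bit VM, 8 bytes per word


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


print(f"Stack+heap read as bytes: {to_gib(stack_heap):.2f} GiB")
print(f"Stack+heap read as words: {to_gib(stack_heap * WORD_SIZE):.2f} GiB")
print(f"Failed allocation:        {to_gib(failed_alloc):.2f} GiB")
print(f"Failed allocation / Stack+heap = {failed_alloc / stack_heap}")
```

The failed allocation is exactly ten times the Stack+heap figure, which suggests the two numbers belong to the same process.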

Regarding the messages in the queue, I think you should be able to explore each
Pid and look at its process dictionary. The process dictionary will have a
messages property.
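
If the viewer is awkward to use on a large dump, the per-process fields can also be pulled out of erl_crash.dump with a script. The following is a minimal Python sketch, not a replacement for crashdump_viewer. It assumes the dump is plain text in which each process section starts with a line like `=proc:<0.163.0>` followed by `Key: value` lines such as `Name`, `Stack+heap` and `Message queue length`. The function names `parse_procs` and `largest_heaps` are made up for this example.

```python
import re


def parse_procs(dump_text):
    """Return {pid: {field: value}} for every =proc: section in the dump text."""
    procs = {}
    current = None
    for line in dump_text.splitlines():
        if line.startswith("="):
            # A new section starts here; only =proc:<pid> sections are collected.
            m = re.match(r"=proc:(<[\d.]+>)$", line.strip())
            current = procs.setdefault(m.group(1), {}) if m else None
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value.strip()
    return procs


def largest_heaps(procs, n=5):
    """Return the n (pid, fields) pairs with the largest Stack+heap value."""
    def heap(fields):
        value = fields.get("Stack+heap", "0")
        return int(value) if value.isdigit() else 0

    return sorted(procs.items(), key=lambda kv: heap(kv[1]), reverse=True)[:n]
```

To use it, read the dump into a string (for example with `open("erl_crash.dump", errors="replace").read()`), pass it to `parse_procs`, and print the `Name`, `Stack+heap` and `Message queue length` fields of the entries returned by `largest_heaps`.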

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
dan at basho.com


On Fri, Jul 8, 2011 at 5:03 PM, Sylvain Niles <sylvain.niles at gmail.com> wrote:

> Thanks Dan, very useful tool. Here's the big offender:
>
> Pid        Name/Spawned as     State                   Reductions   Stack+heap  MsgQ Length
> <0.163.0>  riak_kv_map_master  Garbing (limited info)  23815122240  182452560   3
>
>
> So it looks like there was a huge amount of data in the process queue
> for the riak_kv_map_master. The pid info for that process shows there
> were only 3 messages in its queue. Any tricks for figuring out what
> those messages looked like?
>
> -Sylvain
>
>
> On Fri, Jul 8, 2011 at 4:42 PM, Daniel Reverri <dan at basho.com> wrote:
> > Can you check if an erl_crash.dump file was created? The crash dump
> > should give some indication of which processes were taking up memory.
> >
> > The crashdump_viewer built into Erlang is very useful for reviewing crash
> > dumps.
> >
> > Thanks
> > Dan
> >
> > Sent from my iPhone
> >
> > On Jul 8, 2011, at 4:18 PM, Sylvain Niles <sylvain.niles at gmail.com> wrote:
> >
> >> Our system had been humming along fine for a week and crashed today
> >> with almost no load. This is the only thing in the erlang.log:
> >>
> >>
> >> =INFO REPORT==== 8-Jul-2011::16:46:14 ===
> >> [{alarm_handler,{clear,system_memory_high_watermark}}]
> >> =INFO REPORT==== 8-Jul-2011::16:50:14 ===
> >> [{alarm_handler,{set,{system_memory_high_watermark,[]}}}]
> >> =INFO REPORT==== 8-Jul-2011::16:51:14 ===
> >> [{alarm_handler,{clear,system_memory_high_watermark}}]
> >> /usr/local/src/riak-0.14.2/rel/riak/lib/os_mon-2.2.5/priv/bin/memsup: Erlang has closed.
> >> Erlang has closed
> >>
> >> Crash dump was written to: erl_crash.dump
> >> eheap_alloc: Cannot allocate 1824525600 bytes of memory (of type "heap").
> >>
> >>
> >> Looking in the sasl-error log I see that an erlang function I wrote
> >> was crashing in some cases:
> >>
> >> ** Reason for termination =
> >> ** {error,
> >>     {phase_error,
> >>      {error,
> >>       {error,undef,
> >>        [{get_erl,past_events,
> >>          [{r_object,<<"events">>,
> >>            <<"LRs2PibmMZknPm44IMzoZflCHUP">>,
> >>            [{r_content,
> >> ...ETC
> >>
> >>
> >> So I definitely need to fix my code, but I'm wondering if this is
> >> expected behavior that this would cause a memory leak that eventually
> >> brings down riak? I thought if the process running my module crashed
> >> everything would just be garbage collected and that'd be the end of
> >> it. Any advice on the right way to approach this would be great,
> >> thanks!
> >>
> >> -Sylvain