riak-0.14.2 memory crash

Sylvain Niles sylvain.niles at gmail.com
Fri Jul 8 20:29:56 EDT 2011


Double-checked, and yes, that's sorted by Stack+heap. There are two ETS
tables, 8 MB and 9 MB respectively, and that's all I can find for
memory usage. Is there anywhere else I can look for the culprit if it's
not a process or an ETS table?
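
One other place to look is the crash dump's =memory section, which reports
totals similar to what erlang:memory() returns on a live node, including
areas such as binaries and code that belong to neither a single process heap
nor an ETS table. Below is a minimal sketch in Python of reading that section
from the text of an erl_crash.dump. The helper names and the sample numbers
are made up for illustration and are not taken from this thread; the sketch
assumes the usual "key: value" lines under the =memory tag.

```python
def read_memory_section(dump_text):
    """Return the "key: value" pairs found under the =memory tag as ints."""
    totals = {}
    in_memory = False
    for line in dump_text.splitlines():
        if line.startswith("="):
            # Every crash dump section begins with a "=tag" line, so a new
            # tag either opens the =memory section or ends it.
            in_memory = line.strip() == "=memory"
            continue
        if in_memory and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():
                totals[key.strip()] = int(value)
    return totals


def largest_areas(totals):
    """List areas from largest to smallest, leaving out the overall total.

    Note that some entries (for example "processes" and "system") are
    themselves aggregates, so the figures overlap and should not be summed.
    """
    areas = [(name, size) for name, size in totals.items() if name != "total"]
    return sorted(areas, key=lambda item: item[1], reverse=True)


# Hypothetical sample in the shape of a =memory section (made-up numbers).
SAMPLE = """=memory
total: 1000
processes: 700
system: 300
binary: 120
ets: 50
=hash_table:atom_tab
size: 4813
"""

if __name__ == "__main__":
    for name, size in largest_areas(read_memory_section(SAMPLE)):
        print(f"{name:12} {size}")
```

To run it against a real dump, pass the file's contents instead of SAMPLE,
for example read_memory_section(open("erl_crash.dump").read()).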


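The two figures in this thread can be compared directly: the Stack+heap value
reported for riak_kv_map_master (182452560) and the size of the allocation
that failed (1824525600 bytes). The short Python sketch below does the
arithmetic. Treating Stack+heap as a count of machine words and assuming an
8-byte word (a 64-bit VM) are both assumptions that the thread does not
confirm.

```python
STACK_HEAP = 182452560      # Stack+heap reported for riak_kv_map_master
FAILED_ALLOC = 1824525600   # bytes that eheap_alloc could not allocate
WORD_BYTES = 8              # ASSUMPTION: 64-bit VM, 8 bytes per word
GIB = 2 ** 30

# Reading the figure as bytes gives the "~0.17 GB" quoted in the thread.
heap_gib_if_bytes = STACK_HEAP / GIB

# ASSUMPTION: reading the figure as words gives a heap eight times larger.
heap_bytes_if_words = STACK_HEAP * WORD_BYTES
heap_gib_if_words = heap_bytes_if_words / GIB

# How the failed request compares with each reading.
ratio_if_bytes = FAILED_ALLOC / STACK_HEAP
ratio_if_words = FAILED_ALLOC / heap_bytes_if_words

if __name__ == "__main__":
    print(f"heap if bytes: {heap_gib_if_bytes:.2f} GiB")
    print(f"heap if words: {heap_gib_if_words:.2f} GiB")
    print(f"failed alloc:  {FAILED_ALLOC / GIB:.2f} GiB")
    print(f"ratio if bytes: {ratio_if_bytes}")
    print(f"ratio if words: {ratio_if_words}")
```

Under the words reading, the failed request is exactly 1.25 times the
existing heap, which would be consistent with the VM trying to grow that one
process's heap rather than with a leak elsewhere. That is an inference from
the numbers, not something established in the thread.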

On Fri, Jul 8, 2011 at 5:20 PM, Dan Reverri <dan at basho.com> wrote:
> The Stack+heap for that process is only ~0.17 GB which isn't that much. Are
> you sorting the process list by Reductions or Stack+heap?
> Regarding the messages in the queue, I think you should be able to explore
> each Pid and look at its process dictionary. The process dictionary will
> have a messages property.
> Thanks,
> Dan
> Daniel Reverri
> Developer Advocate
> Basho Technologies, Inc.
> dan at basho.com
>
>
> On Fri, Jul 8, 2011 at 5:03 PM, Sylvain Niles <sylvain.niles at gmail.com>
> wrote:
>>
>> Thanks Dan, very useful tool. Here's the big offender:
>>
>> Pid        Name/Spawned as     State                   Reductions   Stack+heap  MsgQ Length
>> <0.163.0>  riak_kv_map_master  Garbing (limited info)  23815122240  182452560   3
>>
>>
>> So it looks like there was a huge amount of data in the process queue
>> for the riak_kv_map_master. The pid info for that process shows there
>> were only 3 messages in its queue. Any tricks for figuring out what
>> those messages looked like?
>>
>> -Sylvain
>>
>>
>> On Fri, Jul 8, 2011 at 4:42 PM, Daniel Reverri <dan at basho.com> wrote:
>> > Can you check if an erl_crash.dump file was created? The crash dump
>> > should give some indication of which processes were taking up memory.
>> >
>> > The crashdump_viewer built into Erlang is very useful for reviewing
>> > crash dumps.
>> >
>> > Thanks
>> > Dan
>> >
>> > Sent from my iPhone
>> >
>> > On Jul 8, 2011, at 4:18 PM, Sylvain Niles <sylvain.niles at gmail.com>
>> > wrote:
>> >
>> >> Our system had been humming along fine for a week and crashed today
>> >> with almost no load. This is the only thing in the erlang.log:
>> >>
>> >>
>> >> =INFO REPORT==== 8-Jul-2011::16:46:14 ===
>> >> [{alarm_handler,{clear,system_memory_high_watermark}}]
>> >> =INFO REPORT==== 8-Jul-2011::16:50:14 ===
>> >> [{alarm_handler,{set,{system_memory_high_watermark,[]}}}]
>> >> =INFO REPORT==== 8-Jul-2011::16:51:14 ===
>> >> [{alarm_handler,{clear,system_memory_high_watermark}}]
>> >> /usr/local/src/riak-0.14.2/rel/riak/lib/os_mon-2.2.5/priv/bin/memsup:
>> >> Erlang has closed.
>> >> Erlang has closed
>> >>
>> >> Crash dump was written to: erl_crash.dump
>> >> eheap_alloc: Cannot allocate 1824525600 bytes of memory (of type "heap").
>> >>
>> >>
>> >> Looking in the sasl-error log I see that an erlang function I wrote
>> >> was crashing in some cases:
>> >>
>> >> ** Reason for termination =
>> >> ** {error,
>> >>     {phase_error,
>> >>      {error,
>> >>       {error,undef,
>> >>        [{get_erl,past_events,
>> >>          [{r_object,<<"events">>,
>> >>            <<"LRs2PibmMZknPm44IMzoZflCHUP">>,
>> >>            [{r_content,
>> >> ...ETC
>> >>
>> >>
>> >> So I definitely need to fix my code, but I'm wondering whether it is
>> >> expected that a crash like this would cause a memory leak that
>> >> eventually brings down Riak. I thought that if the process running my
>> >> module crashed, everything would just be garbage collected and that
>> >> would be the end of it. Any advice on the right way to approach this
>> >> would be great, thanks!
>> >>
>> >> -Sylvain
>> >>
>> >> _______________________________________________
>> >> riak-users mailing list
>> >> riak-users at lists.basho.com
>> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >
>
>



