riak-0.14.2 memory crash

Sylvain Niles sylvain.niles at gmail.com
Fri Jul 8 20:03:31 EDT 2011


Thanks Dan, very useful tool. Here's the big offender:

Pid        Name/Spawned as     State                   Reductions   Stack+heap  MsgQ Length
<0.163.0>  riak_kv_map_master  Garbing (limited info)  23815122240  182452560   3
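
A rough cross-check, sketched in Python, that this process lines up with the failed eheap_alloc request in the original report quoted below. It assumes Stack+heap is reported in machine words and that this is a 64-bit VM with 8-byte words. Neither is stated in the thread, and reading the resulting factor as one heap-growth step is an interpretation, not something the log confirms.

```python
# Figures copied from the thread; the word size is an assumption (64-bit VM).
STACK_HEAP_WORDS = 182_452_560      # Stack+heap of <0.163.0> in the table above
FAILED_ALLOC_BYTES = 1_824_525_600  # from the eheap_alloc error in the original report
WORD_SIZE_BYTES = 8                 # assumed: 64-bit Erlang VM

heap_bytes = STACK_HEAP_WORDS * WORD_SIZE_BYTES
ratio = FAILED_ALLOC_BYTES / heap_bytes

print(f"current heap: {heap_bytes} bytes ({heap_bytes / 2**30:.2f} GiB)")
print(f"failed allocation / current heap = {ratio}")
```

Under those assumptions the process was already holding about 1.36 GiB, and the request that failed is exactly 1.25 times that, which would fit the VM trying to grow this one process's heap.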


So it looks like there was a huge amount of data in the message queue
of riak_kv_map_master, even though the pid info for that process shows
only 3 messages in its queue. Any tricks for figuring out what those
messages looked like?
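
One low-tech way to see what the dump holds for that process, sketched in Python. It assumes the documented crash dump layout, in which every section starts with a line of the form `=tag:id` (for example `=proc:<0.163.0>` and, when the queue is non-empty, `=proc_messages:<0.163.0>`). The function name `sections_for` and the sample text are made up for illustration. The message section holds raw term data, so crashdump_viewer is still needed to decode it; this only shows whether the section exists and how large it is.

```python
def sections_for(dump_lines, pid):
    """Group crash dump lines by section tag for one pid.

    Assumes section headers look like '=tag:id'; returns {tag: [body lines]}.
    """
    found = {}
    current = None  # tag of the section being read, if it belongs to pid
    for line in dump_lines:
        line = line.rstrip("\n")
        if line.startswith("="):
            tag, _, ident = line[1:].partition(":")
            current = tag if ident == pid else None
            if current is not None:
                found.setdefault(current, [])  # keep sections with empty bodies
        elif current is not None:
            found[current].append(line)
    return found

# Hypothetical miniature dump: the field values come from the table above,
# and the proc_messages body is a placeholder, not real dump content.
sample = """=proc:<0.163.0>
State: Garbing
Name: riak_kv_map_master
Message queue length: 3
Stack+heap: 182452560
=proc_messages:<0.163.0>
(raw term data)
=proc:<0.164.0>
State: Waiting
""".splitlines()

result = sections_for(sample, "<0.163.0>")
for tag, body in result.items():
    print(tag, len(body), "lines")
```

Against a real dump, the same function can be fed an open file object (opened with errors="replace", since the dump may contain bytes that are not valid text) instead of the sample list.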

-Sylvain


On Fri, Jul 8, 2011 at 4:42 PM, Daniel Reverri <dan at basho.com> wrote:
> Can you check if an erl_crash.dump file was created? The crash dump should give some indication of which processes were taking up memory.
>
> The crashdump_viewer built into Erlang is very useful for reviewing crash dumps.
>
> Thanks
> Dan
>
> On Jul 8, 2011, at 4:18 PM, Sylvain Niles <sylvain.niles at gmail.com> wrote:
>
>> Our system had been humming along fine for a week and crashed today
>> with almost no load. This is the only thing in the erlang.log:
>>
>>
>> =INFO REPORT==== 8-Jul-2011::16:46:14 ===
>> [{alarm_handler,{clear,system_memory_high_watermark}}]
>> =INFO REPORT==== 8-Jul-2011::16:50:14 ===
>> [{alarm_handler,{set,{system_memory_high_watermark,[]}}}]
>> =INFO REPORT==== 8-Jul-2011::16:51:14 ===
>> [{alarm_handler,{clear,system_memory_high_watermark}}]
>> /usr/local/src/riak-0.14.2/rel/riak/lib/os_mon-2.2.5/priv/bin/memsup: Erlang has closed.
>> Erlang has closed
>>
>> Crash dump was written to: erl_crash.dump
>> eheap_alloc: Cannot allocate 1824525600 bytes of memory (of type "heap").
>>
>>
>> Looking in the sasl-error log I see that an Erlang function I wrote
>> was crashing in some cases:
>>
>> ** Reason for termination =
>> ** {error,
>>        {phase_error,
>>            {error,
>>                {error,undef,
>>                    [{get_erl,past_events,
>>                         [{r_object,<<"events">>,
>>                              <<"LRs2PibmMZknPm44IMzoZflCHUP">>,
>>                              [{r_content,
>> ...ETC
>>
>>
>> So I definitely need to fix my code, but I'm wondering whether it is
>> expected that a crash like this causes a memory leak that eventually
>> brings down Riak. I thought that if the process running my module
>> crashed, everything would just be garbage collected and that'd be the
>> end of it. Any advice on the right way to approach this would be great,
>> thanks!
>>
>> -Sylvain
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



