riak-0.14.2 memory crash

Sylvain Niles sylvain.niles at gmail.com
Sun Jul 10 02:01:44 EDT 2011


Unfortunately that's not the case for us. Every document is under 2 KB
and they are almost all identical, with only some differences in text.
Unless you mean the documents grew to that size because of some
conflict scenario?

-Sylvain
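
For reference, a rough sanity check on the two figures quoted further down
the thread (the Stack+heap value and the eheap_alloc request). This is only
a sketch: it assumes, without confirmation anywhere in this thread, that
the crash dump may report Stack+heap in machine words rather than bytes,
and that the node is a 64-bit VM with 8-byte words. The names below are
made up for illustration.

```python
# Figures copied verbatim from the quoted thread.
STACK_HEAP = 182_452_560            # "Stack+heap" column for riak_kv_map_master
FAILED_ALLOC_BYTES = 1_824_525_600  # size in the eheap_alloc error message

# Assumption (not stated in the thread): 64-bit VM, so one word is 8 bytes.
WORD_SIZE_BYTES = 8


def words_to_bytes(words: int, word_size: int = WORD_SIZE_BYTES) -> int:
    """Convert a size given in machine words to bytes."""
    return words * word_size


def as_gb(num_bytes: int) -> float:
    """Express a byte count in decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


# Two readings of the same number: as bytes, or as 8-byte words.
heap_if_bytes = STACK_HEAP
heap_if_words = words_to_bytes(STACK_HEAP)

print(f"Stack+heap read as bytes: {as_gb(heap_if_bytes):.2f} GB")
print(f"Stack+heap read as words: {as_gb(heap_if_words):.2f} GB")
print(f"Failed allocation:        {as_gb(FAILED_ALLOC_BYTES):.2f} GB")
print(f"Failed allocation / heap (words reading): "
      f"{FAILED_ALLOC_BYTES / heap_if_words:.2f}")
```

Read as bytes the heap is about 0.18 GB; read as 8-byte words it is about
1.46 GB, and the failed 1.82 GB request is then 1.25 times the existing
heap. Which reading applies depends on how the dump reports the value.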


On Sat, Jul 9, 2011 at 10:55 PM, Jeff Pollard <jeff.pollard at gmail.com> wrote:
> We had a similar problem recently where several large documents
> (500 MB to 1.1 GB) were causing Erlang to crash with eheap_alloc errors.
> Most likely the documents were so large because of a high number of
> conflicts. Deleting the documents fixed the problem.
>
> On Fri, Jul 8, 2011 at 5:29 PM, Sylvain Niles <sylvain.niles at gmail.com>
> wrote:
>>
>> Double-checked, and yes, that's sorted by Stack+heap. There are two ets
>> tables, 8 MB and 9 MB respectively, and that's all I can find for
>> memory usage. Is there anywhere else I can look for the culprit if it's
>> not a process or an ets table?
>>
>>
>>
>> On Fri, Jul 8, 2011 at 5:20 PM, Dan Reverri <dan at basho.com> wrote:
>> > The Stack+heap for that process is only ~0.17 GB, which isn't that
>> > much. Are you sorting the process list by Reductions or Stack+heap?
>> > Regarding the messages in the queue, I think you should be able to
>> > explore each Pid and look at its process dictionary. The process
>> > dictionary will have a messages property.
>> > Thanks,
>> > Dan
>> > Daniel Reverri
>> > Developer Advocate
>> > Basho Technologies, Inc.
>> > dan at basho.com
>> >
>> >
>> > On Fri, Jul 8, 2011 at 5:03 PM, Sylvain Niles <sylvain.niles at gmail.com>
>> > wrote:
>> >>
>> >> Thanks Dan, very useful tool. Here's the big offender:
>> >>
>> >> Pid        Name/Spawned as     State                   Reductions   Stack+heap  MsgQ Length
>> >> <0.163.0>  riak_kv_map_master  Garbing (limited info)  23815122240  182452560   3
>> >>
>> >>
>> >> So it looks like there was a huge amount of data in the process queue
>> >> for the riak_kv_map_master. The pid info for that process shows there
>> >> were only 3 messages in its queue. Any tricks for figuring out what
>> >> those messages looked like?
>> >>
>> >> -Sylvain
>> >>
>> >>
>> >> On Fri, Jul 8, 2011 at 4:42 PM, Daniel Reverri <dan at basho.com> wrote:
>> >> > Can you check if an erl_crash.dump file was created? The crash dump
>> >> > should give some indication of which processes were taking up memory.
>> >> >
>> >> > The crashdump_viewer built into Erlang is very useful for reviewing
>> >> > crash dumps.
>> >> >
>> >> > Thanks
>> >> > Dan
>> >> >
>> >> > Sent from my iPhone
>> >> >
>> >> > On Jul 8, 2011, at 4:18 PM, Sylvain Niles <sylvain.niles at gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Our system had been humming along fine for a week and crashed today
>> >> >> with almost no load. This is the only thing in the erlang.log:
>> >> >>
>> >> >>
>> >> >> =INFO REPORT==== 8-Jul-2011::16:46:14 ===
>> >> >> [{alarm_handler,{clear,system_memory_high_watermark}}]
>> >> >> =INFO REPORT==== 8-Jul-2011::16:50:14 ===
>> >> >> [{alarm_handler,{set,{system_memory_high_watermark,[]}}}]
>> >> >> =INFO REPORT==== 8-Jul-2011::16:51:14 ===
>> >> >> [{alarm_handler,{clear,system_memory_high_watermark}}]
>> >> >> /usr/local/src/riak-0.14.2/rel/riak/lib/os_mon-2.2.5/priv/bin/memsup: Erlang has closed.
>> >> >> Erlang has closed
>> >> >>
>> >> >> Crash dump was written to: erl_crash.dump
>> >> >> eheap_alloc: Cannot allocate 1824525600 bytes of memory (of type "heap").
>> >> >>
>> >> >>
>> >> >> Looking in the sasl-error log, I see that an Erlang function I wrote
>> >> >> was crashing in some cases:
>> >> >>
>> >> >> ** Reason for termination = ** {error,       {phase_error,
>> >> >> {error,               {error,undef,
>> >> >> [{get_erl,past_events,
>> >> >>  [{r_object,<<"events">>,
>> >> >>                            <<"LRs2PibmMZknPm44IMzoZflCHUP">>,
>> >> >>                             [{r_content,
>> >> >> ...ETC
>> >> >>
>> >> >>
>> >> >> So I definitely need to fix my code, but I'm wondering whether it
>> >> >> is expected that this would cause a memory leak that eventually
>> >> >> brings down Riak. I thought that if the process running my module
>> >> >> crashed, everything would just be garbage collected and that would
>> >> >> be the end of it. Any advice on the right way to approach this
>> >> >> would be great, thanks!
>> >> >>
>> >> >> -Sylvain
>> >> >>
>> >> >> _______________________________________________
>> >> >> riak-users mailing list
>> >> >> riak-users at lists.basho.com
>> >> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >> >
>> >
>> >
>>
>
>



