Riak Cannot allocate bytes of memory (of type "heap")

Luke Bakken lbakken at basho.com
Mon Jan 25 14:17:41 EST 2016


Hello Byron -

m3.large instances have only 7.5 GiB of RAM. You can see that Riak
crashed while the Erlang VM was attempting to allocate 2.12 GiB
(2277966824 bytes) of process heap, on top of the memory already
given to leveldb and Solr.
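
As a rough sanity check on those numbers, here is a minimal Python sketch
of the memory budget. It assumes the 50% leveldb setting is a share of
total RAM and that Solr's footprint is just its 2 GiB heap. It ignores the
OS and the rest of the Erlang VM, so the real headroom is smaller. The
function and variable names are made up for illustration.

```python
GIB = 2**30


def memory_budget(total_ram_gib, leveldb_fraction, solr_heap_gib, failed_alloc_bytes):
    """Estimate how much RAM is left after leveldb and Solr take their share."""
    leveldb_gib = total_ram_gib * leveldb_fraction
    headroom_gib = total_ram_gib - leveldb_gib - solr_heap_gib
    failed_alloc_gib = failed_alloc_bytes / GIB
    return {
        "leveldb_gib": leveldb_gib,
        "headroom_gib": headroom_gib,
        "failed_alloc_gib": failed_alloc_gib,
        # True only if the single failed allocation would fit in the headroom
        "fits": failed_alloc_gib <= headroom_gib,
    }


budget = memory_budget(
    total_ram_gib=7.5,              # m3.large
    leveldb_fraction=0.50,          # "Level DB with 50% allocated" (assumed share of RAM)
    solr_heap_gib=2.0,              # "Solr with 2Gig"
    failed_alloc_bytes=2277966824,  # from the eheap_alloc crash line
)

for name, value in budget.items():
    print(f"{name}: {value}")
```

Under these assumptions about 1.75 GiB is left for everything else, so a
single 2.12 GiB allocation cannot succeed.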

I suggest decreasing the Solr JVM heap back to the 1 GiB setting that
ships with Riak. You can also experiment with disabling Active
Anti-Entropy to reduce memory usage. Hopefully someone with more
experience of how Riak Search (Yokozuna) interacts with Active
Anti-Entropy will chime in on this thread.

Or, increase the amount of RAM available to these VMs.
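
To see which process was growing, the large_heap lines in the quoted
console log can be parsed. The Python sketch below (helper names are
hypothetical) pulls the message queue length and heap block size out of
the first and last sysmon lines for <0.1369.0>, the yz_index_hashtree
process. It assumes a 64-bit Erlang VM that reports heap sizes in 8-byte
words, and a 20% heap growth step, which is what the ratios between
consecutive log lines suggest.

```python
import re

WORD_BYTES = 8  # assumption: 64-bit Erlang VM, heap sizes logged in words

SYSMON_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*monitor large_heap "
    r".*\{message_queue_len,(?P<queue>\d+)\}"
    r".*\{heap_block_size,(?P<heap>\d+)\}"
)

# First and last large_heap lines from the quoted console log.
LOG_LINES = [
    "2016-01-25 16:34:56.867 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 "
    "monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},"
    "{almost_current_function,{eleveldb,get,3}},{message_queue_len,180682}] "
    "[{old_heap_block_size,0},{heap_block_size,22177879},{mbuf_size,0},"
    "{stack_size,26},{old_heap_size,0},{heap_size,8755966}]",
    "2016-01-25 16:38:29.550 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 "
    "monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},"
    "{almost_current_function,{eleveldb,get,3}},{message_queue_len,1313878}] "
    "[{old_heap_block_size,0},{heap_block_size,237288211},{mbuf_size,0},"
    "{stack_size,19},{old_heap_size,0},{heap_size,62782512}]",
]


def parse_large_heap(lines):
    """Return (timestamp, message_queue_len, heap_block_size_words) per matching line."""
    samples = []
    for line in lines:
        match = SYSMON_RE.search(line)
        if match:
            samples.append((match["ts"], int(match["queue"]), int(match["heap"])))
    return samples


samples = parse_large_heap(LOG_LINES)
first, last = samples[0], samples[-1]

# Project the next heap block assuming a 20% growth step (integer math).
next_block_bytes = last[2] * 12 // 10 * WORD_BYTES

print(f"message queue: {first[1]} -> {last[1]} between {first[0]} and {last[0]}")
print(f"last heap block: {last[2] * WORD_BYTES} bytes")
print(f"projected next heap block: {next_block_bytes} bytes")
```

The projected next block comes out at 2277966824 bytes, the same size as
the failed eheap_alloc request. That suggests the crash came from this
process's heap growing as its mailbox backed up, which fits with trying
Active Anti-Entropy disabled.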

Thanks

--
Luke Bakken
Engineer
lbakken at basho.com


On Mon, Jan 25, 2016 at 10:10 AM, Sakoulas, Byron
<ByronSakoulas at catholichealth.net> wrote:
> We are running an 8-node Riak cluster on AWS, and our nodes are consistently crashing with the error: Cannot allocate x bytes of memory (of type "heap").
>
> Here are some of the specs for our env:
>
> 8 nodes, running on m3.large instances
> LevelDB with 50% of memory allocated
> Solr with a 2 GB heap
> We use only immutable and CRDT data
> We have a custom search schema
> System config matches Basho recommendations
> CentOS 7
> Riak 2.0.2
> Riak java client 2.0.0
>
> Below is the console log leading up to the crash. I have also attached the erl_crash.dump file. Any help is greatly appreciated.
>
> 2016-01-25 16:34:16.822 [info] <0.2681.4>@riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during active anti-entropy exchange of {707914855582156101004909840846949587645842325504,3} between {730750818665451459101842416358141509827966271488,'riakaws at 172.16.65.8'} and {753586781748746817198774991869333432010090217472,'riakaws at 172.16.65.12'}
> 2016-01-25 16:34:56.867 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,180682}] [{old_heap_block_size,0},{heap_block_size,22177879},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,8755966}]
> 2016-01-25 16:35:00.231 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,203278}] [{old_heap_block_size,0},{heap_block_size,26613454},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,9839470}]
> 2016-01-25 16:35:08.857 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,256704}] [{old_heap_block_size,0},{heap_block_size,31936144},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,12371527}]
> 2016-01-25 16:35:15.731 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,299047}] [{old_heap_block_size,0},{heap_block_size,38323372},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,14501169}]
> 2016-01-25 16:35:21.285 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,330848}] [{old_heap_block_size,0},{heap_block_size,45988046},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,16029792}]
> 2016-01-25 16:35:36.034 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,382846}] [{old_heap_block_size,0},{heap_block_size,55185655},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,18521726}]
> 2016-01-25 16:35:49.409 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,455689}] [{old_heap_block_size,0},{heap_block_size,66222786},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,21841438}]
> 2016-01-25 16:35:59.878 [info] <0.71.0> alarm_handler: {set,{process_memory_high_watermark,<0.1369.0>}}
> 2016-01-25 16:36:00.267 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,515497}] [{old_heap_block_size,0},{heap_block_size,79467343},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,24737674}]
> 2016-01-25 16:36:08.497 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{hashtree,should_insert,3}},{message_queue_len,560639}] [{old_heap_block_size,0},{heap_block_size,95360811},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,26973030}]
> 2016-01-25 16:36:34.806 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,691363}] [{old_heap_block_size,0},{heap_block_size,114432973},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,33336504}]
> 2016-01-25 16:36:55.523 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,809402}] [{old_heap_block_size,0},{heap_block_size,137319567},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,38698478}]
> 2016-01-25 16:37:10.427 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,897252}] [{old_heap_block_size,0},{heap_block_size,164783480},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,43053480}]
> 2016-01-25 16:37:46.837 [info] <0.10112.4>@riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during active anti-entropy exchange of {959110449498405040071168171470060731649205731328,3} between {959110449498405040071168171470060731649205731328,'riakaws at 172.16.65.8'} and {1004782375664995756265033322492444576013453623296,'riakaws at 172.16.65.13'}
> 2016-01-25 16:37:56.113 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{hashtree,maybe_flush_buffer,1}},{message_queue_len,1132569}] [{old_heap_block_size,0},{heap_block_size,197740176},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,54503137}]
> 2016-01-25 16:38:29.550 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,1313878}] [{old_heap_block_size,0},{heap_block_size,237288211},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,62782512}]
>
>
> erlang.log.1:
> ===== ALIVE Mon Jan 25 16:34:01 UTC 2016
>
> ===== Mon Jan 25 16:39:55 UTC 2016
> [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
> [os_mon] memory supervisor port (memsup): Erlang has closed
>
> Crash dump was written to: /var/log/riak/erl_crash.dump
> eheap_alloc: Cannot allocate 2277966824 bytes of memory (of type "heap").
>
>
>



