Riak Cannot allocate bytes of memory (of type "heap")

Sakoulas, Byron ByronSakoulas at catholichealth.net
Mon Jan 25 13:10:17 EST 2016


We are running an 8-node Riak cluster on AWS, and our nodes are consistently crashing with the error: Cannot allocate x bytes of memory (of type "heap").

Here are some of the specs for our env:

8 nodes, running on m3.large instances
LevelDB with 50% allocated
Solr with 2 GB
We use only immutable and CRDT data
We have a custom search schema
System config matches Basho recommendations
CentOS 7
Riak 2.0.2
Riak Java client 2.0.0
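
For context, the memory left for the Erlang VM and the OS under these settings can be estimated roughly. This is only a sketch under assumptions not stated above: that an m3.large has 7.5 GiB of RAM, that "50% allocated" means LevelDB may use up to half of system memory, and that "2Gig" is the Solr JVM heap size. The function name is hypothetical.

```python
def remaining_memory_gib(total_gib: float, leveldb_fraction: float, solr_heap_gib: float) -> float:
    """Memory (GiB) left after the LevelDB budget and the Solr JVM heap.

    Ignores JVM overhead beyond the heap and the OS page cache, so the
    real headroom is likely smaller than this figure.
    """
    leveldb_budget = total_gib * leveldb_fraction
    return total_gib - leveldb_budget - solr_heap_gib


if __name__ == "__main__":
    # Assumed values: 7.5 GiB RAM (m3.large), 50% for LevelDB, 2 GiB Solr heap.
    headroom = remaining_memory_gib(7.5, 0.50, 2.0)
    print(f"Approximate headroom for the Erlang VM and OS: {headroom:.2f} GiB")
```

If those assumptions hold, a single Erlang process heap growing into the gigabyte range would be competing for well under 2 GiB.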

Below is the console log leading up to the crash. I have also attached the erl_crash.dump file. Any help is greatly appreciated.

2016-01-25 16:34:16.822 [info] <0.2681.4>@riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during active anti-entropy exchange of {707914855582156101004909840846949587645842325504,3} between {730750818665451459101842416358141509827966271488,'riakaws@172.16.65.8'} and {753586781748746817198774991869333432010090217472,'riakaws@172.16.65.12'}
2016-01-25 16:34:56.867 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,180682}] [{old_heap_block_size,0},{heap_block_size,22177879},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,8755966}]
2016-01-25 16:35:00.231 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,203278}] [{old_heap_block_size,0},{heap_block_size,26613454},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,9839470}]
2016-01-25 16:35:08.857 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,256704}] [{old_heap_block_size,0},{heap_block_size,31936144},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,12371527}]
2016-01-25 16:35:15.731 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,299047}] [{old_heap_block_size,0},{heap_block_size,38323372},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,14501169}]
2016-01-25 16:35:21.285 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,330848}] [{old_heap_block_size,0},{heap_block_size,45988046},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,16029792}]
2016-01-25 16:35:36.034 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,382846}] [{old_heap_block_size,0},{heap_block_size,55185655},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,18521726}]
2016-01-25 16:35:49.409 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,455689}] [{old_heap_block_size,0},{heap_block_size,66222786},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,21841438}]
2016-01-25 16:35:59.878 [info] <0.71.0> alarm_handler: {set,{process_memory_high_watermark,<0.1369.0>}}
2016-01-25 16:36:00.267 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,515497}] [{old_heap_block_size,0},{heap_block_size,79467343},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,24737674}]
2016-01-25 16:36:08.497 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{hashtree,should_insert,3}},{message_queue_len,560639}] [{old_heap_block_size,0},{heap_block_size,95360811},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,26973030}]
2016-01-25 16:36:34.806 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,691363}] [{old_heap_block_size,0},{heap_block_size,114432973},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,33336504}]
2016-01-25 16:36:55.523 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,809402}] [{old_heap_block_size,0},{heap_block_size,137319567},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,38698478}]
2016-01-25 16:37:10.427 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,897252}] [{old_heap_block_size,0},{heap_block_size,164783480},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,43053480}]
2016-01-25 16:37:46.837 [info] <0.10112.4>@riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during active anti-entropy exchange of {959110449498405040071168171470060731649205731328,3} between {959110449498405040071168171470060731649205731328,'riakaws@172.16.65.8'} and {1004782375664995756265033322492444576013453623296,'riakaws@172.16.65.13'}
2016-01-25 16:37:56.113 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{hashtree,maybe_flush_buffer,1}},{message_queue_len,1132569}] [{old_heap_block_size,0},{heap_block_size,197740176},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,54503137}]
2016-01-25 16:38:29.550 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,1313878}] [{old_heap_block_size,0},{heap_block_size,237288211},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,62782512}]
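
The large_heap entries above all refer to the same process, <0.1369.0> (started from yz_index_hashtree), and its message queue keeps growing. A small script along these lines could tabulate that growth from console.log. It is a sketch only: the regular expression is written against the line format shown above, the names HeapSample, parse_large_heap and queue_growth_per_second are hypothetical, and the sample text is just the first and last large_heap lines from this log.

```python
import re
from datetime import datetime
from typing import List, NamedTuple


class HeapSample(NamedTuple):
    when: datetime
    pid: str
    queue_len: int
    heap_block_words: int


# Matches the riak_core_sysmon_handler "monitor large_heap" lines shown above.
# "\{heap_block_size" cannot match "{old_heap_block_size" because the brace
# must be immediately followed by "heap".
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r".*?monitor large_heap (<[\d.]+>)"
    r".*?\{message_queue_len,(\d+)\}"
    r".*?\{heap_block_size,(\d+)\}"
)

# First and last large_heap entries copied from the console log.
LOG = """\
2016-01-25 16:34:56.867 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,180682}] [{old_heap_block_size,0},{heap_block_size,22177879},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,8755966}]
2016-01-25 16:38:29.550 [info] <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.1369.0> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,1313878}] [{old_heap_block_size,0},{heap_block_size,237288211},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,62782512}]
"""


def parse_large_heap(text: str) -> List[HeapSample]:
    """Extract timestamp, pid, queue length and heap block size per line."""
    samples = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue  # not a large_heap line
        stamp, pid, queue_len, heap_words = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        samples.append(HeapSample(when, pid, int(queue_len), int(heap_words)))
    return samples


def queue_growth_per_second(samples: List[HeapSample]) -> float:
    """Average message-queue growth between the first and last sample."""
    first, last = samples[0], samples[-1]
    elapsed = (last.when - first.when).total_seconds()
    return (last.queue_len - first.queue_len) / elapsed


if __name__ == "__main__":
    rows = parse_large_heap(LOG)
    for row in rows:
        print(row.when.time(), row.pid, row.queue_len, row.heap_block_words)
    print(f"queue growth: {queue_growth_per_second(rows):.0f} messages/s")
```

Feeding the whole console.log into parse_large_heap instead of the two sample lines would give the full series.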


erlang.log.1:
===== ALIVE Mon Jan 25 16:34:01 UTC 2016

===== Mon Jan 25 16:39:55 UTC 2016
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
[os_mon] memory supervisor port (memsup): Erlang has closed

Crash dump was written to: /var/log/riak/erl_crash.dump
eheap_alloc: Cannot allocate 2277966824 bytes of memory (of type "heap").
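
The size of the failed allocation appears to line up with the heap growth of <0.1369.0> in the console log. The check below is a sketch that assumes a 64-bit Erlang VM (8 bytes per word) and that the sysmon heap_block_size values are reported in words. Under those assumptions the failed request is about 1.2 times the last reported heap block, the same ratio as between the two previous entries.

```python
WORD_BYTES = 8  # assumption: 64-bit Erlang VM, 8 bytes per machine word

failed_alloc_bytes = 2277966824    # from the eheap_alloc error in erlang.log.1
last_heap_block_words = 237288211  # large_heap entry at 16:38:29.550
prev_heap_block_words = 197740176  # large_heap entry at 16:37:56.113

# Convert the failed request to words so it is comparable to the sysmon figures.
failed_alloc_words = failed_alloc_bytes // WORD_BYTES

# Ratio of the failed request to the last heap block, and of the last
# heap block to the one before it.
failed_ratio = failed_alloc_words / last_heap_block_words
previous_ratio = last_heap_block_words / prev_heap_block_words

if __name__ == "__main__":
    print(f"failed request: {failed_alloc_words} words "
          f"({failed_alloc_bytes / 1024 ** 3:.2f} GiB)")
    print(f"failed request / last heap block: {failed_ratio:.3f}")
    print(f"last heap block / previous heap block: {previous_ratio:.3f}")
```

If that reading is right, the crash is the next heap growth step for the yz_index_hashtree process rather than an unrelated allocation.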



-------------- next part --------------
A non-text attachment was scrubbed...
Name: erl_crash.dump
Type: application/octet-stream
Size: 8358497 bytes
Desc: erl_crash.dump
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20160125/f770a0f6/attachment.dump>

