Simple 3 node test cluster eating all my memory

Simon Hartley Simon.Hartley at williamhill.com
Thu Feb 12 06:44:26 EST 2015


Hi Ciprian,

Thanks for the answer.

According to the Riak doc “Cluster-Capacity-Planning” (http://docs.basho.com/riak/1.4.9/ops/building/planning/cluster/#Ring-Size-Number-of-Partitions):

“The default number of partitions in a Riak cluster is 64. This works for smaller clusters, but if you plan to grow your cluster past 5 nodes it is recommended you consider a larger ring size.”

In our staging and production environments we have 5 nodes, and this will likely grow, so we chose a ring size of 128. We like to keep the same config across all our environments where possible, so we replicated this in the 3-node test environment.

There is nothing in this document to suggest an upper limit on ring size based on the number of nodes (or the capacity of individual nodes). From where in the Riak docs is this spreadsheet referenced?

I can rebuild the cluster with ring size 16 if necessary, but can you explain why the current larger ring size produces the sudden memory spike and subsequent crash?

Thanks,

Simon.

From: Ciprian Manea [mailto:ciprian at basho.com]
Sent: 12 February 2015 11:26
To: Simon Hartley
Cc: riak-users at lists.basho.com
Subject: Re: Simple 3 node test cluster eating all my memory

Hi Simon,

Looking at this problem from another angle, a ring size of 128 is too large for just 3 servers with 4 GB RAM each. For instance, when sizing a cluster with the LevelDB backend, we recommend that customers follow the calculations in this spreadsheet [0].

Filling in the spreadsheet with your system's details (3 nodes, 4 GB RAM), we get a ring size of 16 for the Riak cluster. Would you be able to recreate the cluster with ring size 16 and test again?

[0] https://docs.google.com/spreadsheet/ccc?key=0AnW_U8Qe8NdYdGk2V3Qza0VRNkxyRFNGUVVCV0c3V3c&usp=sharing#gid=0


Thanks,
Ciprian

On Thu, Feb 12, 2015 at 11:51 AM, Simon Hartley <Simon.Hartley at williamhill.com> wrote:
Hi,

We have a simple 3-node test cluster, and we are seeing it fall over under quite modest loads 2-3 times a day, with the affected node reporting out-of-memory problems.

Our basic details are:


•  Riak 1.4.9
•  3 nodes, each being:
   o  A virtual RedHat EL
   o  2 x 2.2GHz CPU
   o  4GB RAM
•  In-memory back end
•  128 segment ring

We have tried to limit the in-memory max memory by the following setting in app.config:

%% Memory Config
{memory_backend, [
             {max_memory, 32}, %% 32 megabytes
             {ttl, 86400}  %% 1 Day in seconds
           ]},
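
For a rough sense of scale: as far as the Riak memory backend documentation describes it, max_memory applies per vnode rather than per node. The sketch below is only back-of-envelope arithmetic under that assumption, and it also assumes the ring's partitions are spread evenly across the nodes. The function name is hypothetical, and the result ignores ring/gossip state and general Erlang VM overhead, so it says nothing on its own about the crash described here.

```python
import math

def memory_backend_ceiling_mb(ring_size, nodes, max_memory_mb):
    """Return (vnodes per node, data ceiling in MB per node).

    Assumes max_memory is enforced per vnode and that partitions are
    distributed evenly, so the busiest node holds ceil(ring_size / nodes).
    """
    vnodes_per_node = math.ceil(ring_size / nodes)
    return vnodes_per_node, vnodes_per_node * max_memory_mb

# Settings from this thread: 128-partition ring, 3 nodes, max_memory of 32.
vnodes, ceiling_mb = memory_backend_ceiling_mb(ring_size=128, nodes=3, max_memory_mb=32)
print(f"{vnodes} vnodes per node, up to {ceiling_mb} MB of backend data per node")
```

With the settings above this works out to 43 vnodes and roughly 1376 MB per node; the same arithmetic for a ring size of 16 gives 6 vnodes and 192 MB.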

The behaviour we are seeing is a sudden and rapid increase in the memory allocated to Riak, up to 100% of available RAM, and then a crash.

This is happening on all 3 nodes (at different times).

When looking through the console.log just prior to the out of memory / crash we see log entries similar to the following appearing:

2015-02-11 16:50:27.356 [info] <0.98.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.238.0> [{name,riak_core_gossip},{initial_call,{riak_core_gossip,init,1}},{almost_current_function,{riak_core_gossip,update_gossip_version,1}},{message_queue_len,78}] [{old_heap_block_size,0},{heap_block_size,47828850},{mbuf_size,0},{stack_size,15},{old_heap_size,0},{heap_size,13944475}]

We can’t see any errors at the corresponding times in the error.log.

We see the following in erlang.log.1:

===== ALIVE Wed Feb 11 17:00:24 GMT 2015
/usr/lib64/riak/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has closed.
Erlang has closed

Crash dump was written to: /var/log/riak/erl_crash.dump
eheap_alloc: Cannot allocate 4454408120 bytes of memory

=====
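
To put the figures from the two log excerpts above into more familiar units, here is a small sketch. It assumes a 64-bit Erlang VM with 8-byte words (the /usr/lib64 path hints at this but does not prove it), that the large_heap sizes are reported in words as the erlang:system_monitor documentation describes, and that "4GB RAM" means 4 GiB. The helper names are made up for illustration.

```python
WORD_BYTES = 8   # assumption: 64-bit Erlang VM, so one word is 8 bytes
MIB = 2 ** 20
GIB = 2 ** 30

def words_to_mib(words, word_bytes=WORD_BYTES):
    """Convert an Erlang heap size given in words to MiB."""
    return words * word_bytes / MIB

def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB

heap_block_words = 47828850      # heap_block_size from the large_heap entry
failed_alloc_bytes = 4454408120  # from the eheap_alloc line
node_ram_bytes = 4 * GIB         # assumption: 4GB RAM taken as 4 GiB

print(f"riak_core_gossip heap block: {words_to_mib(heap_block_words):.1f} MiB")
print(f"failed allocation: {bytes_to_gib(failed_alloc_bytes):.2f} GiB")
print("failed allocation exceeds node RAM:", failed_alloc_bytes > node_ram_bytes)
```

Under those assumptions the riak_core_gossip heap block is about 365 MiB, and the failed allocation is about 4.15 GiB, which is a single request larger than the node's entire RAM.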

Anyone any ideas?

Simon Hartley
Solutions Architect

Email: Simon.Hartley at williamhill.com
Skype: +44 (0)113 397 6747
Skype: sijomons

William Hill Online, St. Johns, Merrion St. Leeds, LS2 8LQ

Confidentiality: The contents of this e-mail and any attachments transmitted with it are intended to be confidential to the intended recipient; and may be privileged or otherwise protected from disclosure. If you are not an intended recipient of this e-mail, do not duplicate or redistribute it by any means. Please delete it and any attachments and notify the sender that you have received it in error. This e-mail is sent by a William Hill PLC group company. The William Hill group companies include, among others, William Hill PLC (registered number 4212563), William Hill Organization Limited (registered number 278208), William Hill US HoldCo Inc, WHG (International) Limited (registered number 99191) and WHG Trading Limited (registered number 101439). Each of William Hill PLC, William Hill Organization Limited is registered in England and Wales and has its registered office at Greenside House, 50 Station Road, Wood Green, London N22 7TP. William Hill U.S. HoldCo, Inc. is 160 Greentree Drive, Suite 101, Dover 19904, Kent, Delaware, United States of America. Each of WHG (International) Limited and WHG Trading Limited is registered in Gibraltar and has its registered office at 6/1 Waterport Place, Gibraltar. Unless specifically indicated otherwise, the contents of this e-mail are subject to contract; and are not an official statement, and do not necessarily represent the views, of William Hill PLC, its subsidiaries or affiliated companies. Please note that neither William Hill PLC, nor its subsidiaries and affiliated companies can accept any responsibility for any viruses contained within this e-mail and it is your responsibility to scan any emails and their attachments. William Hill PLC, its subsidiaries and affiliated companies may monitor e-mail traffic data and also the content of e-mails for effective operation of the e-mail system, or for security, purposes..

_______________________________________________
riak-users mailing list
riak-users at lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
