Simple 3 node test cluster eating all my memory

Ciprian Manea ciprian at basho.com
Fri Feb 13 03:17:14 EST 2015


Hi Simon,

Unfortunately, the bug [0] that Juan Luis mentioned also appears to affect
the 1.4.9 release you are testing against. A fix is already being worked on
and will be available in an upcoming Riak release.

[0] https://github.com/basho/riak_kv/issues/1064

Thank you,
Ciprian

On Thu, Feb 12, 2015 at 3:19 PM, Simon Hartley <
Simon.Hartley at williamhill.com> wrote:

> Brilliant, thanks.
>
>
>
> Is there an equivalent document and spreadsheet (and a link from one to the
> other) for the in-memory backend? Given we are not using LevelDB, I’ve never
> read that part of the docs.
>
>
>
> Given we have limited the memory per in-memory storage instance (i.e. per
> vnode) to 32MB (in the max_memory setting), and that we would expect ~43
> vnodes per server (128 ring size / 3 nodes = ~43 vnodes per server), that
> gives a maximum backend memory usage of 43 * 32MB = 1376MB in normal
> operation.
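
The estimate above can be reproduced with a short sketch. It assumes partitions are spread as evenly as possible across nodes and that max_memory is a per-vnode cap in megabytes, as described in the message; the function name is hypothetical. It bounds only the stored data, not the rest of the vnode or Erlang VM overhead.

```python
import math


def backend_memory_ceiling_mb(ring_size, node_count, max_memory_mb):
    """Return (vnodes on the busiest node, worst-case backend data in MB)."""
    # Partitions rarely divide evenly, so round up for the busiest node.
    vnodes_per_node = math.ceil(ring_size / node_count)
    return vnodes_per_node, vnodes_per_node * max_memory_mb


# Figures from this thread: ring size 128, 3 nodes, max_memory of 32 MB.
vnodes, ceiling_mb = backend_memory_ceiling_mb(
    ring_size=128, node_count=3, max_memory_mb=32
)
print(f"{vnodes} vnodes per node, up to {ceiling_mb} MB of backend data")
```
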
>
>
>
> This is significantly less than the amount of memory we see Riak trying to
> grab. Also the spiking behaviour of the memory usage is still unexplained.
>
>
>
> How do we estimate the memory requirements of the remainder of the vnode
> system (i.e. everything but the storage component)?
>
>
>
> Thanks,
>
>
>
> Simon.
>
>
>
> *From:* Ciprian Manea [mailto:ciprian at basho.com]
> *Sent:* 12 February 2015 12:56
>
> *To:* Simon Hartley
> *Cc:* riak-users at lists.basho.com
> *Subject:* Re: Simple 3 node test cluster eating all my memory
>
>
>
> Hi Simon,
>
>
>
> The spreadsheet is referenced from the LevelDB parameter planning section [0].
>
>
>
> The ring_size defines the number of vnodes (virtual nodes) a Riak cluster
> runs internally. Because each vnode is implemented as an Erlang process,
> a larger ring_size requires more memory from the operating system.
>
>
>
> [0]
> http://docs.basho.com/riak/1.4.12/ops/advanced/backends/leveldb/#Parameter-Planning
>
>
>
>
>
> Regards,
>
> Ciprian
>
>
>
> On Thu, Feb 12, 2015 at 1:44 PM, Simon Hartley <
> Simon.Hartley at williamhill.com> wrote:
>
>  Hi Ciprian,
>
>
>
> Thanks for the answer.
>
>
>
> According to Riak doc “Cluster-Capacity-Planning” (
> http://docs.basho.com/riak/1.4.9/ops/building/planning/cluster/#Ring-Size-Number-of-Partitions
> )
>
>
>
> “The default number of partitions in a Riak cluster is 64. This works for
> smaller clusters, but if you plan to grow your cluster past 5 nodes it is
> recommended you consider a larger ring size.”
>
>
>
> In our staging and production environments we have 5 nodes, and this will
> likely grow, so we chose a ring size of 128. We like to keep the same
> config across all our environments where possible, so we replicated this in
> the 3-node test environment.
>
>
>
> There is nothing in this document to suggest an upper limit on ring size
> based on the number of nodes (or the capacity of individual nodes). Where in
> the Riak docs is this spreadsheet referenced?
>
>
>
> I can rebuild the cluster with ring size 16 if necessary, but can you
> explain why the current larger ring size produces the sudden memory spike
> and subsequent crash?
>
>
>
> Thanks,
>
>
> Simon.
>
>
>
> *From:* Ciprian Manea [mailto:ciprian at basho.com]
> *Sent:* 12 February 2015 11:26
> *To:* Simon Hartley
> *Cc:* riak-users at lists.basho.com
> *Subject:* Re: Simple 3 node test cluster eating all my memory
>
>
>
> Hi Simon,
>
>
>
> Looking at this problem from another angle, a ring size of 128 is too
> large for just 3 servers with 4 GB RAM each. For instance, when sizing a
> cluster with the LevelDB backend, we recommend that customers follow the
> calculations in this spreadsheet [0].
>
>
>
> Filling the above spreadsheet with your system's details (3 nodes, 4 GB
> RAM) we get a ring-size of 16 for the riak cluster. Would you be able to
> recreate the cluster with ring-size 16 and test again?
>
>
>
> [0]
> https://docs.google.com/spreadsheet/ccc?key=0AnW_U8Qe8NdYdGk2V3Qza0VRNkxyRFNGUVVCV0c3V3c&usp=sharing#gid=0
>
>
>
>
>
> Thanks,
>
> Ciprian
>
>
>
> On Thu, Feb 12, 2015 at 11:51 AM, Simon Hartley <
> Simon.Hartley at williamhill.com> wrote:
>
>  Hi,
>
>
>
> We have a simple 3-node test cluster, and we are seeing this cluster
> fall over under quite modest loads, 2-3 times a day, with the failing node
> reporting out-of-memory problems.
>
>
>
> Our basic details are:
>
>
>
> - Riak 1.4.9
> - 3 nodes, each being:
>     - A virtual RedHat EL
>     - 2 x 2.2GHz CPU
>     - 4GB RAM
> - In-memory back end
> - 128 segment ring
>
>
>
> We have tried to cap the memory used by the in-memory backend with the
> following setting in app.config:
>
>
>
> %% Memory Config
> {memory_backend, [
>     {max_memory, 32}, %% 32 megabytes
>     {ttl, 86400}      %% 1 day in seconds
> ]},
>
>
>
> The behaviour we are seeing is a sudden and rapid increase in memory
> allocated to riak, up to 100% available RAM, and then a crash.
>
>
>
> This is happening on all 3 nodes (at different times).
>
>
>
> When looking through the console.log just prior to the out of memory /
> crash we see log entries similar to the following appearing:
>
>
>
> 2015-02-11 16:50:27.356 [info]
> <0.98.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.238.0>
> [{name,riak_core_gossip},{initial_call,{riak_core_gossip,init,1}},{almost_current_function,{riak_core_gossip,update_gossip_version,1}},{message_queue_len,78}]
> [{old_heap_block_size,0},{heap_block_size,47828850},{mbuf_size,0},{stack_size,15},{old_heap_size,0},{heap_size,13944475}]
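
To put the large_heap figures in context, here is a minimal sketch that pulls the size fields out of an entry like the one above and converts them to bytes. Erlang's system monitor reports these sizes in words; the 8-byte word is an assumption (a 64-bit VM, which the /usr/lib64 path in the logs suggests). The helper name and the abridged sample string are illustrative only.

```python
import re

WORD_BYTES = 8  # assumption: 64-bit Erlang VM, so one word is 8 bytes

# Abridged copy of the console.log entry quoted above.
LOG_ENTRY = (
    "monitor large_heap <0.238.0> "
    "[{name,riak_core_gossip},{message_queue_len,78}] "
    "[{old_heap_block_size,0},{heap_block_size,47828850},{mbuf_size,0},"
    "{stack_size,15},{old_heap_size,0},{heap_size,13944475}]"
)


def heap_fields_in_bytes(entry, word_bytes=WORD_BYTES):
    """Return every {..._size,N} pair in the entry, converted from words to bytes."""
    pairs = re.findall(r"\{(\w+_size),(\d+)\}", entry)
    return {name: int(words) * word_bytes for name, words in pairs}


sizes = heap_fields_in_bytes(LOG_ENTRY)
for name, nbytes in sizes.items():
    print(f"{name:20s} {nbytes / 2**20:8.1f} MiB")
```

Under that assumption, the heap block held by the riak_core_gossip process alone is several hundred megabytes, which is none of the memory governed by max_memory.
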
>
>
>
> We can’t see any errors at the appropriate times in the error.log
>
>
>
> We see the following in erlang.log.1:
>
>
>
> ===== ALIVE Wed Feb 11 17:00:24 GMT 2015
>
> /usr/lib64/riak/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has closed.
>
> Erlang has closed
>
>
>
> Crash dump was written to: /var/log/riak/erl_crash.dump
>
> eheap_alloc: Cannot allocate 4454408120 bytes of memory
>
>
>
> =====
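
As a quick sanity check on the eheap_alloc line above, the sketch below compares the failed request with the node's RAM. Treating "4GB RAM" as 4 GiB is an assumption, and the variable names are illustrative.

```python
FAILED_ALLOC_BYTES = 4454408120  # from the eheap_alloc line in erlang.log.1
NODE_RAM_BYTES = 4 * 2**30       # assumption: "4GB RAM" means 4 GiB

failed_gib = FAILED_ALLOC_BYTES / 2**30
exceeds_ram = FAILED_ALLOC_BYTES > NODE_RAM_BYTES
print(f"requested {failed_gib:.2f} GiB in one allocation; "
      f"exceeds node RAM: {exceeds_ram}")
```

A single heap allocation larger than the whole machine would be consistent with one Erlang process heap growing out of control rather than with the backend's stored data, though the crash dump would be needed to confirm which process it was.
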
>
>
>
> Does anyone have any ideas?
>
>
>
> *Simon Hartley*
>
> Solutions Architect
>
>
>
> Email: Simon.Hartley at williamhill.com
>
> Skype: *+44 (0)113 397 6747*
>
> Skype: *sijomons*
>
>
>
> *William Hill Online*, St. Johns, Merrion St. Leeds, LS2 8LQ
>
>
>
>
> Confidentiality: The contents of this e-mail and any attachments
> transmitted with it are intended to be confidential to the intended
> recipient; and may be privileged or otherwise protected from disclosure. If
> you are not an intended recipient of this e-mail, do not duplicate or
> redistribute it by any means. Please delete it and any attachments and
> notify the sender that you have received it in error. This e-mail is sent
> by a William Hill PLC group company. The William Hill group companies
> include, among others, William Hill PLC (registered number 4212563),
> William Hill Organization Limited (registered number 278208), William Hill
> US HoldCo Inc, WHG (International) Limited (registered number 99191) and
> WHG Trading Limited (registered number 101439). Each of William Hill PLC,
> William Hill Organization Limited is registered in England and Wales and
> has its registered office at Greenside House, 50 Station Road, Wood Green,
> London N22 7TP. William Hill U.S. HoldCo, Inc. is 160 Greentree Drive,
> Suite 101, Dover 19904, Kent, Delaware, United States of America. Each of
> WHG (International) Limited and WHG Trading Limited is registered in
> Gibraltar and has its registered office at 6/1 Waterport Place, Gibraltar.
> Unless specifically indicated otherwise, the contents of this e-mail are
> subject to contract; and are not an official statement, and do not
> necessarily represent the views, of William Hill PLC, its subsidiaries or
> affiliated companies. Please note that neither William Hill PLC, nor its
> subsidiaries and affiliated companies can accept any responsibility for any
> viruses contained within this e-mail and it is your responsibility to scan
> any emails and their attachments. William Hill PLC, its subsidiaries and
> affiliated companies may monitor e-mail traffic data and also the content
> of e-mails for effective operation of the e-mail system, or for security,
> purposes..
>
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com