Riak CS: avoiding RAM overflow and OOM killer

Daniel Miller dmiller at dimagi.com
Tue Nov 22 09:51:47 EST 2016


Hi Alexander,

Thanks for responding.

> How many nodes?

We currently have 9 nodes in our cluster.

> How much ram per node?

Each node has 4GB of RAM and 4GB of swap. The memory levels (RAM + swap) on
each node are currently between 4GB and 5.5GB.

> How many objects (files)? What is the average file size?

We currently have >30 million objects. I analyzed the average object size
before we migrated data into the cluster; it was about 4KB/object, with
some objects being much larger (multiple MB). Is there an easy way to get
this information from a running cluster so I can give you more accurate
information?
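For what it's worth, here is the back-of-envelope math I've been using to reason
about keydir memory. Bitcask keeps every key in an in-memory keydir, so RAM per
node scales with object count. The per-key static overhead figure (~44.5 bytes on
64-bit, from Basho's capacity-planning docs) and the ~100-byte average key length
are assumptions on my part, not measurements from our cluster:

```python
def bitcask_ram_bytes(num_objects, avg_key_bytes=100, n_val=3,
                      num_nodes=9, static_overhead=44.5):
    """Rough bitcask keydir RAM per node, assuming keys are spread
    evenly across nodes and every replica holds its keys in RAM.

    static_overhead: Basho's documented per-key keydir overhead on
    64-bit systems (an assumption here, not measured).
    avg_key_bytes: hypothetical average Riak CS key length.
    """
    total = (static_overhead + avg_key_bytes) * num_objects * n_val
    return total / num_nodes

# 30 million objects, ~100-byte keys, n_val=3, 9 nodes
per_node = bitcask_ram_bytes(30_000_000)
print(f"{per_node / 1024**3:.1f} GiB per node")  # ~1.3 GiB per node
```

That's just the keydir for current object counts; it excludes Erlang VM overhead
and anything leveldb holds, and it grows linearly with object count, which is why
the projected growth worries me.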


On Tue, Nov 22, 2016 at 2:42 AM, Alexander Sicular <siculars at gmail.com>
wrote:

> Hi Daniel,
>
> How many nodes?
> -You should be using 5 minimum if you are using the default config.
> There are reasons.
>
> How much ram per node?
> -As you noted, in Riak CS, 1MB file chunks are stored in bitcask.
> Their key names and some overhead consume memory.
>
> How many objects (files)? What is the average file size?
> -If your size distribution significantly skews < 1MB, that means you
> will have many files in bitcask eating up RAM.
>
> Kota was a former Basho engineer who worked on CS... That said, Basho
> may not support a non-standard deployment.
>
> -Alexander
>
> On Mon, Nov 21, 2016 at 2:45 PM, Daniel Miller <dmiller at dimagi.com> wrote:
> > I found a similar question from over a year ago
> > (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-July/017327.html),
> > and it sounds like leveldb is the way to go, although possibly not well
> > tested. Has anything changed with regard to Basho's (or anyone else's)
> > experience with using the leveldb backend instead of the multi backend
> > for CS?
> >
> > On Fri, Nov 4, 2016 at 11:48 AM, Daniel Miller <dmiller at dimagi.com>
> wrote:
> >>
> >> Hi,
> >>
> >> I have a Riak CS cluster up and running, and am anticipating exponential
> >> growth in the number of key/value pairs over the next few years. From
> >> reading the documentation and experience, I've concluded that the
> >> default configuration of CS (with riak_cs_kv_multi_backend) keeps all
> >> keys in RAM. The OOM killer strikes when Riak uses too much RAM, which
> >> is not good for my sanity or sleep. Because of the amount of growth I
> >> am anticipating, it seems unlikely that I can allocate enough RAM to
> >> keep up with the load. Disk, on the other hand, is less constrained.
> >>
> >> A little background on the data set: I have a sparsely accessed key
> >> set. By that I mean after a key is written, the more time passes with
> >> that key not being accessed, the less likely it is to be accessed any
> >> time soon. At any given time, most keys will be dormant. However, any
> >> given key _could_ be accessed at any time, so it should be possible to
> >> retrieve it.
> >>
> >> I am currently running a smaller cluster (with smaller nodes: less
> >> RAM, smaller disks) than I expect to use eventually. I am starting to
> >> hit some growth-related issues that are prompting me to explore more
> >> options before it becomes a dire situation.
> >>
> >> My question: Are there ways to tune Riak (CS) to support this scenario
> >> gracefully? That is, are there ways to make Riak not load all keys
> >> into RAM? It looks like leveldb is just what I want, but I'm a little
> >> nervous about switching over to only leveldb when the
> >> default/recommended config uses the multi backend.
> >>
> >> As a stop-gap measure, I enabled swap (with swappiness = 0), which I
> >> anticipated would kill performance, but was pleasantly surprised to
> >> see it return to effectively no-swap performance levels after a short
> >> period of lower performance. I'm guessing this is not a good long-term
> >> solution as my dataset grows. The problem with using large amounts of
> >> swap is that each time Riak starts it needs to read all keys into RAM.
> >> Long term, as our dataset grows, the amount of time needed to read
> >> keys into RAM will cause a very long restart time (and thus period of
> >> unavailability), which could endanger availability for a prolonged
> >> period if multiple nodes go down at once.
> >>
> >> Thanks!
> >> Daniel Miller
> >> Dimagi, Inc.
> >>
> >
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users at lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
>