Riak CS: avoiding RAM overflow and OOM killer

Daniel Miller dmiller at dimagi.com
Mon Nov 21 15:45:13 EST 2016


I found a similar question from over a year ago (
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-July/017327.html),
and it sounds like leveldb is the way to go, although possibly not well
tested. Has anything changed in Basho's (or anyone else's) experience with
using the leveldb backend instead of the multi backend for CS?

On Fri, Nov 4, 2016 at 11:48 AM, Daniel Miller <dmiller at dimagi.com> wrote:

> Hi,
>
> I have a Riak CS cluster up and running, and am anticipating exponential
> growth in the number of key/value pairs over the next few years. From
> reading the documentation and experience, I've concluded that the default
> configuration of CS (with riak_cs_kv_multi_backend) keeps all keys in RAM.
> The OOM killer strikes when Riak uses too much RAM, which is not good for
> my sanity or sleep. Because of the amount of growth I am anticipating, it
> seems unlikely that I can allocate enough RAM to keep up with the load.
> Disk, on the other hand, is less constrained.
>
> A little background on the data set: I have a sparsely accessed key set.
> By that I mean after a key is written, the more time passes with that key
> not being accessed, the less likely it is to be accessed any time soon. At
> any given time, most keys will be dormant. However, any given key *could*
> be accessed at any time, so it should be possible to retrieve it.
>
> I am currently running a smaller cluster (with smaller nodes: less RAM,
> smaller disks) than I expect to use eventually. I am starting to hit some
> growth-related issues that are prompting me to explore more options before
> it becomes a dire situation.
>
> My question: Are there ways to tune Riak (CS) to support this scenario
> gracefully? That is, are there ways to make Riak not load all keys into
> RAM? It looks like leveldb is just what I want, but I'm a little nervous
> about switching over to leveldb only when the default/recommended config
> uses the multi backend.
>
> As a stop-gap measure, I enabled swap (with swappiness = 0), which I
> anticipated would kill performance, but was pleasantly surprised to see it
> return to effectively no-swap performance levels after a short period of
> lower performance. I'm guessing this is not a good long-term solution as my
> dataset grows. The problem with using large amounts of swap is that each
> time Riak starts it needs to read all keys into RAM. Long term, as our
> dataset grows, the amount of time needed to read keys into RAM will cause a
> very long restart time (and thus period of unavailability), which could
> endanger availability for a prolonged period if multiple nodes go down at
> once.
>
> Thanks!
> Daniel Miller
> Dimagi, Inc.
>
>
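To put rough numbers on the RAM concern quoted above, here is a minimal
sketch of the arithmetic, assuming every key (times the replication factor)
costs a fixed in-memory overhead plus the length of its bucket and key name.
The function names, the per-key overhead, the key length, and the node count
are all hypothetical placeholders for illustration, not measured values from
my cluster or figures confirmed by Basho.

```python
# Rough estimate of the RAM needed when every key is indexed in memory.
# All inputs are assumptions supplied by the caller, not measured values.

def keydir_ram_bytes(num_keys, avg_key_len, per_key_overhead, n_val=3):
    """Cluster-wide bytes of RAM needed to index num_keys objects.

    avg_key_len      -- assumed average bucket-plus-key length in bytes
    per_key_overhead -- assumed fixed in-memory cost per key in bytes
    n_val            -- number of replicas stored for each object
    """
    return int(num_keys * n_val * (per_key_overhead + avg_key_len))


def ram_per_node_gib(num_keys, avg_key_len, per_key_overhead, nodes, n_val=3):
    """Same estimate, spread evenly over `nodes` and expressed in GiB."""
    total = keydir_ram_bytes(num_keys, avg_key_len, per_key_overhead, n_val)
    return total / nodes / 2**30


if __name__ == "__main__":
    # Hypothetical growth scenario: 50-byte overhead, 40-byte keys, 5 nodes.
    for keys in (10**8, 10**9, 10**10):
        gib = ram_per_node_gib(keys, 40, 50, nodes=5)
        print(f"{keys:>14,} keys -> {gib:8.1f} GiB per node")
```

The estimate grows linearly with the key count, so under these assumptions
exponential growth in keys means exponential growth in RAM, which is why a
backend that does not need every key in memory looks attractive.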