Riak CS: avoiding RAM overflow and OOM killer

Alexander Sicular siculars at gmail.com
Tue Nov 22 02:42:55 EST 2016


Hi Daniel,

How many nodes?
-You should be using 5 minimum if you are using the default config. There
are reasons.

How much ram per node?
-As you noted, in Riak CS, 1MB file chunks are stored in bitcask.
Their key names and some overhead consume memory.

How many objects (files)? What is the average file size?
-If your size distribution significantly skews < 1MB that means you
will have a bunch of files in bitcask eating up ram.
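
To put rough numbers on the two points above, here is a back-of-the-envelope
sketch in Python. The per-key overhead, the block key length and the
replication factor are my assumptions for illustration, not measured values
from your cluster, so substitute your own. It only counts the 1MB blocks in
bitcask and assumes keys spread evenly across nodes.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB chunks, as noted above

# Assumed values (not from this thread); adjust to your own measurements.
KEYDIR_OVERHEAD_BYTES = 45  # assumed fixed in-memory cost per bitcask key
BLOCK_KEY_BYTES = 60        # assumed bucket + key name length for one block
N_VAL = 3                   # assumed replication factor


def blocks_for_file(size_bytes):
    """Number of 1MB blocks (one bitcask key each) a file is split into."""
    return -(-size_bytes // BLOCK_SIZE)  # integer ceiling division


def total_blocks(size_counts):
    """Total block count for an iterable of (file_size_bytes, file_count)."""
    return sum(count * blocks_for_file(size) for size, count in size_counts)


def keydir_bytes_per_node(size_counts, nodes):
    """Rough RAM per node used by block keys, assuming an even spread."""
    per_key = KEYDIR_OVERHEAD_BYTES + BLOCK_KEY_BYTES
    return total_blocks(size_counts) * per_key * N_VAL / nodes


# Two hypothetical data sets holding the same number of bytes:
large_files = [(100 * BLOCK_SIZE, 1_000)]   # 1,000 files of 100MB
small_files = [(10 * 1024, 10_240_000)]     # 10,240,000 files of 10KB

for name, data in (("large", large_files), ("small", small_files)):
    mb = keydir_bytes_per_node(data, nodes=5) / 1e6
    print(f"{name}: {total_blocks(data)} keys, ~{mb:.1f} MB of key RAM per node")
```

Under these assumptions the small-file data set needs about a hundred times
more key memory for the same bytes on disk, which is why the size
distribution matters as much as the object count.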

Kota is a former Basho engineer who worked on CS... That said, Basho
may not support a non-standard deployment.

-Alexander

On Mon, Nov 21, 2016 at 2:45 PM, Daniel Miller <dmiller at dimagi.com> wrote:
> I found a similar question from over a year ago
> (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-July/017327.html),
> and it sounds like leveldb is the way to go, although possibly not well
> tested. Has anything changed with regard to Basho's (or anyone else's)
> experience with using leveldb backend instead of the multi backend for CS?
>
> On Fri, Nov 4, 2016 at 11:48 AM, Daniel Miller <dmiller at dimagi.com> wrote:
>>
>> Hi,
>>
>> I have a Riak CS cluster up and running, and am anticipating exponential
>> growth in the number of key/value pairs over the next few years. From
>> reading the documentation and experience, I've concluded that the default
>> configuration of CS (with riak_cs_kv_multi_backend) keeps all keys in RAM.
>> The OOM killer strikes when Riak uses too much RAM, which is not good for my
>> sanity or sleep. Because of the amount of growth I am anticipating, it seems
>> unlikely that I can allocate enough RAM to keep up with the load. Disk, on
>> the other hand, is less constrained.
>>
>> A little background on the data set: I have a sparsely accessed key set.
>> By that I mean after a key is written, the more time passes with that key
>> not being accessed, the less likely it is to be accessed any time soon. At
>> any given time, most keys will be dormant. However, any given key _could_ be
>> accessed at any time, so it should be possible to retrieve it.
>>
>> I am currently running a smaller cluster (with smaller nodes: less RAM,
>> smaller disks) than I expect to use eventually. I am starting to hit some
>> growth-related issues that are prompting me to explore more options before
>> it becomes a dire situation.
>>
>> My question: Are there ways to tune Riak (CS) to support this scenario
>> gracefully? That is, are there ways to make Riak not load all keys into RAM?
>> It looks like leveldb is just what I want, but I'm a little nervous
>> switching over to only leveldb when the default/recommended config uses the
>> multi backend.
>>
>> As a stop-gap measure, I enabled swap (with swappiness = 0), which I
>> anticipated would kill performance, but was pleasantly surprised to see it
>> return to effectively no-swap performance levels after a short period of
>> lower performance. I'm guessing this is not a good long-term solution as my
>> dataset grows. The problem with using large amounts of swap is that each
>> time Riak starts it needs to read all keys into RAM. Long term, as our
>> dataset grows, the amount of time needed to read keys into RAM will cause a
>> very long restart time (and thus period of unavailability), which could
>> endanger availability for a prolonged period if multiple nodes go down at
>> once.
>>
>> Thanks!
>> Daniel Miller
>> Dimagi, Inc.
>>