Riak performance problems when LevelDB database grows beyond 16GB

Evan Vigil-McClanahan emcclanahan at basho.com
Thu Oct 18 18:32:54 EDT 2012


Ahoj, Jan,

Our leveldb developer, Matthew, sent along a reply.  Reading it and
reading your last reply, I am at the limits of my ability to suggest
things, other than to note that if you're IO bound, running the disks
in RAID 0 rather than RAID 1 may help.

Please contact me off list if you have any issues getting those files
to Matthew because of their size.

On Wed, Oct 17, 2012 at 5:59 AM,  <Jan.Evangelista at seznam.cz> wrote:
> Hi Evan,
>
> I corrected the setup according to your recommendations:
>
> - vm.swappiness is 0
> - fs is ext4 on software RAID1, mounted with noatime
> - disk scheduler is set to deadline (it was the default)
> - eleveldb max_open_files is set to 200, cache is set to default
>
> (BTW, why is Riak not using the new O_NOATIME open(2) flag?)
>
> I restarted the last test with 3x40G and 1x14G DB, and it was able to sustain 1000 ops/sec for 5 minutes. Then node5 stalled with the call stack described in the original mail, with 1 of 4 cores almost 100% busy. The node was writing 29 MB/s (140 IOPS), with an occasional read (<5 IOPS), and had 252 open LevelDB files. The disk has 869G of free space.
>
> When I looked at the performance graphs 17 hours later, it was still writing at about 29 MB/s (120 IOPS), with the same call stack. The Riak node was busy even after 17 hours without any application requests, and it was not even connected to the rest of the Riak cluster (the node was not listed by erlang:nodes() on the other nodes). I would suspect a bug in LevelDB, but people are using it in production, aren't they?
>
> I intend to retry the test without the software RAID. Any other hints?
>
> Best regards, Jan
>
> ---------- Original message ----------
> From: Evan Vigil-McClanahan
> Date: 12. 10. 2012
> Subject: Re: Re: Riak performance problems when LevelDB database grows beyond 16GB
> Hi there, Jan,
>
> The lsof issue is that max_open_files is per backend (i.e. per vnode),
> iirc, so if you're maxed out you'll see up to vnode count *
> max_open_files open files in total.
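To make that arithmetic concrete, here is a rough worst-case bound on open file handles for one node. The ring size of 64 (Riak's default ring_creation_size) and the cluster size of 4 are hypothetical; substitute your own values:

```shell
# Worst-case LevelDB file handles on one node: each vnode runs its own
# LevelDB instance, and max_open_files applies to each instance separately.
ring_size=64    # hypothetical ring_creation_size
nodes=4         # hypothetical cluster size
max_open=132    # the configured max_open_files limit
vnodes=$((ring_size / nodes))
echo $((vnodes * max_open))   # per-node worst case: 2112
```

On those assumed numbers, the 267 files reported by lsof(1) is well under the real per-node ceiling, even though it exceeds the per-backend limit of 132.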
>
> I think on the second try, you may have set the cache too high. I'd
> drop it back to 8 or 16 MB, and possibly raise max_open_files a bit
> more, but you don't seem to be running into contention at this point.
> There's a RAM cost, so maybe just leave it where it is for now, unless
> you have quite a lot of memory.
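For a sense of that RAM cost: cache_size, like max_open_files, applies per vnode, so the 377487360-byte setting from the app.config quoted below multiplies quickly. The vnode count of 16 here is a hypothetical example:

```shell
# Rough RAM cost of the LevelDB block cache across all vnodes on one node.
cache_bytes=377487360   # cache_size from the app.config quoted below
vnodes=16               # hypothetical vnodes per node
echo $((cache_bytes / 1024 / 1024))                   # MB per vnode: 360
echo $((cache_bytes * vnodes / 1024 / 1024 / 1024))   # whole GB per node: 5
```

With 16 vnodes, that is over 5 GB of block cache alone, which is why 8-16 MB per vnode is the safer starting point.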
>
> Another thing to check is that vm.swappiness is set to 0 and that your
> disk scheduler is set to deadline for spinning disks and noop for
> SSDs.
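For reference, one way to apply those settings on Linux (the device name sda and the mount point /home/riak are assumptions; adjust for your hardware, and note these commands need root):

```shell
# Discourage the kernel from swapping out application memory.
sysctl -w vm.swappiness=0

# Use the deadline I/O scheduler on a spinning disk (sda is hypothetical).
echo deadline > /sys/block/sda/queue/scheduler

# Remount the data filesystem without atime updates.
mount -o remount,noatime /home/riak
```

To persist across reboots, the swappiness setting goes in /etc/sysctl.conf, noatime in the /etc/fstab options for that filesystem, and the scheduler is typically set via the elevator=deadline kernel boot parameter or an init script.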
>
> On Fri, Oct 12, 2012 at 5:02 AM,   wrote:
>>> Can you attach the eleveldb portion of your app.config file?
>>> Configuration problems, especially max_open_files being too low, can
>>> often cause issues like this.
>>>
>>> If it isn't sensitive, the whole app.config and vm.args files are also
>>> often helpful.
>>
>> Hello Evan,
>>
>> thanks for responding.
>>
>> I originally had default LevelDB settings. When the node stalled, I changed it to
>>
>>  {eleveldb, [
>>              {data_root, "/home/riak/leveldb"},
>>              {max_open_files, 132},
>>              {cache_size, 377487360}
>>             ]},
>>
>> on all nodes and restarted them all. The application started out at about
>> 1000 requests/second, dropped below 500 requests/second after about a
>> minute, and the node stalled again after 41 minutes. BTW, according to
>> lsof(1) it had 267 open LevelDB files, which is more than the
>> max_open_files limit of 132 (??).




More information about the riak-users mailing list