Recovering Riak data if it can no longer load in memory
vikramlalit at gmail.com
Wed Jul 13 14:22:41 EDT 2016
Many thanks again, Matthew... this is very educative and helpful...! I'm
now giving a shot with higher performance nodes...
On Tue, Jul 12, 2016 at 4:18 PM, Matthew Von-Maszewski <matthewv at basho.com> wrote:
> You can further reduce memory used by leveldb with the following setting
> in riak.conf:
> leveldb.threads = 5
> The value must be a prime number; the system defaults to 71.
> Many Linux implementations allocate 8 Mbytes of stack per thread, so a
> large thread pool reserves a correspondingly large amount of memory for
> stacks. That is fine on servers with plenty of memory, but it is probably
> part of your problem on a small-memory machine.
> The thread count is high to promote parallelism across vnodes on the same
> server, especially with "entropy = active". So again, this setting is
> sacrificing performance to save memory.
> P.S. You really want 8 CPU cores, 4 as an absolute minimum. And review
> this for more CPU performance info:
> On Jul 12, 2016, at 4:04 PM, Vikram Lalit <vikramlalit at gmail.com> wrote:
> Thanks much Matthew. Yes, the server is low on memory since it's only for
> development right now - I'm using an AWS micro instance, so 1 GB RAM and 1 vCPU.
> Thanks for the tip - let me try moving the manifest file to a larger
> instance and see how that works. More than reducing the memory footprint in
> dev, my concern was around reacting to a possible production scenario
> where the db stops responding due to memory overload. Understood now that
> moving to a larger instance should be possible. Thanks again.
> On Tue, Jul 12, 2016 at 12:26 PM, Matthew Von-Maszewski <
> matthewv at basho.com> wrote:
>> It would be helpful if you described the physical characteristics of the
>> servers: memory size, logical cpu count, etc.
>> Google created leveldb to be highly reliable in the face of crashes. If
>> it is not restarting, that suggests to me that you have a low memory
>> condition that is not able to load leveldb's MANIFEST file. That is easily
>> fixed by moving the dataset to a machine with larger memory.
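Moving the dataset is a file-level copy. A hypothetical sketch of the migration, assuming the default Debian/RPM data path `/var/lib/riak/leveldb` and a destination host `bignode` (both names are illustrative, not from the thread; adjust for your install):

```
# Hypothetical sketch: move a Riak/leveldb dataset to a larger-memory host.
riak stop                                   # ensure nothing is writing to the store
rsync -a /var/lib/riak/leveldb/ bignode:/var/lib/riak/leveldb/
# Install the same Riak version on the new host, then run 'riak start' there.
```

Since leveldb keeps all state on disk, copying the intact data directory to a machine with enough memory to load the MANIFEST is sufficient; no export/import step is needed.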
>> There is also a special flag to reduce Riak's leveldb memory footprint
>> during development work. The setting reduces leveldb performance, but
>> lets you run with less memory.
>> In riak.conf, set:
>> leveldb.limited_developer_mem = true
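Putting the two memory-saving settings from this thread together, a small-machine riak.conf fragment might look like the following (a sketch for a dev box, not production tuning):

```
## riak.conf - reduce leveldb memory use on a small development machine
leveldb.limited_developer_mem = true
leveldb.threads = 5    ## must be a prime number; the default is 71
```

Both settings trade performance for memory, so remove them when moving to properly sized production hardware.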
>> > On Jul 12, 2016, at 11:56 AM, Vikram Lalit <vikramlalit at gmail.com> wrote:
>> > Hi - I've been testing a Riak cluster (of 3 nodes) with an ejabberd
>> messaging cluster in front of it that writes data to the Riak nodes. Whilst
>> load testing the platform (by creating 0.5 million ejabberd users via
>> Tsung), I found that the Riak nodes suddenly crashed. My question is how do
>> we recover from such a situation if it were to occur in production?
>> > To provide further context / details, the leveldb log files storing the
>> data suddenly grew so large that the AWS Riak instances could no longer
>> load them in memory. So we get a core dump if 'riak start' is
>> fired on those instances. I had an n_val = 2, and all 3 nodes went down
>> almost simultaneously, so in such a scenario, we cannot even rely on a 2nd
>> copy of the data. One way, of course, to prevent it in the first place would
>> be to use auto-scaling, but I'm wondering whether there is an ex post facto
>> recovery that can be performed in such a scenario? Is it possible
>> to simply copy the leveldb data to a larger memory instance, or to curtail
>> the data further to allow loading on the same instance?
>> > I'd appreciate any inputs - I'm a tad concerned as to how we
>> could recover from such a situation if it were to happen in production
>> (apart from leveraging auto-scaling as a preventive measure).
>> > Thanks!
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users at lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com