riak exiting when number partitions > 128

David Lowell dave at go2ctv.com
Mon Oct 8 22:02:09 EDT 2012


Thanks guys. I had raised the max open files limit at the system level for the 'riak' user to 100k, and confirmed that it had taken effect with "sudo -u riak bash -c 'ulimit -a'". However, it appears you are correct that this system setting does not affect the actual startup of riak when I run riak's /etc/init.d/riak init script.

Digging further reveals that 'su' seems not to be helping. Witness:

$ sudo -u riak bash -c 'ulimit -n'
100000
$ sudo su - riak -c 'ulimit -n'
1024

I've seen weirdness like this with su in the past, and have been phasing it out of my vocabulary.
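
Independent of su or sudo, the limit a running process actually received can be read from /proc. The following is a minimal sketch, assuming Linux; the function name is hypothetical and not part of Riak or its init script.

```shell
# Hypothetical helper: print the soft "Max open files" limit of a running process.
# Assumes Linux, where /proc/<pid>/limits has a row of the form:
#   Max open files            <soft>               <hard>               files
nofile_soft_limit_of_pid() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example: the limit of the current shell.
nofile_soft_limit_of_pid $$
```

Passing the pid of the Riak Erlang VM instead of `$$` would show what the init script really handed to the node (reading another user's /proc entry may need root).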

So, to confirm this was the issue I tweaked riak's init script, replacing

   su - riak -c "$DAEMON $DAEMON_ARGS" || return 2
    
with

  sudo -u riak $DAEMON $DAEMON_ARGS || return 2 

And now riak starts up properly with 512 partitions.
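
For reference, the rule quoted in the original message below (ring_creation_size * max_open_files) can be turned into a rough worst-case descriptor budget. This is only a sketch: the function name is hypothetical, it assumes a single node hosting every partition, and it ignores the descriptors Riak needs for sockets and other files.

```shell
# Hypothetical helper: worst-case LevelDB descriptor budget,
# computed as ring_creation_size * max_open_files.
leveldb_fd_budget() {
    ring_size=$1
    max_open_files=$2
    echo $(( ring_size * max_open_files ))
}

leveldb_fd_budget 512 20   # prints 10240
```

With 512 partitions and max_open_files at 20, the budget is 10240, well above the 1024 that the su invocation reported and comfortably below the 100000 configured for the riak user.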

So now the question becomes: is there some way to get "su" to behave more like "sudo" in this case? Or do we just need to use a custom init script until the stock init script evolves past su?
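
One option, along the lines of what Jeremiah and Alex suggest in the quoted replies below, is to set the limit inside the init script itself, before the daemon is launched. The following is a minimal sketch with a hypothetical helper name, assuming a shell whose ulimit builtin supports -n, -S and -H (bash does). Whether the value then survives the su call depends on the system's PAM configuration, so that part is an assumption to verify on the host.

```shell
# Hypothetical helper: set the open-files limit for this shell and its children.
set_nofile_limit() {
    want=$1
    # Try the soft limit only; this works whenever want <= the hard limit.
    ulimit -Sn "$want" 2>/dev/null && return 0
    # Otherwise try raising hard and soft together; this normally needs root.
    ulimit -n "$want" 2>/dev/null && return 0
    # Fall back to the highest value this process is allowed to use.
    ulimit -Sn "$(ulimit -Hn)"
}
```

In an init script this would be called once, for example `set_nofile_limit 100000`, on the line before the daemon is started.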

Dave

--
Dave Lowell
dave at connectv.com

On Oct 8, 2012, at 6:21 PM, Jeremiah Peschka wrote:

> Like Alex, I set ulimit -n directly in my Riak startup scripts. For my local dev instance it looks like this:
> 
> ### Generic Riak dev version setup
> function riak_dev_start() {
>   # Remember where we started so we can return afterwards
>   local CURRENT="$(pwd)"
> 
>   # Set the open-files limit for every node started from this shell
>   ulimit -n 1024
> 
>   cd ~/Projects/riak/dev
>   echo "Starting riak node 1 on 127.0.0.1"
>   dev1/bin/riak start
>   echo "Starting riak node 2 on 127.0.0.1"
>   dev2/bin/riak start
>   echo "Starting riak node 3 on 127.0.0.1"
>   dev3/bin/riak start
>   echo "Starting riak node 4 on 127.0.0.1"
>   dev4/bin/riak start
> 
>   cd "$CURRENT"
> }
> 
> I'd definitely try bumping ulimit in your riak startup scripts themselves and see if that eliminates the issues that you're running into.
> ---
> Jeremiah Peschka
> Managing Director, Brent Ozar PLF, LLC
> 
> 
> On Mon, Oct 8, 2012 at 6:11 PM, Alexander Sicular <siculars at gmail.com> wrote:
> I don't think you're setting it correctly. I usually set it in the terminal before calling riak start. Or set it system-wide; there are different ways to do it depending on your OS.
> 
> 
> @siculars
> http://siculars.posterous.com
> 
> Sent from my iRotaryPhone
> 
> On Oct 8, 2012, at 21:00, David Lowell <dave at go2ctv.com> wrote:
> 
>> I'm starting to want to move past the default Riak configs, for example, by running with a larger number of partitions than the default 64. However, today when bumping up the "ring_creation_size" config param to 256 or higher Riak started failing soon after startup with messages about "Too many open files". For the record, I'm using the ELevelDB back-end.
>> 
>> I've seen the documentation about the need for ring_creation_size * max_open_files file descriptors with levelDB. I've upped the system open files limit for the riak user to 100k, so I don't think I'm hitting that system limit. So it feels like I'm hitting a limit configured within the application somewhere.
>> 
>> It doesn't feel like changing levelDB's 'max_open_files' configuration is the issue here, as I'm using the default/minimum value of 20 for that parameter. Any other setting would increase open files.
>> 
>> So I could use a pointer here from folks who have been here. I suspect there is something very simple required here. 
>> 
>> Thanks folks!
>> 
>> Dave
>> 
>> ps. For the record, my data set is empty on this host, and for completeness I'm blowing away the ring state when I fiddle with the ring_creation_size parameter.
>> 
>> --
>> Dave Lowell
>> dave at connectv.com
>> 
>> 
>> 2012-10-09 00:50:17.430 [info] <0.7.0> Application riak_kv started on node 'riak@10.0.3.81'
>> 2012-10-09 00:50:17.456 [info] <0.7.0> Application merge_index started on node 'riak@10.0.3.81'
>> 2012-10-09 00:50:17.459 [info] <0.1316.0>@riak_core:wait_for_service:445 Waiting for service riak_kv to start (0 seconds)
>> 2012-10-09 00:50:17.525 [info] <0.1303.0>@riak_core:wait_for_application:419 Wait complete for application riak_kv (0 seconds)
>> 2012-10-09 00:50:37.366 [error] <0.5081.0>@riak_kv_vnode:init:265 Failed to start riak_kv_eleveldb_backend Reason: {db_open,"IO error: /var/data/ctv/riak/leveldb/1427247692705959881058285969449495136382746624000/LOCK: Too many open files"}
>> 2012-10-09 00:50:37.423 [notice] <0.5081.0>@riak:stop:46 "backend module failed to start."
>> 2012-10-09 00:50:37.424 [error] <0.5081.0> CRASH REPORT Process <0.5081.0> with 0 neighbours exited with reason: {db_open,"IO error: /var/data/ctv/riak/leveldb/1427247692705959881058285969449495136382746624000/LOCK: Too many open files"} in gen_fsm:init_it/6 line 371
>> 2012-10-09 00:50:37.429 [info] <0.494.0>@riak_kv_js_vm:terminate:240 Spidermonkey VM (pool: riak_kv_js_hook) host stopping (<0.494.0>)
>> 2012-10-09 00:50:37.673 [error] <0.138.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.5081.0> exit with reason {db_open,"IO error: /var/data/ctv/riak/leveldb/1427247692705959881058285969449495136382746624000/LOCK: Too many open files"} in context child_terminated
>> 2012-10-09 00:50:37.736 [error] <0.153.0> gen_server riak_core_vnode_manager terminated with reason: no match of right hand value {error,{db_open,"IO error: /var/data/ctv/riak/leveldb/1427247692705959881058285969449495136382746624000/LOCK: Too many open files"}} in riak_core_vnode_manager:get_vnode/3 line 489
>> 2012-10-09 00:50:37.799 [error] <0.153.0> CRASH REPORT Process riak_core_vnode_manager with 0 neighbours exited with reason: no match of right hand value {error,{db_open,"IO error: /var/data/ctv/riak/leveldb/1427247692705959881058285969449495136382746624000/LOCK: Too many open files"}} in riak_core_vnode_manager:get_vnode/3 line 489 in gen_server:terminate/6 line 747
>> 2012-10-09 00:50:37.844 [error] <0.136.0> Supervisor riak_core_sup had child riak_core_vnode_manager started with riak_core_vnode_manager:start_link() at <0.153.0> exit with reason no match of right hand value {error,{db_open,"IO error: /var/data/ctv/riak/leveldb/1427247692705959881058285969449495136382746624000/LOCK: Too many open files"}} in riak_core_vnode_manager:get_vnode/3 line 489 in context child_terminated
>> 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
