Riak 1.4.2 10G Ethernet Performance Problems

Evan Vigil-McClanahan emcclanahan at basho.com
Wed Jun 18 20:32:35 EDT 2014

Hi Earl,

There are some known internode bottlenecks in Riak 1.4.x.  We've
addressed some of them in 2.0, but others likely remain.  If you're
willing to run some code at the console (from `riak attach`), the
following should tell you whether or not the 2.0 changes are likely
to help you.  It raises the send and receive buffer sizes on the
local node's distribution sockets.  I am not sure when 2.0-ready
versions of CS are slated for release, however.

[inet:setopts(Port, [{sndbuf, 393216}, {recbuf, 786432}])
  || {_Node, Port} <- erlang:system_info(dist_ctrl)].

Or, to run this on all nodes (which you'll have to do to see if it helps):

%% Set the buffer sizes on every distribution port of the calling node.
FF = fun() ->
         [inet:setopts(Port, [{sndbuf, 393216}, {recbuf, 786432}])
          || {_Node, Port} <- erlang:system_info(dist_ctrl)]
     end,
rpc:multicall(erlang, apply, [FF, []]).
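
To confirm that the new sizes took effect, you can read them back from the
same console.  This is only a sketch, assuming the distribution entries are
ordinary TCP ports (the default); the kernel may round or cap the values you
asked for, so the numbers reported need not match exactly.

```erlang
%% Read back the send/receive buffer sizes on each distribution port
%% of the local node. Each element is {Node, {ok, Opts}} or
%% {Node, {error, Reason}}.
[{Node, inet:getopts(Port, [sndbuf, recbuf])}
 || {Node, Port} <- erlang:system_info(dist_ctrl)].
```

Wrapping the same comprehension in a fun and passing it to rpc:multicall, as
above, should report the values for every node.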

You should not run any of this on production machines without
extensive testing first.  Also, if you have very large objects, as in
a CS cluster, it may help to increase the buffer sizes further.

Note that increasing +zdbbl in your vm.args can also help somewhat, if
it isn't already prohibitively large.
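
For reference, +zdbbl sets the distribution buffer busy limit and is given in
kilobytes; the Erlang VM default is 1024.  A vm.args entry looks like the
fragment below.  The value shown is illustrative only, not a recommendation
for your cluster, and the node has to be restarted for it to take effect.

```
## vm.args: distribution buffer busy limit, in kilobytes
## (32768 is an example value; tune and test for your own workload)
+zdbbl 32768
```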

Hope that this helps.  Let us know what you find.


On Wed, Jun 18, 2014 at 4:57 PM, Earl Ruby <earl_ruby at xyratex.com> wrote:
> Chris Read:
> Back in 2013 you reported a performance problem with Riak 1.4.2 running on a
> 10GbE network where Riak would never hit speeds faster than 2.5Gbps on the
> network.
> I'm seeing the same thing with Riak 1.4.2 and RiakCS. I've followed all of
> the tuning suggestions, my MTU is set to 9000 on the ethernet interfaces, I
> have one 10GbE network just for the backend inter-node data and one 10GbE
> "public" network where RiakCS listens for connections and which basho_bench
> uses to generate the load. I have 1-4 client systems on the public side
> running basho_bench and no matter how much traffic I generate with
> basho_bench I never see more than 3Gbits/s on the network. (It doesn't seem
> to matter if I run 1 or 4 clients, each with 200 concurrent sessions, the
> network data rate is about the same.) I'm running jnettop in two different
> windows during the tests to watch the aggregate network traffic on the
> private inter-node data network and the "public" basho_bench
> traffic-generating network.
> I've tested the network with iperf3 and it shows 9.92Gbits/s throughput with
> a TCP maximum segment size of 9000.
> I've tested the filesystems on each of the 6 Riak nodes using fio, and I can
> write to the filesystems at ~12.8Gbits/s, so the filesystem is not the
> bottleneck. Each node has 128GB RAM and is running the bitcask backend. The
> servers are mostly idle.
> I tried Sean's solution of increasing these values to:
> {riak_core, [
>     {handoff_batch_threshold, 4194304},
>     {handoff_concurrency, 10} ]}
> ... as described in
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-October/013787.html,
> but that had no effect.
> With my current hardware I'd expect that the 10GbE network would be the
> bottleneck, and I'd expect write speeds to top out at the top end of the
> network speed.
> There was no follow-up message on the mailing list to indicate how or if
> you'd solved the problem. Did you find a solution?
> (Please direct replies to the mailing list.)
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
