High CPU usage followed by a node crash in riak 2.1.1

Luke Bakken lbakken at basho.com
Thu Jan 14 16:12:43 EST 2016


Hi Zuzana,

A good place to start would be to look for errors in Riak's
log/error.log file. If you can run "riak-debug" on a crashed node and
share the generated file, that would be helpful in diagnosis.

I am assuming that your Java application and RabbitMQ server do *not*
run on the same servers as your Riak nodes.
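
While collecting those logs, it may also help to check whether one process on
the busy node is accumulating messages. The following is a minimal sketch, not
a confirmed diagnosis: it assumes an Erlang shell attached to the affected node
(for example via "riak attach"), and TopQueues is a hypothetical helper name
introduced here for illustration. It uses only standard erlang and lists
functions.

```erlang
%% Total number of live processes, without building a list of
%% full process_info/1 results for each of them.
ProcessCount = erlang:system_info(process_count).

%% Hypothetical helper: return the N processes with the longest
%% message queues as {QueueLength, Pid} tuples, longest first.
%% process_info/2 returns 'undefined' for a process that has exited
%% in the meantime; the generator pattern below filters those out.
TopQueues = fun(N) ->
    Lens = [{Len, Pid} || Pid <- erlang:processes(),
                          {message_queue_len, Len} <-
                              [erlang:process_info(Pid, message_queue_len)]],
    lists:sublist(lists:reverse(lists:sort(Lens)), N)
end.

%% Show the ten longest queues on this node.
TopQueues(10).
```

If one pid shows a queue length that keeps growing between calls, that process
is a reasonable place to look next; whether it is related to the crash is an
assumption that the logs would need to confirm.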

--
Luke Bakken
Engineer
lbakken at basho.com


On Mon, Jan 11, 2016 at 12:14 AM, Zuzana Zatrochova
<zatrochova at gmail.com> wrote:
> Hi,
>
> I have a cluster of 5 Riak 2.1.1 nodes, each on a separate virtual machine. I
> run an experiment with a load of 20000 requests per minute. In the experiment,
> a separate Java application contacts an Erlang interceptor application through
> a RabbitMQ broker that runs on Erlang version 15. (Riak 2.1.1 runs on the
> Basho-patched version of Erlang 16.) The interceptor application then uses the
> Riak client to send requests to a Riak node. After 5 minutes of the experiment,
> a single random Riak node crashes due to 100% CPU usage. However, I observe
> 100% CPU usage on only one virtual machine running a Riak node; the other Riak
> nodes have CPU usage of 40-60%. Interestingly, if I run the same experiment on
> virtual machines with Riak 1.4.8, it runs fine. (Riak 1.4.8 runs on the same
> Erlang version as the RabbitMQ server, version 15.) Any idea what could cause
> my problems?
>
> I also tried building RabbitMQ, Riak 2.1.1 and my interceptor application
> with Erlang version R16B (I built RabbitMQ following these instructions:
> http://blog.eriksen.com.br/en/how-install-rabbitmq-latest-erlang-release-debian
> ) and the problem remained, so I don’t think my problem is caused by
> differing Erlang versions.
>
> I can also observe that there are on average 900-1100 processes on each
> node, based on the output of:
>
> %% Collect info for every live process, then count the entries.
> Processes = [erlang:process_info(Pid) || Pid <- erlang:processes()],
> NumberOfProcesses = length(Processes).
>
> But on the overloaded node, the number of processes rises to about 1500 and
> then the node crashes. After the crash, only 4 nodes remain in the
> cluster. Again, one of them becomes overloaded and crashes after a
> while.
>
> Thanks for any help,
>
> Regards,
>
> Zuzana.
>



