High CPU usage followed by a node crash in riak 2.1.1

Zuzana Zatrochova zatrochova at gmail.com
Mon Jan 11 03:14:56 EST 2016


Hi,

I have a cluster of five Riak 2.1.1 nodes, each on a separate virtual machine.
I run an experiment with a load of 20,000 requests per minute. In the
experiment, a separate Java application contacts an Erlang interceptor
application through a RabbitMQ broker that runs on Erlang R15. (Riak 2.1.1
runs on the Basho-patched version of Erlang R16.) The interceptor application
then uses the Riak client to send requests to a Riak node. After 5 minutes of
the experiment, a single random Riak node crashes due to 100% CPU usage.
However, I observe 100% CPU usage only on that one virtual machine; the other
Riak nodes have CPU usage of 40-60%. Interestingly, if I run the same
experiment on virtual machines with Riak 1.4.8, it runs fine. (Riak 1.4.8
runs on the same Erlang version as the RabbitMQ server, R15.) Any idea what
could cause my problems?

I also tried building RabbitMQ, Riak 2.1.1, and my interceptor application
with Erlang R16B (I built RabbitMQ with these instructions:
http://blog.eriksen.com.br/en/how-install-rabbitmq-latest-erlang-release-debian
), and the problem remained, so I don't think my problem is caused by the
different Erlang versions.

I can also observe that there are 900-1100 processes on each node on
average, based on the output of

Processes = [erlang:process_info(Pid) || Pid <- erlang:processes()],
NumberOfProcesses = length(Processes).

But on the overloaded node, the number of processes increases to about 1500
and then the node crashes. After the crash, only 4 nodes remain in the
cluster. Again, one of them becomes overloaded and crashes after a while.
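
For anyone who wants to sample the same numbers, here is a minimal sketch.
The module name proc_sample and its functions are hypothetical helpers, not
part of Riak. It reads the process count with
erlang:system_info(process_count) and lists the processes with the longest
message queues. That long message queues are where the extra load shows up is
only an assumption, not something I have confirmed.

```erlang
%% Hypothetical helper module (not part of Riak): samples the process count
%% and lists the processes with the longest message queues.
-module(proc_sample).
-export([count/0, top_queues/1]).

%% Number of processes on the local node, without collecting the full
%% process_info of every process first.
count() ->
    erlang:system_info(process_count).

%% The N processes with the longest message queues, returned as
%% {QueueLen, Pid, CurrentFunction} tuples, longest queue first.
%% Processes that exit during the scan return 'undefined' from
%% process_info/2 and are skipped by the pattern in the generator.
top_queues(N) ->
    Infos = [{Len, Pid, Fun}
             || Pid <- erlang:processes(),
                [{message_queue_len, Len}, {current_function, Fun}] <-
                    [erlang:process_info(Pid, [message_queue_len,
                                               current_function])]],
    lists:sublist(lists:reverse(lists:sort(Infos)), N).
```

After compiling the module on a node, proc_sample:count() and
proc_sample:top_queues(10) could be called from the node's Erlang shell at
intervals to see which processes grow before the crash.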

Thanks for any help,

Regards,
Zuzana.