Interesting sawtooth increasing CPU usage on lightly-used Riak cluster on EC2 micro instances. Is this expected?

Eamonn eamonn.obrien-strain at hp.com
Mon May 30 13:52:12 EDT 2011


I have a cluster of six Riak nodes that has been operating for a few 
months on Amazon EC2.  Because this is a development deployment with 
very light usage at present, I have used cheap "micro" instances.

I am using riak_0.14.0-1_amd64.deb with no changes to the default 
app.config except to modify the IP addresses.

If you look at the graph of CPU usage for these instances over the last 
two weeks
   http://eamonn.org/riak/riak-cluster-cpu.png
you see an interesting pattern.

Each node's CPU usage climbs gradually over about two days and then 
suddenly drops slightly, forming a sawtooth pattern.  From a low 
average several months ago, CPU usage has risen slowly and now seems 
to have reached an equilibrium, with three of the nodes at 50% and 
three at 60%.

Most of this usage happens when the cluster is not being used 
externally.  The little spikes you see on the graph are probably the 
actual external access via the REST API.

I assume this activity is caused by the continuous "gossip" between the 
nodes.  Perhaps the different equilibrium CPU percentages are related to 
the share of the data items that each node holds.  Does the sawtooth 
pattern reflect some kind of garbage collection?

The Riak cluster does seem to work correctly with reasonable latency, at 
least under low load.  (I have not yet done any load testing.)

Is this pattern expected?  Is it a sign of some problem with my 
configuration?  Any suggestions for how to tune the cluster to run 
better on EC2 micro instances?  Any suggestions for which metrics to 
use when deciding to scale the cluster dynamically by spinning nodes 
up or down?
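
To make the scaling question concrete, a check along the following lines 
is one possibility.  This is only a sketch: it assumes the /stats HTTP 
endpoint is enabled on each node (riak_kv_stat set to true in 
app.config, default HTTP port 8098) and that it reports 
node_get_fsm_time_95 and node_put_fsm_time_95 in microseconds.  The 
thresholds and the function names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumed stat names from Riak's /stats output (95th-percentile
# request times in microseconds). Verify them against your own nodes.
LATENCY_KEYS = ("node_get_fsm_time_95", "node_put_fsm_time_95")

# Illustrative thresholds only; they are not tuned values.
HIGH_LATENCY_US = 50_000
LOW_LATENCY_US = 5_000


def fetch_stats(host, port=8098):
    """Fetch the stats JSON from one node (hypothetical helper)."""
    with urlopen(f"http://{host}:{port}/stats", timeout=5) as resp:
        return json.load(resp)


def scaling_decision(stats_per_node, high_us=HIGH_LATENCY_US,
                     low_us=LOW_LATENCY_US):
    """Return 'scale_up', 'scale_down', or 'hold' based on the worst
    95th-percentile latency reported by any node."""
    values = [s[k] for s in stats_per_node for k in LATENCY_KEYS if k in s]
    if not values:
        # No latency data at all: do nothing rather than guess.
        return "hold"
    worst = max(values)
    if worst > high_us:
        return "scale_up"
    if worst < low_us:
        return "scale_down"
    return "hold"
```

Usage would be something like 
scaling_decision([fetch_stats(h) for h in hosts]), where hosts is the 
list of node addresses.  The decision is based on request latency 
rather than CPU because, as described above, idle CPU on these 
instances does not track external load.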

Thanks,
__
Eamonn O'Brien-Strain
HP Labs



