Have Riak servers in separate cluster behind a load balancer, or on same machines as web server?

O'Brien-Strain, Eamonn eamonn.obrien-strain at hp.com
Tue Oct 4 17:04:38 EDT 2011

I am contemplating two different architectures for deploying Riak nodes and web servers.

Option A:  Riak nodes are in their own cluster of dedicated machines behind a load balancer.  Web servers talk to the Riak nodes via the load balancer. (See diagram http://eamonn.org/i/riak-arch-A.png )

Option B: Each web server machine also has a Riak node, and there are also some Riak-only machines.  Each web server only talks to its own localhost Riak node. (See diagram http://eamonn.org/i/riak-arch-B.png )

All machines will be deployed as elastic cloud instances.  I will want to spin up and spin down instances, particularly the web servers, as demand varies.  Both load balancers are non-sticky.  Web servers currently talk to Riak via HTTP (though I might change that to protocol buffers in the future).  Riak is currently configured with the default options.

Here is my thinking on the comparative advantages:

Option A:

 - Better for security, because I can lock down the Riak load balancer to open only a single port, and only for connections from the web servers.
 - Less churn of nodes entering and leaving the Riak cluster (as web servers spin up and down)
 - More flexibility in scaling storage and web tiers independently of each other

Option B:

 - Faster localhost connection from web server to Riak

I think availability is similar for the two options.

The web server response time is the primary metric I want to optimize.  Most web server requests will cause several requests to Riak.
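To make that concrete, here is a rough back-of-the-envelope sketch of how the per-request network cost multiplies when one page needs several Riak requests. It assumes the requests are issued sequentially, and it ignores the traffic between Riak nodes inside the cluster, which happens in both options. The function name and all the numbers are placeholders I made up for illustration, not measurements.

```python
def riak_time_per_page(riak_requests, network_rtt_ms, riak_service_ms):
    """Time one page spends waiting on Riak, assuming sequential requests.

    network_rtt_ms:  round trip from web server to the Riak node it talks to
    riak_service_ms: time Riak itself takes to answer one request
    """
    return riak_requests * (network_rtt_ms + riak_service_ms)


# Placeholder numbers only: 5 Riak requests per page, 3 ms service time,
# 1 ms round trip via the load balancer (A) vs 0.1 ms over localhost (B).
option_a = riak_time_per_page(riak_requests=5, network_rtt_ms=1.0, riak_service_ms=3.0)
option_b = riak_time_per_page(riak_requests=5, network_rtt_ms=0.1, riak_service_ms=3.0)

print("Option A: %.1f ms, Option B: %.1f ms" % (option_a, option_b))
```

If the real service time turns out to dominate the round trip, the localhost advantage of Option B shrinks accordingly.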

What other factors should I take into account?  What measurements could I make to help me decide between the architectures?  Are there other architectures I should consider? Should I add memcached? Does anyone have any experiences they could share in deploying such systems?
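One measurement I could imagine making is the latency of a single Riak GET from a web server, once through the load balancer and once against localhost. Below is a minimal sketch of such a timing harness. The helper names are my own, and the URLs in the comment are hypothetical; they assume Riak's HTTP interface on its default port 8098 and a bucket and key that already exist.

```python
import statistics
import time
import urllib.request


def measure_latency_ms(operation, samples=100):
    """Call operation() repeatedly; return (median, 95th percentile) in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        operation()
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95 = timings[min(len(timings) - 1, int(0.95 * len(timings)))]
    return statistics.median(timings), p95


def fetch(url):
    """Perform one HTTP GET and read the whole response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


# Hypothetical usage (host names, bucket and key are assumptions):
#   via_lb = measure_latency_ms(lambda: fetch("http://riak-lb.example:8098/riak/test/key1"))
#   local  = measure_latency_ms(lambda: fetch("http://127.0.0.1:8098/riak/test/key1"))
```

Comparing the two medians, multiplied by the typical number of Riak requests per page, would give a first estimate of how much response time Option B could actually save.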

