Reusing Riak cluster as API cluster on Amazon EC2

Mark Phillips mark at
Tue Apr 2 14:51:12 EDT 2013

Hi Dev,

This approach *should* work, but a few caveats to be aware of before you go
this route:

* You're relying on the OS to handle CPU/IO task sharing between the two
different services
* Related to the above: depending on the cluster's load and workload, you
could see resource contention that surfaces as end-user issues
* You're coupling the scaling of the API servers to that of the Riak
servers: you might need to add CPU for API-related tasks that Riak
doesn't need
* If an AWS VM crashes (this never happens), you lose both an API
server and a Riak node.
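That last caveat means the API process shouldn't hard-code "localhost" as its only Riak endpoint; it should fall back to the other nodes when the co-located one is down. A minimal sketch of that host-selection logic, assuming a hypothetical is_up() health probe and placeholder node addresses:

```python
def pick_riak_host(nodes, is_up, local="127.0.0.1"):
    """Prefer the co-located Riak node; fall back to the first live remote.

    nodes: list of host addresses, is_up: callable probing node health.
    """
    if local in nodes and is_up(local):
        return local
    for node in nodes:
        if node != local and is_up(node):
            return node
    raise RuntimeError("no live Riak nodes reachable")

# Example: local node is down, so the first live remote node is chosen.
hosts = ["127.0.0.1", "10.0.1.2", "10.0.1.3"]
print(pick_riak_host(hosts, lambda h: h != "127.0.0.1"))  # 10.0.1.2
```

The real client library you use would handle retries and connection pooling; this only shows the fallback idea.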

On a general note, ELB is indeed a finicky beast, and if you can avoid it,
you'll probably be better off. You might investigate using local haproxy if
you're of a mind.
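For reference, a local haproxy in front of the Riak cluster might look like the fragment below. The node IPs are placeholders; 8087 is Riak's default protocol buffers port, so haproxy binds a different local port (8088 here, an arbitrary choice) since the co-located Riak node already owns 8087:

```
listen riak_pb
    bind 127.0.0.1:8088
    mode tcp
    balance leastconn
    option tcp-check
    server riak1 10.0.1.1:8087 check
    server riak2 10.0.1.2:8087 check
    server riak3 10.0.1.3:8087 check
```

The API processes would then point their Riak client at 127.0.0.1:8088 and let haproxy spread requests across live nodes.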

So, the best approach might be to set up a test cluster using the API +
Riak approach and test some scenarios. :)

Hope that helps.


On Sun, Mar 31, 2013 at 8:36 AM, Dev Vasantharajan <dev at> wrote:

> We are looking to deploy a Riak cluster of 5 machines using the Bitcask
> backend. We're storing almost all user-specific data in Riak, as well as
> session data which will be validated on each request.
> Generally speaking, each API request will result in either 4-5 writes to
> Riak or 4-5 reads from Riak, and the net read:write ratio on Riak is
> expected to be around 3:1.
> We are expecting a small number of users initially (say a few 1000), but
> all the components (including Riak) have been chosen to allow eventual
> linear scaling to several more.
> Given this low initial footprint (which will last for a couple of months
> at least), we WOULD however like to deploy a small number of machines
> initially so as to control costs.
> Given that almost all API calls are so heavily dependent on Riak, the
> thinking in our little group is to host the API servers (nginx, uwsgi and
> Python) on the same EC2 instances as the Riak servers. This would (so is
> the hope) let uwsgi and Riak utilize as many of the cores on Amazon as
> required, efficiently, and allow the utilization on each individual
> instance to be maximized, before we scale "out" to additional instances.
> The older design was to have both the API cluster and Riak cluster behind
> their own individual ELBs, and scale both independently, automatically
> using Amazon's auto-scaling. The API servers would talk to Riak through
> the ELB. This would have worked too, but the latency from the API servers
> to Riak via ELB could have been an issue. When we have both API servers and
> Riak on the same instances *instead*, the Riak read/write requests would
> go to "localhost", with Riak gossip doing the rest of the work, thus
> cutting out the ELB and its latency completely (from the internal network
> logic).
> Overall, this seems like a win-win. We get reasonably low-latency with
> localhost Riak, we get maximal utilization of Amazon EC2 instances even
> with few users, and scale out reasonably well when additional users come on
> board.
> *BUT, are there any caveats we are ignoring here?* All of this just seems
> a little good to be true, and I want to make sure we are taking care of as
> many variables as possible.
> Thanks!
> Regards.
> Dev
> _______________________________________________
> riak-users mailing list
> riak-users at
