Scaling Riak CS to hundreds or thousands of servers

John Daily jdaily at basho.com
Thu Jul 25 10:56:51 EDT 2013


Good questions, Andre, thanks for reaching out.

> How does Riak CS scale when setting up a few hundred servers in a cluster?

The upper limit on cluster size depends on traffic load, hardware, and network capacity, but anything approaching 100 servers is likely to run into trouble due to inherent limitations in both Riak and the Erlang VM itself. An Erlang cluster is a full mesh: every node keeps a connection to every other node, so the number of connections (and the overhead of maintaining them) grows roughly with the square of the number of servers.
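
As a rough illustration of that growth (a back-of-the-envelope sketch, not a measurement of Riak itself), the number of node-to-node links in a full mesh of n nodes is n*(n-1)/2. The function name below is made up for the example.

```python
def mesh_connections(nodes: int) -> int:
    """Number of pairwise links in a full mesh of `nodes` nodes."""
    return nodes * (nodes - 1) // 2


# Compare a few cluster sizes to see how quickly the link count grows.
for n in (5, 20, 100, 500):
    print(f"{n:>4} nodes -> {mesh_connections(n):>7} connections")
```

Going from 20 to 100 servers multiplies the server count by 5 but the link count by about 26, which is why the overhead becomes noticeable long before the cluster reaches hundreds of nodes.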


> 
> Does it make sense to build a cluster of that size, or is it recommended to have smaller pools of clusters and shard the files over more than one cluster?

You should definitely look at deploying multiple clusters.
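
One possible way to spread files over several clusters is to route on the client side by bucket name. The sketch below is only an assumption about how that could be done; it is not a feature of Riak CS, and the endpoint URLs and the cluster_for helper are hypothetical.

```python
import hashlib

# Hypothetical endpoints, one per independent Riak CS cluster.
CLUSTERS = [
    "http://cs-a.example.com:8080",
    "http://cs-b.example.com:8080",
    "http://cs-c.example.com:8080",
]


def cluster_for(bucket: str, clusters=CLUSTERS) -> str:
    """Pick a cluster for a bucket by hashing the bucket name."""
    digest = hashlib.sha256(bucket.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take it
    # modulo the number of clusters to get a stable index.
    index = int.from_bytes(digest[:8], "big") % len(clusters)
    return clusters[index]


print(cluster_for("video-archive"))
```

Plain modulo hashing like this remaps most buckets whenever the list of clusters changes, so a lookup table or consistent hashing would be a safer choice if clusters are expected to be added later.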


> What about Stanchion as the potential single point of failure, and how can it be prevented from corrupting the system? If I got it right, Stanchion handles the IDs of buckets and some user data. So if no further buckets or users need to be created, there is no need for Stanchion at that time, and uploading and downloading can go on as usual?

You're correct: user accounts and user buckets are the reason we need a consensus system, and thus the reason Stanchion exists. For existing users/buckets, as you indicate, files can be transferred without any involvement from Stanchion.

It is possible to cluster Stanchion using traditional cluster tools.

(Why Amazon chose to use a global namespace for buckets is a mystery beyond my ken.)

> 
> As I want to use Riak CS for large files (50MB-15GB), is it possible to raise the chunk size up to these file sizes, to spare the system heavy network traffic?

Definitely not. Erlang's distribution protocol will behave badly with objects that large.

-John Daily
Technical Evangelist
Basho Technologies




