Load question

Runar Jordahl runar.jordahl at gmail.com
Tue Apr 12 03:22:31 EDT 2011


The questions you raise are important. I would add that for some
scenarios, processing your data locally (in your own client program
rather than inside Riak) could improve performance. In such a setup,
each box would run both Riak and your own software.

The Dynamo paper discusses data locality, and points to two strategies:
“(…) (1) route its request through a generic load balancer that will
select a node based on load information, or (2) use a partition-aware
client library that routes requests directly to the appropriate
coordinator nodes. The advantage of the first approach is that the
client does not have to link any code specific to Dynamo in its
application, whereas the second strategy can achieve lower latency
because it skips a potential forwarding step.”
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
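
To make the difference between the two strategies concrete, here is a
minimal Python sketch. It is only an illustration under stated
assumptions, not Riak's or Dynamo's actual routing code: the class names,
the "bucket/key" hashing scheme, and the round-robin ownership table are
all hypothetical. Plain round-robin also stands in for the load-aware
balancer the paper describes in (1).

```python
import hashlib
from itertools import cycle

RING_SIZE = 2 ** 160  # size of the SHA-1 output space


def ring_position(bucket, key):
    """Hash a bucket/key pair to an integer position on the ring.

    Assumption: hashing the string "bucket/key" is an illustrative choice,
    not the encoding any real store uses.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class RoundRobinRouter:
    """Strategy (1), simplified: the client knows nothing about partitions.

    Any node may receive the request and may have to forward it.
    """

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def node_for(self, bucket, key):
        return next(self._nodes)


class PartitionAwareRouter:
    """Strategy (2): send each request straight to the partition's owner."""

    def __init__(self, nodes, num_partitions=64):
        self.num_partitions = num_partitions
        # Hypothetical ownership table: partition i belongs to
        # nodes[i % len(nodes)]. A real client would have to fetch the
        # current ring state from the cluster instead.
        self.owners = [nodes[i % len(nodes)] for i in range(num_partitions)]

    def node_for(self, bucket, key):
        # Scale the ring position down to a partition index in
        # [0, num_partitions).
        index = ring_position(bucket, key) * self.num_partitions // RING_SIZE
        return self.owners[index]
```

With this sketch, PartitionAwareRouter(nodes).node_for("users", "runar")
returns the same node on every call, so no forwarding step is needed,
while RoundRobinRouter simply cycles through the node list. The cost of
(2) is that the client must keep its ownership table in sync with the
cluster.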

So far, I have not seen any Riak client library using strategy (2).
What I have seen is a lot of discussion about using (generic) load
balancing (1). I am in the process of writing a client library myself,
but the library only supports specifying an IP address / host name to
contact.

It would be helpful if a wiki page (under Best Practices) were created
to discuss various load-balancing configurations. I am also wondering if
a Riak client could use strategy (2), like Dynamo clients can.

Kind regards
Runar Jordahl
http://blog.epigent.com/




More information about the riak-users mailing list