erlang pb client pooling + test results
dweldon at gmail.com
Thu Mar 31 19:32:40 EDT 2011
I recently tried to use the Erlang PB client under moderate load (100
qps) and found it failed pretty hard because I was creating a new
connection for each query. While I'm not new to Riak, I have been
using it under fairly light load up until now, so I went searching for
a solution to the problem. The common answer was: use a connection
pool. This wasn't covered on the wiki, so I checked this forum
and found a few connection pool implementations. Most of them
didn't guarantee anything about simultaneous client usage, and most
didn't compile. So I went about writing my own and doing a few
experiments. Below are my findings. Please comment, check out the
code, and feel free to tell me where/if I went wrong. :)
Under moderate load, creating a new connection per request works fine,
provided that you call riakc_pb_socket:stop/1 when you are done with
it. I also created a client pooling application that seems to work but
is not in production yet.
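
For illustration, here is a minimal sketch of the "new connection per
request, always stopped" pattern. The module name per_request and the
helper with_resource/3 are hypothetical names made up for this example;
only riakc_pb_socket:start_link/2 and riakc_pb_socket:stop/1 come from
the client itself. It assumes start_link/2 returns {ok, Pid}.

```erlang
-module(per_request).
-export([with_connection/3, with_resource/3]).

%% Open a PB connection, run Work(Pid), and stop the connection
%% afterwards, even if Work/1 raises.
with_connection(Host, Port, Work) ->
    with_resource(fun() -> riakc_pb_socket:start_link(Host, Port) end,
                  fun riakc_pb_socket:stop/1,
                  Work).

%% Generic form (hypothetical helper): Open() must return {ok, Res}.
%% Close(Res) runs in the 'after' clause, so it is called on both
%% the normal and the exceptional path.
with_resource(Open, Close, Work) ->
    {ok, Res} = Open(),
    try
        Work(Res)
    after
        Close(Res)
    end.
```

A call might look like
per_request:with_connection("127.0.0.1", 8087, fun(Pid) ->
riakc_pb_socket:ping(Pid) end), assuming a local node listening on
the PB port 8087.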
I created a test which queries a local Riak DB (ets backend) at a rate
of 500 qps for 10 seconds. Each query does some random work (80%
chance of a put/update, 20% chance of a delete) on a finite key-space.
After each job is completed the calling process sleeps for 100 ms
before completing. Riak was restarted after each test. I tested three
methods of obtaining a connection for each query:
1) call to riakc_pb_socket:start_link/2 without a subsequent call to
riakc_pb_socket:stop/1
2) call to riakc_pb_socket:start_link/2 with a subsequent call to
riakc_pb_socket:stop/1
3) use riakpool (my pooling application)
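
The workload described above (fixed request rate, 80/20 put/delete
mix, 100 ms sleep per job) can be roughly sketched as follows. This is
not the actual test code: the module name load_sketch, the Job callback
and random_op/1 are assumptions made for this example, and pacing with
timer:sleep/1 only approximates the target rate.

```erlang
-module(load_sketch).
-export([run/3, random_op/1]).

%% Spawn Qps * Seconds workers, roughly Qps per second. Each worker
%% runs Job(), sleeps 100 ms, then reports back. Returns the number
%% of workers whose Job() returned ok.
run(Qps, Seconds, Job) ->
    Parent = self(),
    Total = Qps * Seconds,
    Interval = 1000 div Qps,          % ms between spawns (approximate)
    lists:foreach(
      fun(_) ->
              spawn(fun() ->
                            Result = (catch Job()),
                            timer:sleep(100),
                            Parent ! {done, Result}
                    end),
              timer:sleep(Interval)
      end,
      lists:seq(1, Total)),
    collect(Total, 0).

%% Wait for every worker and count the successful ones.
collect(0, Ok) -> Ok;
collect(N, Ok) ->
    receive
        {done, ok} -> collect(N - 1, Ok + 1);
        {done, _}  -> collect(N - 1, Ok)
    end.

%% Pick a random key from a finite key-space and an operation:
%% put with probability 0.8, delete with probability 0.2.
random_op(KeySpace) ->
    Key = integer_to_binary(rand:uniform(KeySpace)),
    case rand:uniform(10) =< 8 of
        true  -> {put, Key};
        false -> {delete, Key}
    end.
```

In a real run, Job would obtain a connection using one of the three
methods and apply the result of random_op/1 with riakc_pb_socket:put/2
or riakc_pb_socket:delete/3.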
Riakpool maintains a queue of connection pids as the state in a
gen_server. Pids are checked in and out by calling clients so as to
prevent simultaneous use. If no connections exist in the queue, a new
one is created. All connections are supervised by a simple_one_for_one
supervisor. Pids are explicitly checked for liveness before being
checked out.
You can find the project here:
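
Separately from the project itself, the check-out/check-in design
described above can be sketched in a few lines. This is not the
riakpool source: the module name pool_sketch, the Connect callback and
idle/1 are assumptions made for this example, and the
simple_one_for_one supervisor is left out to keep it short.

```erlang
-module(pool_sketch).
-behaviour(gen_server).
-export([start_link/1, checkout/1, checkin/2, idle/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Connect is a zero-arity fun returning {ok, Pid} | {error, Reason}.
start_link(Connect) -> gen_server:start_link(?MODULE, Connect, []).

%% Take a connection out of the pool; no one else gets it until
%% it is checked back in.
checkout(Pool) -> gen_server:call(Pool, checkout).
checkin(Pool, Conn) -> gen_server:cast(Pool, {checkin, Conn}).

%% Number of idle connections currently in the queue.
idle(Pool) -> gen_server:call(Pool, idle).

init(Connect) -> {ok, {Connect, queue:new()}}.

handle_call(checkout, _From, {Connect, Idle}) ->
    {Reply, Idle1} = take_live(Connect, Idle),
    {reply, Reply, {Connect, Idle1}};
handle_call(idle, _From, {_Connect, Idle} = State) ->
    {reply, queue:len(Idle), State}.

handle_cast({checkin, Conn}, {Connect, Idle}) ->
    {noreply, {Connect, queue:in(Conn, Idle)}}.

handle_info(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% Pop idle pids until a live one is found; dead pids are dropped.
%% If the queue is empty, create a new connection.
take_live(Connect, Idle) ->
    case queue:out(Idle) of
        {{value, Conn}, Rest} ->
            case is_process_alive(Conn) of
                true  -> {{ok, Conn}, Rest};
                false -> take_live(Connect, Rest)
            end;
        {empty, Empty} ->
            {Connect(), Empty}
    end.
```

With the real client, Connect could be something like
fun() -> riakc_pb_socket:start_link("127.0.0.1", 8087) end. Note that
start_link would then link each connection to the pool process, which
is one reason to put connections under a supervisor instead.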
Results for the three methods:
1) Fails almost immediately. I assume this is because the maximum
number of open file handles is reached.
2) Works without error.
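3) Works without error. At the end of the test, there were roughly 50
open connections, which is what we expect from Little's law: 500
requests per second, each held for about 0.1 s (the sleep dominates),
gives 500 x 0.1 = 50 connections in use at any time.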
Bryan's statement here:
seems to be incorrect, although it was probably made prior to the PB
client. Apart from potential vector clock bloat as a result of
changing client ids as mentioned here:
I'm unsure why pooling is really even necessary (at least at the
tested level of load). I think Basho should have some official
position on this on the wiki.
Comments and suggestions are very welcome!