Connection Pool with Erlang PB Client Necessary?

Joel Meyer joel.meyer at gmail.com
Thu Jul 28 16:55:45 EDT 2011


On Tue, Jul 26, 2011 at 11:35 AM, Andrew Berman <rexxe98 at gmail.com> wrote:

> Thanks for the reply Bryan.  This all makes sense.  I am fairly new to
> Erlang and wasn't sure if using a gen_server solved some of the issues
> with connections.  From what I've seen a lot of people simply make
> calls to Riak directly from a resource and so I thought having a
> gen_server in front of Riak would help to manage things better.
> Apparently it doesn't.
>
> So, then, two more questions.  I have used connection pools in Java
> like C3P0 and they can ramp up connections and then cull connections
> when there is a period of inactivity.  The only pooler I've found that
> does this is: https://github.com/seth/pooler .  Do you have any other
> recommendations on connection poolers?
>

I'm late to the party, but you could take a look at gen_server_pool
(https://github.com/openx/gen_server_pool). It's a pooling library I wrote to
provide pooling of gen_servers. I've used it mostly for Thrift clients, but
Anthony (also on the list) uses it to pool riak_pb clients in webmachine.
The basic idea is that you'd call gen_server_pool:start_link(...) wherever
you'd normally call gen_server:start_link(...) and pass in a few extra args
that control min and max pool size, as well as idle timeout. You can use the
Pid you get back from that the same way you'd use the pid of your
gen_server, except that all work gets dispatched to a member of a pool
instead of a single gen_server. To be honest, I haven't tested out the
open-source version I posted on GitHub (sorry, I've been busy), but it's
just a slightly modified version of the internal library that's been used in
production for several months with good results.
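
To make the min size, max size, and idle timeout knobs concrete, here is a
rough sketch of what a pool with those three settings does. It is written in
Python purely as a language-neutral illustration; it is not gen_server_pool's
code or API, and the names Pool, factory, acquire, release, and cull_idle are
all hypothetical.

```python
import time
from collections import deque


class Pool:
    """Toy pool: keeps between min_size and max_size workers, culls idle ones."""

    def __init__(self, factory, min_size=2, max_size=5, idle_timeout=60.0,
                 clock=time.monotonic):
        self.factory = factory            # hypothetical: builds one worker/connection
        self.min_size = min_size
        self.max_size = max_size
        self.idle_timeout = idle_timeout  # seconds a worker may sit unused
        self.clock = clock                # injectable so the behaviour can be tested
        self.busy = 0
        # Idle workers as (worker, time_it_became_idle), oldest on the left.
        self.idle = deque((factory(), clock()) for _ in range(min_size))

    def size(self):
        return self.busy + len(self.idle)

    def acquire(self):
        """Hand out an idle worker, or ramp up if still below max_size."""
        if self.idle:
            worker, _ = self.idle.pop()   # reuse the most recently idle worker
        elif self.size() < self.max_size:
            worker = self.factory()
        else:
            raise RuntimeError("pool exhausted")
        self.busy += 1
        return worker

    def release(self, worker):
        """Return a worker to the pool and record when it went idle."""
        self.busy -= 1
        self.idle.append((worker, self.clock()))

    def cull_idle(self):
        """Drop workers idle for idle_timeout or longer, keeping at least min_size."""
        now = self.clock()
        culled = 0
        while (self.idle and self.size() > self.min_size
               and now - self.idle[0][1] >= self.idle_timeout):
            self.idle.popleft()
            culled += 1
        return culled
```

A caller would acquire a worker, do its request, and release it; something
periodic (a timer, in Erlang terms) would call cull_idle so the pool shrinks
back toward its minimum after a quiet period.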

Cheers,
Joel


>
> Second, I'm still a little confused about client ID.  I thought client ID
> represented an actual client, not a connection.  So, in my case, the
> gen_server is one client which makes multiple connections.  After
> seeing what you wrote and reading a bit more on it, it seems like
> client ID should just be some random string (base64 encoded) that
> should be generated on creating a connection.  Is that right?
>
> Thanks for your help!
>
> Andrew
>
> On Tue, Jul 26, 2011 at 9:39 AM, Bryan O'Sullivan <bos at mailrank.com> wrote:
> > On Mon, Jul 25, 2011 at 4:03 PM, Andrew Berman <rexxe98 at gmail.com> wrote:
> >>
> >> I know that this subject has been brought up before, but I'm still
> >> wondering what the value of a connection pool is with Riak.
> >
> > It's a big deal:
> >
> > - It amortises TCP and PBC connection setup overhead over a number of
> >   requests, thereby reducing average query latency.
> > - It greatly reduces the likelihood that very busy clients and servers
> >   will run out of limited resources that are effectively invisible,
> >   e.g. closed TCP connections stuck in TIME_WAIT.
> >
> > Each of the above is a pretty big deal. Of course, connection pooling
> > isn't free.
> >
> > - If you have many clients talking to a server sporadically, you may
> >   end up with large numbers of open-and-idle connections on a server,
> >   which will both consume resources and increase latency for all other
> >   clients. This is usually only a problem with a very large number
> >   (many thousands) of clients per server, and it usually only arises
> >   with poorly written and tuned connection pooling libraries. But ...
> > - ... Most connection pooling libraries are poorly written and tuned,
> >   so they'll behave pathologically just when you need them not to.
> > - Since you don't set up a connection per request, the requests where
> >   you *do* need to set up a connection are going to be more expensive
> >   than those where you don't, so you'll see jitter in your latency
> >   profile. About 99.9% of users will never, ever care about this.
> >>
> >> Since Erlang processes are so small and fast to
> >> create, is there really any overhead in having the gen_server create a
> >> new connection (with the same client id) each time it needs to access
> >> Riak?
> >
> > Of course. The overhead of Erlang processes has nothing to do with the
> > cost of setting up a connection.
> > Also, you really don't want to be using the same client ID repeatedly
> > across different connections. That's an awesome way to cause bugs with
> > vclock resolution that end up being very very hard to diagnose.