Connection Pool with Erlang PB Client Necessary?

Andrew Berman rexxe98 at gmail.com
Tue Jul 26 14:35:21 EDT 2011


Thanks for the reply, Bryan.  This all makes sense.  I am fairly new to
Erlang and wasn't sure whether using a gen_server solved some of the
issues with connections.  From what I've seen, a lot of people simply
make calls to Riak directly from a resource, so I thought having a
gen_server in front of Riak would help to manage things better.
Apparently it doesn't.

So, then, two more questions.  First, I have used Java connection pools
such as C3P0, which can ramp up connections and then cull them after a
period of inactivity.  The only pooler I've found that does this is
https://github.com/seth/pooler .  Do you have any other
recommendations on connection poolers?
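
For readers unfamiliar with the "ramp up, then cull" behaviour described above, here is a minimal sketch of the idea. It is written in Python for brevity rather than Erlang, it is not how pooler or C3P0 is implemented, and every name in it (`IdleCullingPool`, `take`, `give_back`, `cull`, the `connect`/`close` callbacks) is hypothetical. It is also not thread-safe.

```python
import time
from collections import deque


class IdleCullingPool:
    """Toy pool: opens connections on demand, closes ones idle too long."""

    def __init__(self, connect, close, max_size=10, max_idle_secs=30.0,
                 clock=time.monotonic):
        self._connect = connect          # hypothetical factory: () -> connection
        self._close = close              # hypothetical teardown: connection -> None
        self._max_size = max_size
        self._max_idle_secs = max_idle_secs
        self._clock = clock              # injectable so the sketch can be tested
        self._idle = deque()             # (connection, time it was returned)
        self._in_use = 0

    def take(self):
        """Reuse an idle connection if there is one, else open a new one."""
        if self._idle:
            conn, _ = self._idle.pop()   # most recently returned first
        elif self._in_use < self._max_size:
            conn = self._connect()       # "ramp up": grow only when needed
        else:
            raise RuntimeError("pool exhausted")
        self._in_use += 1
        return conn

    def give_back(self, conn):
        """Return a connection to the pool and record when it went idle."""
        self._in_use -= 1
        self._idle.append((conn, self._clock()))

    def cull(self):
        """Close connections idle longer than max_idle_secs; return the count."""
        now = self._clock()
        closed = 0
        # The longest-idle connections sit at the left end of the deque.
        while self._idle and now - self._idle[0][1] > self._max_idle_secs:
            conn, _ = self._idle.popleft()
            self._close(conn)
            closed += 1
        return closed
```

A real pooler would call something like `cull()` from a timer and would also need to handle connections that die while idle; both are left out here.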

Second, I'm still a little confused about client IDs.  I thought a
client ID represented an actual client, not a connection.  So, in my
case, the gen_server is one client which makes multiple connections.
After seeing what you wrote and reading a bit more on it, it seems like
the client ID should just be some random string (base64 encoded) that
is generated when a connection is created.  Is that right?
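
To make the question concrete, here is a minimal sketch of "one random, base64-encoded ID per connection". It is in Python for brevity rather than Erlang. The `Connection` class is a hypothetical stand-in and not the Riak client API, and the 4-byte ID length is an assumption chosen for illustration.

```python
import base64
import os


def new_client_id(num_bytes=4):
    """Return a fresh random client ID, base64-encoded as ASCII text."""
    return base64.b64encode(os.urandom(num_bytes)).decode("ascii")


class Connection:
    """Hypothetical stand-in for a PB client connection (opens no socket)."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        # Generated once, when the connection is created, and never
        # shared with any other connection.
        self.client_id = new_client_id()


if __name__ == "__main__":
    # The host and port are example values only.
    first = Connection("127.0.0.1", 8087)
    second = Connection("127.0.0.1", 8087)
    print(first.client_id, second.client_id)
```

The point of the sketch is only where the ID is generated: inside the connection's constructor rather than once for the whole gen_server.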

Thanks for your help!

Andrew

On Tue, Jul 26, 2011 at 9:39 AM, Bryan O'Sullivan <bos at mailrank.com> wrote:
> On Mon, Jul 25, 2011 at 4:03 PM, Andrew Berman <rexxe98 at gmail.com> wrote:
>>
>> I know that this subject has been brought up before, but I'm still
>> wondering what the value of a connection pool is with Riak.
>
> It's a big deal:
>
> * It amortises TCP and PBC connection setup overhead over a number of
>   requests, thereby reducing average query latency.
> * It greatly reduces the likelihood that very busy clients and servers will
>   run out of limited resources that are effectively invisible, e.g. closed TCP
>   connections stuck in TIME_WAIT.
>
> Each of the above is a pretty big deal. Of course, connection pooling isn't
> free.
>
> * If you have many clients talking to a server sporadically, you may end up
>   with large numbers of open-and-idle connections on a server, which will both
>   consume resources and increase latency for all other clients. This is
>   usually only a problem with a very large number (many thousands) of clients
>   per server, and it usually only arises with poorly written and tuned
>   connection pooling libraries. But ...
> * ... Most connection pooling libraries are poorly written and tuned, so
>   they'll behave pathologically just when you need them not to.
> * Since you don't set up a connection per request, the requests where you *do*
>   need to set up a connection are going to be more expensive than those where
>   you don't, so you'll see jitter in your latency profile. About 99.9% of
>   users will never, ever care about this.
>>
>> Since Erlang processes are so small and fast to
>> create, is there really any overhead in having the gen_server create a
>> new connection (with the same client id) each time it needs to access
>> Riak?
>
> Of course. The overhead of Erlang processes has nothing to do with the cost
> of setting up a connection.
> Also, you really don't want to be using the same client ID repeatedly across
> different connections. That's an awesome way to cause bugs with vclock
> resolution that end up being very very hard to diagnose.



