Connection Pool with Erlang PB Client Necessary?

Andrew Berman rexxe98 at gmail.com
Thu Jul 28 17:32:01 EDT 2011


Cool, I'll check it out, though there appears to be something wrong
with your account: when I try to view the source, I get an error
back from GitHub.

On Thu, Jul 28, 2011 at 1:55 PM, Joel Meyer <joel.meyer at gmail.com> wrote:
>
>
> On Tue, Jul 26, 2011 at 11:35 AM, Andrew Berman <rexxe98 at gmail.com> wrote:
>>
>> Thanks for the reply, Bryan.  This all makes sense.  I am fairly new to
>> Erlang and wasn't sure if using a gen_server solved some of the issues
>> with connections.  From what I've seen, a lot of people simply make
>> calls to Riak directly from a resource, so I thought having a
>> gen_server in front of Riak would help to manage things better.
>> Apparently it doesn't.
>>
>> So, then, two more questions.  I have used connection pools in Java
>> like C3P0 and they can ramp up connections and then cull connections
>> when there is a period of inactivity.  The only pooler I've found that
>> does this is: https://github.com/seth/pooler .  Do you have any other
>> recommendations on connection poolers?
>
> I'm late to the party, but you could take a look at gen_server_pool
> (https://github.com/openx/gen_server_pool). It's a pooling library I wrote
> to provide pooling of gen_servers. I've used it mostly for Thrift clients,
> but Anthony (also on the list) uses it to pool riak_pb clients in
> webmachine. The basic idea is that you'd call
> gen_server_pool:start_link(...) wherever you'd normally call
> gen_server:start_link(...) and pass in a few extra args that control min and
> max pool size, as well as idle timeout. You can use the Pid you get back
> from that the same way you'd use the pid of your gen_server, except that all
> work gets dispatched to a member of a pool instead of a single gen_server.
> To be honest, I haven't tested out the open-source version I posted on
> GitHub (sorry, I've been busy), but it's just a slightly modified version of
> the internal library that's been used in production for several months with
> good results.
> Cheers,
> Joel
>
>>
>> Second, I'm still a little confused about client ID.  I thought a client
>> ID represented an actual client, not a connection.  So, in my case, the
>> gen_server is one client which makes multiple connections.  After
>> seeing what you wrote and reading a bit more on it, it seems like the
>> client ID should just be some random string (base64 encoded) that
>> is generated when a connection is created.  Is that right?
>>
>> Thanks for your help!
>>
>> Andrew
>>
>> On Tue, Jul 26, 2011 at 9:39 AM, Bryan O'Sullivan <bos at mailrank.com>
>> wrote:
>> > On Mon, Jul 25, 2011 at 4:03 PM, Andrew Berman <rexxe98 at gmail.com>
>> > wrote:
>> >>
>> >> I know that this subject has been brought up before, but I'm still
>> >> wondering what the value of a connection pool is with Riak.
>> >
>> > It's a big deal:
>> >
>> > - It amortises TCP and PBC connection setup overhead over a number of
>> >   requests, thereby reducing average query latency.
>> > - It greatly reduces the likelihood that very busy clients and servers
>> >   will run out of limited resources that are effectively invisible,
>> >   e.g. closed TCP connections stuck in TIME_WAIT.
>> >
>> > Each of the above is a pretty big deal. Of course, connection pooling
>> > isn't free.
>> >
>> > - If you have many clients talking to a server sporadically, you may
>> >   end up with large numbers of open-and-idle connections on a server,
>> >   which will both consume resources and increase latency for all other
>> >   clients. This is usually only a problem with a very large number
>> >   (many thousands) of clients per server, and it usually only arises
>> >   with poorly written and tuned connection pooling libraries. But ...
>> > - ... Most connection pooling libraries are poorly written and tuned, so
>> >   they'll behave pathologically just when you need them not to.
>> > - Since you don't set up a connection per request, the requests where
>> >   you *do* need to set up a connection are going to be more expensive
>> >   than those where you don't, so you'll see jitter in your latency
>> >   profile. About 99.9% of users will never, ever care about this.
>> >>
>> >> Since Erlang processes are so small and fast to
>> >> create, is there really any overhead in having the gen_server create a
>> >> new connection (with the same client id) each time it needs to access
>> >> Riak?
>> >
>> > Of course. The overhead of Erlang processes has nothing to do with the
>> > cost of setting up a connection.
>> > Also, you really don't want to be using the same client ID repeatedly
>> > across different connections. That's an awesome way to cause bugs with
>> > vclock resolution that end up being very very hard to diagnose.
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
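The pool behaviour discussed in the thread (a minimum and maximum pool size, plus culling of connections that have sat idle too long) can be illustrated with a small language-neutral sketch. It is written in Python rather than Erlang and is not the implementation of gen_server_pool, pooler, or C3P0. The class name `Pool`, its parameters, and the `connect`/`close` callables are all hypothetical, and the sketch is single-threaded for brevity.

```python
import time
from collections import deque


class Pool:
    """Minimal pool sketch: min/max size plus idle-timeout culling.

    Hypothetical illustration only; not thread-safe.
    """

    def __init__(self, connect, close=lambda conn: None,
                 min_size=1, max_size=4, idle_timeout=30.0,
                 clock=time.monotonic):
        self._connect = connect
        self._close = close
        self._min = min_size
        self._max = max_size
        self._idle_timeout = idle_timeout
        self._clock = clock
        # (connection, time it became idle); the oldest idle entry is on the left.
        self._idle = deque()
        self._in_use = 0
        # Pay the setup cost for the minimum number of connections up front.
        for _ in range(min_size):
            self._idle.append((connect(), clock()))

    def size(self):
        return len(self._idle) + self._in_use

    def checkout(self):
        if self._idle:
            # Reuse the most recently used connection, so older ones can age out.
            conn, _ = self._idle.pop()
        elif self.size() < self._max:
            # Ramp up: open a new connection only when none is idle.
            conn = self._connect()
        else:
            raise RuntimeError("pool exhausted")
        self._in_use += 1
        return conn

    def checkin(self, conn):
        self._in_use -= 1
        self._idle.append((conn, self._clock()))

    def cull(self):
        """Close connections idle for at least idle_timeout, keeping min_size."""
        now = self._clock()
        closed = 0
        while self._idle and self.size() > self._min:
            conn, since = self._idle[0]
            if now - since < self._idle_timeout:
                break
            self._idle.popleft()
            self._close(conn)
            closed += 1
        return closed
```

A caller would run `cull()` periodically (for example from a timer). Reusing the most recently returned connection first is one possible design choice: it lets the least recently used connections reach the idle timeout, so the pool shrinks back toward its minimum after a burst.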
>
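On the client ID question, the approach suggested in the thread (a random, base64-encoded string generated each time a connection is created, never shared between connections) might look roughly like the following. This is again a hedged Python sketch rather than Erlang client code: `new_client_id` is a hypothetical helper, and the default of 4 random bytes is an assumption, not a documented Riak requirement.

```python
import base64
import os


def new_client_id(n_bytes=4):
    """Return a fresh random client ID as a base64 string.

    Hypothetical helper; n_bytes=4 is an assumed default.
    """
    return base64.b64encode(os.urandom(n_bytes)).decode("ascii")


def open_connections(count):
    # One fresh ID per connection, generated at connection-creation time,
    # so no two connections in the pool share a client ID by construction.
    return [{"client_id": new_client_id()} for _ in range(count)]
```

In a pool, the ID would be generated inside whatever function opens a connection, so every pooled connection carries its own ID for its whole lifetime.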



