Riak JAVA Client Performance

Pavel Kogan pavel.kogan at cortica.com
Sun Oct 14 01:22:12 EDT 2012


Thanks. It is very useful information.

Pavel

On Sun, Oct 14, 2012 at 6:48 AM, Brian Roach <roach at basho.com> wrote:

> Some points about how the Java client works:
>
> You use a single instance of the client and share it across threads.
>
> The client holds a connection pool. It grows as necessary. You can
> specify the starting size of the pool and the max size (default is
> unlimited). There is an idle reaper thread in that connection pool
> that evicts connections that are idle for 1 second by default; this is
> also something you can change in the config.
>
> As has been mentioned, the best way to deal with a Riak cluster is by
> using HAProxy. There is a ClusterClient available in the Java client
> that you can instantiate and use that will round-robin requests to
> different nodes. That said, unfortunately there are currently a number
> of issues that make HAProxy a superior solution. This is something we
> plan to address as we further develop the Java client.
>
> Thanks,
> Brian Roach
>
>
> On Thu, Oct 11, 2012 at 3:39 AM, Guido Medina <guido.medina at temetra.com>
> wrote:
> > Hi Pavel,
> >
> >   I'm not an expert on pool sizing, but depending on your average key
> > size and number of nodes you can tune it to your needs. As for the client,
> > a single shared client instance will suffice. There is a retrier parameter
> > that sets how many times the client will retry your operation before
> > throwing an exception (3 by default), and there is a timeout on acquiring
> > a connection. This is an example config:
> >
> > The pool size here is for a 4-node cluster, guessing at 8 Erlang threads
> > per node so that the Riak nodes can do other things too; remember they
> > have to sync their data between the nodes:
> >
> > host = your balancer host
> > port = your balancer port
> >
> > final PBClientConfig clientConfig = new PBClientConfig.Builder()
> >     .withHost(host)
> >     .withPort(port)
> >     .withPoolSize(32)
> >     .withConnectionTimeoutMillis(5000)
> >     .build();
> > final IRiakClient riakClient = RiakFactory.newClient(clientConfig);
> >
> > We have it running that way with no issues. The pool size depends on your
> > needs and data size; you could run with a pool size of 50 to 100 if your
> > keys are really small. You will have to try your own values.
> >
> > Regards,
> >
> > Guido.
> >
> >
> > On 11/10/12 08:40, Pavel Kogan wrote:
> >
> > Thanks Guido, Pawel,
> >
> > I will try using HAProxy and holding N concurrent connections on the
> > client side.
> > I want to clarify a few points about concurrent connections:
> > 1) What is a reasonable limit on concurrent connections?
> > 2) Do concurrent connections mean separately created pbc clients or a
> > single shared pbc client?
> > 3) Will a connection time out if no requests are made for some period?
> >
> > Pavel
> >
> > On Wed, Oct 10, 2012 at 8:57 PM, Guido Medina <guido.medina at temetra.com>
> > wrote:
> >>
> >> From that perspective, for now it is better to treat the client as you
> >> would treat a JDBC DataSource pool. The tidying up comes when connecting
> >> the client: whether it talks to one node or many, the client will behave
> >> better if it has no knowledge of what's going on at the cluster side. Of
> >> course, that's as of 1.0.6, so that might change.
> >>
> >> He could try connecting to one node with a pool of 8 to 16 concurrent
> >> connections and start from there. Then, when talking to a cluster, he
> >> needs the balancer in the middle. The main reason is that Riak expects
> >> you to connect to all nodes (it will simply behave better); otherwise one
> >> node will be overloaded and give you IOExceptions from time to time.
> >>
> >> Hope that helps,
> >>
> >> Guido.
> >>
> >>
> >> On 10/10/12 19:24, kamiseq wrote:
> >>>
> >>> OK, you are 100% right here. On the other hand, I think Pavel is looking
> >>> for some guidance on how to improve performance on the client side, so
> >>> he can be sure he is not wasting time on something. This may be
> >>> premature optimization, but it may also be a good opportunity to
> >>> understand the library and enter the new world of Riak.
> >>>
> >>> Regards,
> >>> Paweł Kamiński
> >>>
> >>> kamiseq at gmail.com
> >>> pkaminski.prv at gmail.com
> >>> ______________________
> >>>
> >>>
> >>> On 10 October 2012 17:30, Guido Medina <guido.medina at temetra.com>
> >>> wrote:
> >>>>
> >>>> In fact, with more nodes, you might be surprised to find that it is
> >>>> faster... see my point? Riak is a lot of things. First you have to be
> >>>> aware of the hashing, hashmap, how a key gets copied to different
> >>>> nodes, how one or more nodes are responsible for a key, etc., so it is
> >>>> not that simple.
> >>>>
> >>>>
> >>>> On 10/10/12 16:28, Guido Medina wrote:
> >>>>
> >>>> That's why I keep pushing one answer: Riak is not meant to run as a
> >>>> single node. With one node you are removing the external factors and
> >>>> the CAP settings you will be using, and it won't be linear. You could
> >>>> get the same results with RW=2 on 3, 4 and 5 nodes. Several factors
> >>>> will influence your benchmark. I would start with 3 nodes and go up to
> >>>> 5, altering those numbers; you could then end up with a formula which,
> >>>> I assure you, won't be linear.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Guido.
> >>>>
> >>>> On 10/10/12 16:19, Pavel Kogan wrote:
> >>>>
> >>>> I understand that load balancing is the final solution, but I want to
> >>>> benchmark a single node.
> >>>> If I knew that I could load a single node with N requests/sec, I could
> >>>> assume that after load balancing over 5 nodes my throughput limit would
> >>>> increase linearly.
> >>>>
> >>>> Pavel
> >>>>
> >>>> On Wed, Oct 10, 2012 at 2:51 PM, Guido Medina <guido.medina at temetra.com>
> >>>> wrote:
> >>>>>
> >>>>> The answer is there: create a client config with N pooled connections
> >>>>> to your load balancer, whatever you are using. I know HAProxy supports
> >>>>> the PBC config (TCP based), which is faster than the HTTP client,
> >>>>> hence my recommendation.
> >>>>>
> >>>>> Say, a non-clustered client config with N connections to balancer_host
> >>>>> at 8087, and your balancer_host connected to EACH node. That's the way
> >>>>> to go. The rest is about the CAP level you want to support, which will
> >>>>> trade performance against integrity. Up to you.
> >>>>>
> >>>>> CAP doc:
> >>>>> http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/
> >>>>>
> >>>>> Guido.
> >>>>>
> >>>>>
> >>>>> On 10/10/12 13:33, Pavel Kogan wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> The node is OK and not down.
> >>>>> I have a way to do load balancing externally to the Java client.
> >>>>> I am evaluating Riak for use in my company and want to measure the
> >>>>> maximal throughput against a single node.
> >>>>>
> >>>>> Thanks,
> >>>>>     Pavel
> >>>>>
> >>>>> On Wed, Oct 10, 2012 at 2:13 PM, Guido Medina
> >>>>> <guido.medina at temetra.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> That question has been answered a few times; here is my old answer:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>>    It is the Java client which, to be honest, doesn't handle one node
> >>>>>> going down well. So, for example, in my company we use HAProxy for
> >>>>>> that; here is a starting configuration: https://gist.github.com/1507077
> >>>>>>
> >>>>>>    Once we switched to HAProxy we just use a simple client without
> >>>>>> cluster config, so the Java client doesn't know anything about the load
> >>>>>> balancing going on. It works well; I can upgrade and restart servers
> >>>>>> without our Java application complaining.
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Guido.
> >>>>>>
> >>>>>>
> >>>>>> On 10/10/12 12:58, Pavel Kogan wrote:
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> I will try this solution.
> >>>>>>
> >>>>>> Pavel
> >>>>>>
> >>>>>> On Wed, Oct 10, 2012 at 1:51 PM, kamiseq <kamiseq at gmail.com> wrote:
> >>>>>>>
> >>>>>>> Well, I asked the same question a few days ago (maybe 2 weeks ago),
> >>>>>>> and the answer was that yes, sharing the client is thread safe, and
> >>>>>>> all you should do is create a new bucket instance on every request.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Paweł Kamiński
> >>>>>>>
> >>>>>>> kamiseq at gmail.com
> >>>>>>> pkaminski.prv at gmail.com
> >>>>>>> ______________________
> >>>>>>>
> >>>>>>>
> >>>>>>> On 10 October 2012 09:25, Pavel Kogan <pavel.kogan at cortica.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> 1) Is it OK to share a single pbc client object between 50 threads?
> >>>>>>>> Should it be protected by a lock?
> >>>>>>>> 2) I haven't done load balancing between nodes yet, because I want
> >>>>>>>> to understand the throughput limit better. I am planning to do it
> >>>>>>>> for much higher throughput.
> >>>>>>>>
> >>>>>>>> Pavel
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Oct 10, 2012 at 9:21 AM, kamiseq <kamiseq at gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Maybe a good start is to share the pbclient object and only create
> >>>>>>>>> a bucket per request; you will save a few steps of client
> >>>>>>>>> configuration. Have you tried balancing requests to the cluster and
> >>>>>>>>> distributing them over all nodes?
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Paweł Kamiński
> >>>>>>>>>
> >>>>>>>>> kamiseq at gmail.com
> >>>>>>>>> pkaminski.prv at gmail.com
> >>>>>>>>> ______________________
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 10 October 2012 06:18, Pavel Kogan <pavel.kogan at cortica.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I have a Riak cluster of 5 nodes that contains about 30 million
> >>>>>>>>>> keys (35% of capacity according to Riak Control).
> >>>>>>>>>> Currently we have a single Java client reading and writing records
> >>>>>>>>>> to the same node. I need some tips on how to use the client
> >>>>>>>>>> efficiently to reach maximal throughput. I would like to be able
> >>>>>>>>>> to read/write up to 100 records/sec on a 1 Gbit network. Currently
> >>>>>>>>>> I get a lot of Java socket exceptions after a while (even at the
> >>>>>>>>>> much slower rate of 10 records/sec), after which I need to restart
> >>>>>>>>>> the client and the node.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>     Pavel
> >>>>>>>>>>
> >>>>>>>>>> P.S. My client uses 50 threads, and a pbc client is created and
> >>>>>>>>>> shut down per request.
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> riak-users mailing list
> >>>>>>>>>> riak-users at lists.basho.com
> >>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >>>>>>>>>>
>
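
The pattern the thread converges on (one client instance shared by all worker threads, a bounded connection pool, and a timeout on acquiring a connection) can be sketched with the JDK alone. This is only an illustration under stated assumptions: `FakeConnection`, `BoundedPool` and `runDemo` are hypothetical names made up for this sketch and are not part of the Riak Java client, which keeps its own pool inside the client instance. The pool size of 32 and the 5000 ms timeout simply mirror the example config quoted above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: FakeConnection and BoundedPool are hypothetical stand-ins,
// not classes from the Riak Java client.
final class SharedPoolSketch {

    /** Stand-in for one pooled connection to the balancer. */
    static final class FakeConnection {
        final int id;
        FakeConnection(int id) { this.id = id; }
    }

    /** Fixed-size pool with a bounded wait, shared by all worker threads. */
    static final class BoundedPool {
        private final BlockingQueue<FakeConnection> idle;
        private final long acquireTimeoutMillis;

        BoundedPool(int size, long acquireTimeoutMillis) {
            this.idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(new FakeConnection(i));
            }
            this.acquireTimeoutMillis = acquireTimeoutMillis;
        }

        /** Waits up to the timeout for a free connection, then gives up. */
        FakeConnection acquire() throws InterruptedException {
            FakeConnection c = idle.poll(acquireTimeoutMillis, TimeUnit.MILLISECONDS);
            if (c == null) {
                throw new IllegalStateException("timed out waiting for a connection");
            }
            return c;
        }

        void release(FakeConnection c) {
            idle.add(c);
        }
    }

    /**
     * Runs the given number of workers against ONE shared pool and returns
     * {completed requests, peak number of connections in use at once}.
     */
    static int[] runDemo(int threads, int requestsPerThread, int poolSize) {
        BoundedPool pool = new BoundedPool(poolSize, 5000);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            workers.submit(() -> {
                for (int r = 0; r < requestsPerThread; r++) {
                    try {
                        FakeConnection c = pool.acquire();
                        try {
                            int now = inUse.incrementAndGet();
                            peak.accumulateAndGet(now, Math::max);
                            Thread.sleep(1); // pretend to do a fetch or store
                            completed.incrementAndGet();
                        } finally {
                            inUse.decrementAndGet();
                            pool.release(c); // always hand the connection back
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { completed.get(), peak.get() };
    }

    public static void main(String[] args) {
        // 4 nodes x 8 connections per node = 32, the sizing guess from the thread.
        int[] result = runDemo(50, 20, 4 * 8);
        System.out.println("completed=" + result[0] + " peakInUse=" + result[1]);
    }
}
```

With 50 worker threads and a pool of 32, no more than 32 requests are ever in flight at once; the remaining threads wait in `acquire()` instead of opening new sockets. That is the opposite of creating and shutting down a client per request, which is what the original question described.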

