Riak JAVA Client Performance

Guido Medina guido.medina at temetra.com
Thu Oct 11 05:39:59 EDT 2012


Hi Pavel,

   I'm no expert on pool sizing, but you can tune it to your needs based
on your average key size and number of nodes. As for the client, a single
shared client instance will suffice. There is a retrier parameter that
sets how many times the client will retry your operation before throwing
an exception (3 by default), and there is a timeout on acquiring a
connection. An example config follows.

The pool size here is for a 4-node cluster, roughly guessing at 8 Erlang
threads per node (4 x 8 = 32), which leaves the Riak nodes room to do
other things too; remember they have to sync their data between nodes:

host = your balancer host
port = your balancer port

    final PBClientConfig clientConfig = new PBClientConfig.Builder()
            .withHost(host)
            .withPort(port)
            .withPoolSize(32)
            .withConnectionTimeoutMillis(5000)
            .build();
    final IRiakClient riakClient = RiakFactory.newClient(clientConfig);

We have this running with no issues. The pool size depends on your
needs and data size; you could run with a pool size of 50 to 100 if
your keys are really small. You will have to try your own values.
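
To make the two settings above a bit more concrete, here is a minimal sketch in plain Java with no Riak dependency. It assumes the pool behaves like a bounded set of slots with a wait timeout, which is only my reading of the description above; BoundedPoolSketch, estimatePoolSize and tryAcquire are hypothetical names for illustration, not part of the Riak client, and the real pool may work differently.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of the Riak Java client.
class BoundedPoolSketch {
    private final Semaphore slots;
    private final long acquireTimeoutMillis;

    BoundedPoolSketch(int poolSize, long acquireTimeoutMillis) {
        // Fair semaphore: waiting callers get slots in arrival order.
        this.slots = new Semaphore(poolSize, true);
        this.acquireTimeoutMillis = acquireTimeoutMillis;
    }

    // Rule of thumb from the message: nodes x concurrent requests per node.
    static int estimatePoolSize(int nodes, int perNode) {
        return nodes * perNode;
    }

    // Waits up to the timeout for a free slot; false means the wait timed out.
    boolean tryAcquire() throws InterruptedException {
        return slots.tryAcquire(acquireTimeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Returns a slot to the pool so another caller can use it.
    void release() {
        slots.release();
    }

    int available() {
        return slots.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        int size = estimatePoolSize(4, 8); // 4 nodes, 8 per node
        BoundedPoolSketch pool = new BoundedPoolSketch(size, 5000);
        if (pool.tryAcquire()) {
            try {
                System.out.println("slots left: " + pool.available());
            } finally {
                pool.release(); // always hand the slot back
            }
        }
    }
}
```

The point of the sketch is that a single shared object can serve many threads: each caller borrows a slot, does its work, and hands the slot back, so the pool size caps concurrency and the timeout caps how long a caller waits.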

Regards,

Guido.

On 11/10/12 08:40, Pavel Kogan wrote:
> Thanks Guido, Pawel,
>
> I will try using HAProxy + holding N concurrent connections on the 
> client side.
> I want to clarify a few points about concurrent connections for myself:
> 1) What is a reasonable limit on concurrent connections?
> 2) Does "concurrent connections" mean separately generated pbc clients
> or a single shared pbc client?
> 3) Will a connection time out if no requests are made for some period?
>
> Pavel
>
> On Wed, Oct 10, 2012 at 8:57 PM, Guido Medina 
> <guido.medina at temetra.com <mailto:guido.medina at temetra.com>> wrote:
>
>     From that perspective, for now it is better to treat the client as
>     you would treat a JDBC DataSource pool. The tidy-up comes when
>     connecting the client: whether to one node or many, the client will
>     behave better if it has no knowledge of what's going on at the
>     cluster side. Of course, that's as of 1.0.6, so it might change.
>
>     He could try connecting to one node with a pool of 8 to 16
>     concurrent connections and start from there. Then, when talking to
>     a cluster, he needs the balancer in the middle. The main reason is
>     that Riak expects you to connect to all nodes (it will simply
>     behave better); otherwise one node will be overloaded and
>     give you IOExceptions from time to time.
>
>     Hope that helps,
>
>     Guido.
>
>
>     On 10/10/12 19:24, kamiseq wrote:
>
>         OK, you have a point here, 100%. On the other hand, I think Pavel
>         is looking for some guidance on how to improve performance on the
>         client side, so he can be 100% sure he is not wasting time on
>         something. This may be premature optimization, but it may also be
>         a good position from which to understand the library and enter
>         the new world of Riak.
>
>         Regards,
>         Paweł Kamiński
>
>         kamiseq at gmail.com <mailto:kamiseq at gmail.com>
>         pkaminski.prv at gmail.com <mailto:pkaminski.prv at gmail.com>
>         ______________________
>
>
>         On 10 October 2012 17:30, Guido Medina
>         <guido.medina at temetra.com <mailto:guido.medina at temetra.com>>
>         wrote:
>
>             In fact, with more nodes, you might be surprised that it
>             might be faster... see my point? Riak is a lot of things. First
>             you have to be aware of the hashing, the hashmap, how a key
>             gets copied to different nodes, how one or more nodes are
>             responsible for a key, etc. So it is not that simple.
>
>
>             On 10/10/12 16:28, Guido Medina wrote:
>
>             That's why I keep pushing one answer: Riak is not meant
>             to be run as a single node. You are removing the external
>             factors and the CAP settings you will be using, and it won't
>             be linear. You could get the same results with RW=2 with
>             3, 4 and 5 nodes. Several factors will influence your
>             benchmark. I would start with 3 nodes and go up to 5, altering
>             those numbers; then you could end up with a formula which, I
>             assure you, won't be linear.
>
>             Regards,
>
>             Guido.
>
>             On 10/10/12 16:19, Pavel Kogan wrote:
>
>             I understand that load balancing is the final solution, but
>             I want to benchmark a single node.
>             If I knew that I could load a single node with N requests /
>             sec, I could assume that after load balancing over 5 nodes my
>             throughput limit would increase linearly.
>
>             Pavel
>
>             On Wed, Oct 10, 2012 at 2:51 PM, Guido Medina
>             <guido.medina at temetra.com <mailto:guido.medina at temetra.com>>
>             wrote:
>
>                 The answer is there: create a client config with N
>                 pooled connections to your load balancer, whatever you
>                 are using. I know HAProxy supports the PBC config (TCP
>                 based), which is faster than the HTTP client, hence my
>                 recommendation.
>
>                 Say, a non-clustered client config with N connections
>                 to balancer_host at 8087, and your balancer_host
>                 connected to EACH node. That's the way to go. The rest
>                 is about the CAP level you want to support, which will
>                 impact your performance vs. integrity. Up to you.
>
>                 CAP doc:
>                 http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/
>
>                 Guido.
>
>
>                 On 10/10/12 13:33, Pavel Kogan wrote:
>
>                 Hi,
>
>                 The node is OK and not down.
>                 I have a way to do load balancing externally to the JAVA
>                 client.
>                 I am evaluating Riak for use in my company and want
>                 to measure maximal throughput against a single node.
>
>                 Thanks,
>                     Pavel
>
>                 On Wed, Oct 10, 2012 at 2:13 PM, Guido Medina
>                 <guido.medina at temetra.com
>                 <mailto:guido.medina at temetra.com>>
>                 wrote:
>
>                     That question has been answered a few times; here is
>                     my old answer:
>
>                     Hi,
>
>                        It is the Java client which, to be honest,
>                     doesn't handle one node going down well. So, for
>                     example, in my company we use HAProxy for that. Here
>                     is a starting configuration:
>                     https://gist.github.com/1507077
>
>                        Once we switched to HAProxy we just use a
>                     simple client without cluster config, so the Java
>                     client doesn't know anything about the load balancing
>                     going on. It works well; I can upgrade and restart
>                     servers without our Java application complaining.
>
>                     Regards,
>
>                     Guido.
>
>
>                     On 10/10/12 12:58, Pavel Kogan wrote:
>
>                     Thanks,
>
>                     I will try this solution.
>
>                     Pavel
>
>                     On Wed, Oct 10, 2012 at 1:51 PM, kamiseq
>                     <kamiseq at gmail.com <mailto:kamiseq at gmail.com>> wrote:
>
>                         Well, I asked the same question a few days ago
>                         (maybe 2 weeks ago), and the answer was that yes,
>                         sharing the client is thread safe, and all you
>                         should do is create a new bucket instance on
>                         every request.
>
>                         Regards,
>                         Paweł Kamiński
>
>                         kamiseq at gmail.com <mailto:kamiseq at gmail.com>
>                         pkaminski.prv at gmail.com
>                         <mailto:pkaminski.prv at gmail.com>
>                         ______________________
>
>
>                         On 10 October 2012 09:25, Pavel Kogan
>                         <pavel.kogan at cortica.com
>                         <mailto:pavel.kogan at cortica.com>> wrote:
>
>                             1) Is it OK to share a single pbc client
>                             object between 50 threads? Should it be
>                             protected by a lock?
>                             2) I haven't done load balancing between
>                             nodes yet, because I want to understand the
>                             throughput limit better. I am planning to
>                             do it for much higher throughput.
>
>                             Pavel
>
>
>                             On Wed, Oct 10, 2012 at 9:21 AM, kamiseq
>                             <kamiseq at gmail.com
>                             <mailto:kamiseq at gmail.com>> wrote:
>
>                                 Maybe a good start is to share the
>                                 pbclient object and only create a
>                                 bucket per request; you will save a few
>                                 steps on client configuration.
>                                 Have you tried balancing requests to the
>                                 cluster and distributing them over all
>                                 nodes?
>
>                                 Regards,
>                                 Paweł Kamiński
>
>                                 kamiseq at gmail.com
>                                 <mailto:kamiseq at gmail.com>
>                                 pkaminski.prv at gmail.com
>                                 <mailto:pkaminski.prv at gmail.com>
>                                 ______________________
>
>
>                                 On 10 October 2012 06:18, Pavel Kogan
>                                 <pavel.kogan at cortica.com
>                                 <mailto:pavel.kogan at cortica.com>>
>                                 wrote:
>
>                                     Hi all,
>
>                                     I have a Riak cluster consisting of
>                                     5 nodes that contains about 30
>                                     million keys (35% of capacity
>                                     according to Riak Control).
>                                     Currently we have a single JAVA
>                                     client reading and writing records
>                                     to the same node. I need some tips
>                                     on how to use the client efficiently
>                                     to reach maximal throughput - I
>                                     would like to be able to read/write
>                                     up to 100 records/sec on a 1Gbit
>                                     network. Currently I get a lot
>                                     of JAVA socket exceptions after a
>                                     while (even at the much slower
>                                     rate of 10 records/sec), after
>                                     which I need to restart the client
>                                     and the node.
>
>                                     Thanks,
>                                         Pavel
>
>                                     P.S.: My client uses 50 threads,
>                                     and the pbc client is created and
>                                     shut down per request.
>
>                                     _______________________________________________
>                                     riak-users mailing list
>                                     riak-users at lists.basho.com
>                                     <mailto:riak-users at lists.basho.com>
>                                     http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>