Riak JAVA Client Performance

Pavel Kogan pavel.kogan at cortica.com
Thu Oct 11 03:40:07 EDT 2012


Thanks Guido, Pawel,

I will try using HAProxy + holding N concurrent connections on the client
side.
I would like to clarify a few points about concurrent connections:
1) What is a reasonable limit on concurrent connections?
2) Does "concurrent connections" mean separately created pbc clients, or a
single shared pbc client?
3) Will a connection time out if no requests are made for some period?
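
One way to picture "holding N concurrent connections" together with a single
shared client is a fixed pool that all worker threads borrow from, much like
the JDBC DataSource comparison quoted below. The following is only a minimal
sketch using the plain JDK: FakeConnection, FixedPool and the request count of
200 are hypothetical names and numbers chosen for illustration, and none of it
is the Riak Java client's own API. The 50 threads and the pool of 8 come from
figures mentioned in this thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedPoolDemo {

    /** Hypothetical stand-in for one open connection to the balancer. */
    static final class FakeConnection {
        final int id;
        FakeConnection(int id) { this.id = id; }
    }

    /** A fixed pool of N connections, created once and shared by all threads. */
    static final class FixedPool {
        private final BlockingQueue<FakeConnection> idle;
        private final AtomicInteger inUse = new AtomicInteger();
        private final AtomicInteger peak = new AtomicInteger();

        FixedPool(int size) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(new FakeConnection(i));
            }
        }

        /** Blocks until a connection is free, so at most N are ever in use. */
        FakeConnection borrow() throws InterruptedException {
            FakeConnection c = idle.take();
            int now = inUse.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            return c;
        }

        /** Returns a borrowed connection so another thread can reuse it. */
        void giveBack(FakeConnection c) {
            inUse.decrementAndGet();
            idle.add(c);
        }

        int peakInUse() { return peak.get(); }
    }

    /**
     * Runs simulated requests on a thread pool over one shared connection pool.
     * Returns { completed requests, peak number of connections in use }.
     */
    static int[] run(int threads, int poolSize, int requests) throws InterruptedException {
        FixedPool pool = new FixedPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            workers.submit(() -> {
                try {
                    FakeConnection c = pool.borrow();
                    try {
                        Thread.sleep(1); // stands in for one read or write
                        completed.incrementAndGet();
                    } finally {
                        pool.giveBack(c); // always return the connection
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(30, TimeUnit.SECONDS);
        return new int[] { completed.get(), pool.peakInUse() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(50, 8, 200);
        System.out.println("completed=" + result[0] + " peakInUse=" + result[1]);
    }
}
```

The point of the sketch is that connections are opened once and reused, so the
number of open sockets stays bounded by the pool size no matter how many
threads issue requests, in contrast to creating and shutting down a client per
request.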

Pavel

On Wed, Oct 10, 2012 at 8:57 PM, Guido Medina <guido.medina at temetra.com> wrote:

> From that perspective, for now it is better to treat the client as you
> would treat a JDBC DataSource pool. The tidying up comes when connecting
> the client, to either one node or many: the client will behave better if it
> has no knowledge of what's going on at the cluster side. Of course, that's
> as of 1.0.6, so that might change.
>
> He could try connecting to one node with a pool of 8 to 16 concurrent
> connections and start from there. Then, when talking to a cluster, he needs
> the balancer in the middle. The main reason is that Riak expects you to
> connect to all nodes (it will simply behave better); otherwise one node
> will be overloaded and give you IOExceptions from time to time.
>
> Hope that helps,
>
> Guido.
>
>
> On 10/10/12 19:24, kamiseq wrote:
>
>> ok, you have a 100% valid point here. on the other hand, I think pavel is
>> looking for some guidance on how to improve performance on the client
>> side, so he can be sure he is not wasting time on something. this is maybe
>> premature optimization, but it may also be a good opportunity to
>> understand the library and enter the new world of riak
>>
>> regards
>> Paweł Kamiński
>>
>> kamiseq at gmail.com
>> pkaminski.prv at gmail.com
>> ______________________
>>
>>
>> On 10 October 2012 17:30, Guido Medina <guido.medina at temetra.com> wrote:
>>
>>> In fact, with more nodes, you might be surprised that it might be
>>> faster... see my point? Riak is a lot of things. First you have to be
>>> aware of the hashing, the hashmap, how a key gets copied to different
>>> nodes, how one or more nodes are responsible for a key, etc., so it is
>>> not that simple.
>>>
>>>
>>> On 10/10/12 16:28, Guido Medina wrote:
>>>
>>> That's why I keep pushing one answer: Riak is not meant to be in one
>>> cluster. You are removing the external factors and the CAP settings you
>>> will be using, and it won't be linear. You could get the same results
>>> with RW=2 on 3, 4 and 5 nodes, and several factors will influence your
>>> benchmark. I would start with 3 nodes and go up to 5 while altering those
>>> numbers; you could then end up with a formula which, I assure you, won't
>>> be linear.
>>>
>>> Regards,
>>>
>>> Guido.
>>>
>>> On 10/10/12 16:19, Pavel Kogan wrote:
>>>
>>> I understand that load balancing is the final solution, but I want to
>>> benchmark a single node.
>>> If I knew that I could load a single node with N requests/sec, I could
>>> assume that after load balancing over 5 nodes my throughput limit would
>>> increase linearly.
>>>
>>> Pavel
>>>
>>> On Wed, Oct 10, 2012 at 2:51 PM, Guido Medina <guido.medina at temetra.com>
>>> wrote:
>>>
>>>> The answer is there: create a client config with N pooled connections to
>>>> your load balancer, whatever you are using. I know HAProxy supports the
>>>> PBC config (TCP based), which is faster than the HTTP client, hence my
>>>> recommendation.
>>>>
>>>> Say, a non-clustered client config with N connections to balancer_host
>>>> at 8087, and your balancer_host connected to EACH node. That's the way
>>>> to go; the rest is about the CAP level you want to support, which will
>>>> impact your performance vs. integrity. Up to you.
>>>>
>>>> CAP doc:
>>>> http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/
>>>>
>>>> Guido.
>>>>
>>>>
>>>> On 10/10/12 13:33, Pavel Kogan wrote:
>>>>
>>>> Hi,
>>>>
>>>> The node is OK and not down.
>>>> I have a way to do load balancing externally to the Java client.
>>>> I am evaluating Riak for use in my company and want to measure the
>>>> maximal throughput against a single node.
>>>>
>>>> Thanks,
>>>>     Pavel
>>>>
>>>> On Wed, Oct 10, 2012 at 2:13 PM, Guido Medina <guido.medina at temetra.com>
>>>> wrote:
>>>>
>>>>> That question has been answered a few times; here is my old answer:
>>>>>
>>>>> Hi,
>>>>>
>>>>>    It is the Java client which, to be honest, doesn't handle one node
>>>>> going down well. So, for example, in my company we use HAProxy for
>>>>> that. Here is a starting configuration: https://gist.github.com/1507077
>>>>>
>>>>>    Once we switched to HAProxy we just use a simple client without
>>>>> cluster config, so the Java client doesn't know anything about the load
>>>>> balancing going on. It works well: I can upgrade and restart servers
>>>>> without our Java application complaining.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Guido.
>>>>>
>>>>>
>>>>> On 10/10/12 12:58, Pavel Kogan wrote:
>>>>>
>>>>> Thanks,
>>>>>
>>>>> I will try this solution.
>>>>>
>>>>> Pavel
>>>>>
>>>>> On Wed, Oct 10, 2012 at 1:51 PM, kamiseq <kamiseq at gmail.com> wrote:
>>>>>
>>>>>> well, I asked the same question a few days ago (maybe 2 weeks ago) and
>>>>>> the answer was that yes, sharing the client is thread safe and all you
>>>>>> should do is create a new bucket instance on every request
>>>>>>
>>>>>> regards
>>>>>> Paweł Kamiński
>>>>>>
>>>>>> kamiseq at gmail.com
>>>>>> pkaminski.prv at gmail.com
>>>>>> ______________________
>>>>>>
>>>>>>
>>>>>> On 10 October 2012 09:25, Pavel Kogan <pavel.kogan at cortica.com>
>>>>>> wrote:
>>>>>>
>>>>>>> 1) Is it OK to share a single pbc client object between 50 threads?
>>>>>>> Should it be protected by a lock?
>>>>>>> 2) I haven't done load balancing between nodes yet, because I want to
>>>>>>> better understand the throughput limit. I am planning to do it for
>>>>>>> much higher throughput.
>>>>>>>
>>>>>>> Pavel
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 10, 2012 at 9:21 AM, kamiseq <kamiseq at gmail.com> wrote:
>>>>>>>
>>>>>>>> maybe a good start is to share the pbclient object and only create a
>>>>>>>> bucket per request; you will save a few steps on client configuration.
>>>>>>>> have you tried balancing requests to the cluster and distributing them
>>>>>>>> over all nodes?
>>>>>>>>
>>>>>>>> regards
>>>>>>>> Paweł Kamiński
>>>>>>>>
>>>>>>>> kamiseq at gmail.com
>>>>>>>> pkaminski.prv at gmail.com
>>>>>>>> ______________________
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10 October 2012 06:18, Pavel Kogan <pavel.kogan at cortica.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I have a Riak cluster consisting of 5 nodes that contains about 30
>>>>>>>>> million keys (35% of capacity according to Riak Control).
>>>>>>>>> Currently we have a single Java client reading and writing records
>>>>>>>>> to the same node. I need some tips on how to use the client
>>>>>>>>> efficiently to reach maximal throughput. I would like to be able to
>>>>>>>>> read/write up to 100 records/sec on a 1Gbit network. Currently I get
>>>>>>>>> a lot of Java socket exceptions after a while (even at the much
>>>>>>>>> slower rate of 10 records/sec), after which I need to restart the
>>>>>>>>> client and the node.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>     Pavel
>>>>>>>>>
>>>>>>>>> P.S.: My client uses 50 threads, and a pbc client is created and shut
>>>>>>>>> down per request.
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> riak-users mailing list
>>>>>>>>> riak-users at lists.basho.com
>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>>