Java Riak client can't handle a Riak node failure?

Vanessa Williams vanessa.williams at thoughtwire.ca
Tue Oct 13 17:50:09 EDT 2015


Alexander, thanks for that reminder. Yes, n_val = 2 would suit us better.
I'll look into getting that tested.

Regards,
Vanessa

On Thu, Oct 8, 2015 at 1:04 PM, Alexander Sicular <siculars at gmail.com>
wrote:

> Greetings and salutations, Vanessa.
>
> I am obliged to point out that running n_val of 3 (or more) is highly
> detrimental in a cluster size smaller than 5 nodes. Needless to say, it is
> not recommended. Let's talk about why that is the case for a moment.
>
> The default level of abstraction in Riak is the virtual node, or vnode.
> The vnode represents a segment of the ring (ring_size, configurable in
> steps of power of 2, default 64, min 8, max 1024.) The "ring" is the number
> line 0 - 2^160 which represents the output of the SHA hash.
>
> Riak achieves high availability through replication. Riak replicates data
> to vnodes. A hypothetical replica set may be, for example, set[vnode 1,
> vnode 10, vnode 20]. Note, I said vnodes. Not physical nodes. And therein
> lies the concern. Considering a default ring size of 64 and a default
> replica count of 3, the minimum recommended production deployment of a Riak
> cluster should be 5 due to the fact that in that circumstance every replica
> set combination is guaranteed to have each vnode on a distinct physical
> node. Anything less than that will certainly have some combinations of
> replica sets which have two of its copies on the same physical host. Note I
> said some combinations. Some fraction of node replica set combinations will
> have two of their copies allocated to one physical machine.
>
> You can see where I'm going with this. Firstly, performance will be
> negatively impacted when writing more than one copy of data to the same
> physical hardware, aka disk. But more importantly, you are negating Riak's
> high availability mechanic. If you lost any given physical node you would
> lose access to two copies of the set of data which had two replicas on that
> node.
>
> Riak is designed to withstand loss of any two physical nodes while
> maintaining access to 100% of your corpus assuming the fact that you are
> running the default settings and have deployed 5 nodes.
>
> Here is the rule of thumb that I recommend (me personally, not Basho) to
> folks looking to deploy clusters with less than 5 nodes:
>
> 1,2 nodes: n_val 1
> 3,4 nodes: n_val 2
> 5+ nodes: n_val 3
>
> In summary, please consider reconfiguring your production deployment.
>
> Sincerely,
> Alexander
>
> @siculars
> http://siculars.posthaven.com
>
> Sent from my iRotaryPhone
>
> On Oct 7, 2015, at 19:56, Vanessa Williams <
> vanessa.williams at thoughtwire.ca> wrote:
>
> Hi Dmitri, what would be the benefit of r=2, exactly? It isn't necessary
> to trigger read-repair, is it? If it's important I'd rather try it sooner
> than later...
>
> Regards,
> Vanessa
>
>
>
> On Wed, Oct 7, 2015 at 4:02 PM, Dmitri Zagidulin <dzagidulin at basho.com>
> wrote:
>
>> Glad you sorted it out!
>>
>> (I do want to encourage you to bump your R setting to at least 2, though.
>> Run some tests -- I think you'll find that the difference in speed will not
>> be noticeable, but you do get a lot more data resilience with 2.)
>>
>> On Wed, Oct 7, 2015 at 6:24 PM, Vanessa Williams <
>> vanessa.williams at thoughtwire.ca> wrote:
>>
>>> Hi Dmitri, well...we solved our problem to our satisfaction but it
>>> turned out to be something unexpected.
>>>
>>> The keys were two properties mentioned in a blog post on "configuring
>>> Riak’s oft-subtle behavioral characteristics":
>>> http://basho.com/posts/technical/riaks-config-behaviors-part-4/
>>>
>>> notfound_ok= false
>>> basic_quorum=true
>>>
>>> The 2nd one just makes things a little faster, but the first one is the
>>> one whose default value of true was killing us.
>>>
>>> With r=1 and notfound_ok=true (default) the first node to respond, if it
>>> didn't find the requested key, the authoritative answer was "this key is
>>> not found". Not what we were expecting at all.
>>>
>>> With the changed settings, it will wait for a quorum of responses and
>>> only if *no one* finds the key will "not found" be returned. Perfect.
>>> (Without this setting it would wait for all responses, not ideal.)
>>>
>>> Now there is only one snag, which is that if the Riak node the client
>>> connects to goes down, there will be no communication and we have a
>>> problem. This is easily solvable with a load-balancer, though for
>>> complicated reasons we actually don't need to do that right now. It's just
>>> acceptable for us temporarily. Later, we'll get the load-balancer working
>>> and even that won't be a problem.
>>>
>>> I *think* we're ok now. Thanks for your help!
>>>
>>> Regards,
>>> Vanessa
>>>
>>>
>>>
>>> On Wed, Oct 7, 2015 at 9:33 AM, Dmitri Zagidulin <dzagidulin at basho.com>
>>> wrote:
>>>
>>>> Yeah, definitely find out what the sysadmin's experience was, with the
>>>> load balancer. It could have just been a wrong configuration or something.
>>>>
>>>> And yes, that's the documentation page I recommend -
>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
>>>> Just set up HAProxy, and point your Java clients to its IP.
>>>>
>>>> The drawbacks to load-balancing on the java client side (yes, the
>>>> cluster object) instead of a standalone load balancer like HAProxy, are the
>>>> following:
>>>>
>>>> 1) Adding node means code changes (or at very least, config file
>>>> changes) rolled out to all your clients. Which turns out to be a pretty
>>>> serious hassle. Instead, HAProxy allows you to add or remove nodes without
>>>> changing any java code or config files.
>>>>
>>>> 2) Performance. We've ran many tests to compare performance, and
>>>> client-side load balancing results in significantly lower throughput than
>>>> you'd have using haproxy (or nginx). (Specifically, you actually want to
>>>> use the 'leastconn' load balancing algorithm with HAProxy, instead of round
>>>> robin).
>>>>
>>>> 3) The health check on the client side (so that the java load balancer
>>>> can tell when a remote node is down) is much less intelligent than a
>>>> dedicated load balancer would provide. With something like HAProxy, you
>>>> should be able to take down nodes with no ill effects for the client code.
>>>>
>>>> Now, if you load balance on the client side and you take a node down,
>>>> it's not supposed to stop working completely. (I'm not sure why it's
>>>> failing for you, we can investigate, but it'll be easier to just use a load
>>>> balancer). It should throw an error or two, but then start working again
>>>> (on the retry).
>>>>
>>>> Dmitri
>>>>
>>>> On Wed, Oct 7, 2015 at 2:45 PM, Vanessa Williams <
>>>> vanessa.williams at thoughtwire.ca> wrote:
>>>>
>>>>> Hi Dmitri, thanks for the quick reply.
>>>>>
>>>>> It was actually our sysadmin who tried the load balancer approach and
>>>>> had no success, late last evening. However I haven't discussed the gory
>>>>> details with him yet. The failure he saw was at the application level (i.e.
>>>>> failure to read a key), but I don't know a) how he set up the LB or b) what
>>>>> the Java exception was, if any. I'll find that out in an hour or two and
>>>>> report back.
>>>>>
>>>>> I did find this article just now:
>>>>>
>>>>>
>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
>>>>>
>>>>> So I suppose we'll give those suggestions a try this morning.
>>>>>
>>>>> What is the drawback to having the client connect to all 4 nodes (the
>>>>> cluster client, I assume you mean?) My understanding from reading articles
>>>>> I've found is that one of the nodes going away causes that client to fail
>>>>> as well. Is that what you mean, or are there other drawbacks as well?
>>>>>
>>>>> If there's anything else you can recommend, or links other than the
>>>>> one above you can point me to, it would be much appreciated. We expect both
>>>>> node failure and deliberate node removal for upgrade, repair, replacement,
>>>>> etc.
>>>>>
>>>>> Regards,
>>>>> Vanessa
>>>>>
>>>>> On Wed, Oct 7, 2015 at 8:29 AM, Dmitri Zagidulin <dzagidulin at basho.com
>>>>> > wrote:
>>>>>
>>>>>> Hi Vanessa,
>>>>>>
>>>>>> Riak is definitely meant to run behind a load balancer. (Or, at the
>>>>>> worst case, to be load-balanced on the client side. That is, all clients
>>>>>> connect to all 4 nodes).
>>>>>>
>>>>>> When you say "we did try putting all 4 Riak nodes behind a
>>>>>> load-balancer and pointing the clients at it, but it didn't help." -- what
>>>>>> do you mean exactly, by "it didn't help"? What happened when you tried
>>>>>> using the load balancer?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 1:57 PM, Vanessa Williams <
>>>>>> vanessa.williams at thoughtwire.ca> wrote:
>>>>>>
>>>>>>> Hi all, we are still (for a while longer) using Riak 1.4 and the
>>>>>>> matching Java client. The client(s) connect to one node in the cluster
>>>>>>> (since that's all it can do in this client version). The cluster itself has
>>>>>>> 4 nodes (sorry, we can't use 5 in this scenario). There are 2 separate
>>>>>>> clients.
>>>>>>>
>>>>>>> We've tried both n_val = 3 and n_val=4. We achieve
>>>>>>> consistency-by-writes by setting w=all. Therefore, we only require one
>>>>>>> successful read (r=1).
>>>>>>>
>>>>>>> When all nodes are up, everything is fine. If one node fails, the
>>>>>>> clients can no longer read any keys at all. There's an exception like this:
>>>>>>>
>>>>>>> com.basho.riak.client.RiakRetryFailedException:
>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>
>>>>>>> Now, it isn't possible that Riak can't operate when one node fails,
>>>>>>> so we're clearly missing something here.
>>>>>>>
>>>>>>> Note: we did try putting all 4 Riak nodes behind a load-balancer and
>>>>>>> pointing the clients at it, but it didn't help.
>>>>>>>
>>>>>>> Riak is a high-availability key-value store, so... why are we
>>>>>>> failing to achieve high-availability? Any suggestions greatly appreciated,
>>>>>>> and if more info is required I'll do my best to provide it.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Vanessa
>>>>>>>
>>>>>>> --
>>>>>>> Vanessa Williams
>>>>>>> ThoughtWire Corporation
>>>>>>> http://www.thoughtwire.com
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> riak-users mailing list
>>>>>>> riak-users at lists.basho.com
>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20151013/9eba8c0d/attachment-0002.html>


More information about the riak-users mailing list