Java Riak client can't handle a Riak node failure?

Alex Moore amoore at basho.com
Mon Feb 22 11:15:57 EST 2016


>
> That's the correct behaviour: it should return true iff a value was
> actually deleted.


Ok, if that's the case you should do another FetchValue after the deletion
(to update the response.hasValues()) field, or use the async version of the
delete function. I also noticed that we weren't passing the vclock to the
Delete function, so I added that here as well:

public boolean delete(String key) throws ExecutionException,
InterruptedException {

    // fetch in order to get the causal context
    FetchValue.Response response = fetchValue(key);

    if(response.isNotFound())
    {
        return ???; // what do we return if it doesn't exist?
    }

    DeleteValue deleteValue = new DeleteValue.Builder(new
Location(namespace, key))

.withVClock(response.getVectorClock())
                                             .build();

    final RiakFuture<Void, Location> deleteFuture =
client.executeAsync(deleteValue);

    deleteFuture.await();

    if(deleteFuture.isSuccess())
    {
        return true;
    }
    else
    {
        deleteFuture.cause(); // Cause of failure
        return false;
    }
}


Thanks,
Alex

On Mon, Feb 22, 2016 at 10:48 AM, Vanessa Williams <
vanessa.williams at thoughtwire.ca> wrote:

> See inline:
>
> On Mon, Feb 22, 2016 at 10:31 AM, Alex Moore <amoore at basho.com> wrote:
>
>> Hi Vanessa,
>>
>> You might have a problem with your delete function (depending on it's
>> return value).
>> What does the return value of the delete() function indicate?  Right now
>> if an object existed, and was deleted, the function will return true, and
>> will only return false if the object didn't exist or only consisted of
>> tombstones.
>>
>
>
> That's the correct behaviour: it should return true iff a value was
> actually deleted.
>
>
>> If you never look at the object value returned by your fetchValue(key) function, another potential optimization you could make is to only return the HEAD / metadata:
>>
>> FetchValue fv = new FetchValue.Builder(new Location(new Namespace(
>> "some_bucket"), key))
>>
>>                               .withOption(FetchValue.Option.HEAD, true)
>>                               .build();
>>
>> This would be more efficient because Riak won't have to send you the
>> values over the wire, if you only need the metadata.
>>
>>
> Thanks, I'll clean that up.
>
>
>> If you do write this up somewhere, share the link! :)
>>
>
> Will do!
>
> Regards,
> Vanessa
>
>
>>
>> Thanks,
>> Alex
>>
>>
>> On Mon, Feb 22, 2016 at 6:23 AM, Vanessa Williams <
>> vanessa.williams at thoughtwire.ca> wrote:
>>
>>> Hi Dmitri, this thread is old, but I read this part of your answer
>>> carefully:
>>>
>>> You can use the following strategies to prevent stale values, in
>>>> increasing order of security/preference:
>>>> 1) Use timestamps (and not pass in vector clocks/causal context). This
>>>> is ok if you're not editing objects, or you're ok with a bit of risk of
>>>> stale values.
>>>> 2) Use causal context correctly (which means, read-before-you-write --
>>>> in fact, the Update operation in the java client does this for you, I
>>>> think). And if Riak can't determine which version is correct, it will fall
>>>> back on timestamps.
>>>> 3) Turn on siblings, for that bucket or bucket type.  That way, Riak
>>>> will still try to use causal context to decide the right value. But if it
>>>> can't decide, it will store BOTH values, and give them back to you on the
>>>> next read, so that your application can decide which is the correct one.
>>>
>>>
>>> I decided on strategy #2. What I am hoping for is some validation that
>>> the code we use to "get", "put", and "delete" is correct in that context,
>>> or if it could be simplified in some cases. Not we are using delete-mode
>>> "immediate" and no duplicates.
>>>
>>> In their shortest possible forms, here are the three methods I'd like
>>> some feedback on (note, they're being used in production and haven't caused
>>> any problems yet, however we have very few writes in production so the lack
>>> of problems doesn't support the conclusion that the implementation is
>>> correct.) Note all argument-checking, exception-handling, and logging
>>> removed for clarity. *I'm mostly concerned about correct use of causal
>>> context and response.isNotFound and response.hasValues. *Is there
>>> anything I could/should have left out?
>>>
>>>     public RiakClient(String name, com.basho.riak.client.api.RiakClient
>>> client)
>>>     {
>>>         this.name = name;
>>>         this.namespace = new Namespace(name);
>>>         this.client = client;
>>>     }
>>>
>>>     public byte[] get(String key) throws ExecutionException,
>>> InterruptedException {
>>>
>>>         FetchValue.Response response = fetchValue(key);
>>>         if (!response.isNotFound())
>>>         {
>>>             RiakObject riakObject = response.getValue(RiakObject.class);
>>>             return riakObject.getValue().getValue();
>>>         }
>>>         return null;
>>>     }
>>>
>>>     public void put(String key, byte[] value) throws ExecutionException,
>>> InterruptedException {
>>>
>>>         // fetch in order to get the causal context
>>>         FetchValue.Response response = fetchValue(key);
>>>         RiakObject storeObject = new
>>>
>>> RiakObject().setValue(BinaryValue.create(value)).setContentType("binary/octet-stream");
>>>         StoreValue.Builder builder =
>>>             new StoreValue.Builder(storeObject).withLocation(new
>>> Location(namespace, key));
>>>         if (response.getVectorClock() != null) {
>>>             builder = builder.withVectorClock(response.getVectorClock());
>>>         }
>>>         StoreValue storeValue = builder.build();
>>>         client.execute(storeValue);
>>>     }
>>>
>>>     public boolean delete(String key) throws ExecutionException,
>>> InterruptedException {
>>>
>>>         // fetch in order to get the causal context
>>>         FetchValue.Response response = fetchValue(key);
>>>         if (!response.isNotFound())
>>>         {
>>>             DeleteValue deleteValue = new DeleteValue.Builder(new
>>> Location(namespace, key)).build();
>>>             client.execute(deleteValue);
>>>         }
>>>         return !response.isNotFound() || !response.hasValues();
>>>     }
>>>
>>>
>>> Any comments much appreciated. I want to provide a minimally correct
>>> example of simple client code somewhere (GitHub, blog post, something...)
>>> so I don't want to post this without review.
>>>
>>> Thanks,
>>> Vanessa
>>>
>>> ThoughtWire Corporation
>>> http://www.thoughtwire.com
>>>
>>>
>>>
>>>
>>> On Thu, Oct 8, 2015 at 8:45 AM, Dmitri Zagidulin <dzagidulin at basho.com>
>>> wrote:
>>>
>>>> Hi Vanessa,
>>>>
>>>> The thing to keep in mind about read repair is -- it happens
>>>> asynchronously on every GET, but /after/ the results are returned to the
>>>> client.
>>>>
>>>> So, when you issue a GET with r=1, the coordinating node only waits for
>>>> 1 of the replicas before responding to the client with a success, and only
>>>> afterwards triggers read-repair.
>>>>
>>>> It's true that with notfound_ok=false, it'll wait for the first
>>>> non-missing replica before responding. But if you edit or update your
>>>> objects at all, an R=1 still gives you a risk of stale values being
>>>> returned.
>>>>
>>>> For example, say you write an object with value A.  And let's say your
>>>> 3 replicas now look like this:
>>>>
>>>> replica 1: A,  replica 2: A, replica 3: notfound/missing
>>>>
>>>> A read with an R=1 and notfound_ok=false is just fine, here. (Chances
>>>> are, the notfound replica will arrive first, but the notfound_ok setting
>>>> will force the coordinator to wait for the first non-empty value, A, and
>>>> return it to the client. And then trigger read-repair).
>>>>
>>>> But what happens if you edit that same object, and give it a new value,
>>>> B?  So, now, there's a chance that your replicas will look like this:
>>>>
>>>> replica 1: A, replica 2: B, replica 3: B.
>>>>
>>>> So now if you do a read with an R=1, there's a chance that replica 1,
>>>> with the old value of A, will arrive first, and that's the response that
>>>> will be returned to the client.
>>>>
>>>> Whereas, using R=2 eliminates that risk -- well, at least decreases it.
>>>> You still have the issue of -- how does Riak decide whether A or B is the
>>>> correct value? Are you using causal context/vclocks correctly? (That is,
>>>> reading the object before you update, to get the correct causal context?)
>>>> Or are you relying on timestamps? (This is an ok strategy, provided that
>>>> the edits are sufficiently far apart in time, and you don't have many
>>>> concurrent edits, AND you're ok with the small risk of occasionally the
>>>> timestamp being wrong). You can use the following strategies to prevent
>>>> stale values, in increasing order of security/preference:
>>>>
>>>> 1) Use timestamps (and not pass in vector clocks/causal context). This
>>>> is ok if you're not editing objects, or you're ok with a bit of risk of
>>>> stale values.
>>>>
>>>> 2) Use causal context correctly (which means, read-before-you-write --
>>>> in fact, the Update operation in the java client does this for you, I
>>>> think). And if Riak can't determine which version is correct, it will fall
>>>> back on timestamps.
>>>>
>>>> 3) Turn on siblings, for that bucket or bucket type.  That way, Riak
>>>> will still try to use causal context to decide the right value. But if it
>>>> can't decide, it will store BOTH values, and give them back to you on the
>>>> next read, so that your application can decide which is the correct one.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Oct 8, 2015 at 1:56 AM, Vanessa Williams <
>>>> vanessa.williams at thoughtwire.ca> wrote:
>>>>
>>>>> Hi Dmitri, what would be the benefit of r=2, exactly? It isn't
>>>>> necessary to trigger read-repair, is it? If it's important I'd rather try
>>>>> it sooner than later...
>>>>>
>>>>> Regards,
>>>>> Vanessa
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 7, 2015 at 4:02 PM, Dmitri Zagidulin <dzagidulin at basho.com
>>>>> > wrote:
>>>>>
>>>>>> Glad you sorted it out!
>>>>>>
>>>>>> (I do want to encourage you to bump your R setting to at least 2,
>>>>>> though. Run some tests -- I think you'll find that the difference in speed
>>>>>> will not be noticeable, but you do get a lot more data resilience with 2.)
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 6:24 PM, Vanessa Williams <
>>>>>> vanessa.williams at thoughtwire.ca> wrote:
>>>>>>
>>>>>>> Hi Dmitri, well...we solved our problem to our satisfaction but it
>>>>>>> turned out to be something unexpected.
>>>>>>>
>>>>>>> The keys were two properties mentioned in a blog post on
>>>>>>> "configuring Riak’s oft-subtle behavioral characteristics":
>>>>>>> http://basho.com/posts/technical/riaks-config-behaviors-part-4/
>>>>>>>
>>>>>>> notfound_ok= false
>>>>>>> basic_quorum=true
>>>>>>>
>>>>>>> The 2nd one just makes things a little faster, but the first one is
>>>>>>> the one whose default value of true was killing us.
>>>>>>>
>>>>>>> With r=1 and notfound_ok=true (default) the first node to respond,
>>>>>>> if it didn't find the requested key, the authoritative answer was "this key
>>>>>>> is not found". Not what we were expecting at all.
>>>>>>>
>>>>>>> With the changed settings, it will wait for a quorum of responses
>>>>>>> and only if *no one* finds the key will "not found" be returned. Perfect.
>>>>>>> (Without this setting it would wait for all responses, not ideal.)
>>>>>>>
>>>>>>> Now there is only one snag, which is that if the Riak node the
>>>>>>> client connects to goes down, there will be no communication and we have a
>>>>>>> problem. This is easily solvable with a load-balancer, though for
>>>>>>> complicated reasons we actually don't need to do that right now. It's just
>>>>>>> acceptable for us temporarily. Later, we'll get the load-balancer working
>>>>>>> and even that won't be a problem.
>>>>>>>
>>>>>>> I *think* we're ok now. Thanks for your help!
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vanessa
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 7, 2015 at 9:33 AM, Dmitri Zagidulin <
>>>>>>> dzagidulin at basho.com> wrote:
>>>>>>>
>>>>>>>> Yeah, definitely find out what the sysadmin's experience was, with
>>>>>>>> the load balancer. It could have just been a wrong configuration or
>>>>>>>> something.
>>>>>>>>
>>>>>>>> And yes, that's the documentation page I recommend -
>>>>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
>>>>>>>> Just set up HAProxy, and point your Java clients to its IP.
>>>>>>>>
>>>>>>>> The drawbacks to load-balancing on the java client side (yes, the
>>>>>>>> cluster object) instead of a standalone load balancer like HAProxy, are the
>>>>>>>> following:
>>>>>>>>
>>>>>>>> 1) Adding node means code changes (or at very least, config file
>>>>>>>> changes) rolled out to all your clients. Which turns out to be a pretty
>>>>>>>> serious hassle. Instead, HAProxy allows you to add or remove nodes without
>>>>>>>> changing any java code or config files.
>>>>>>>>
>>>>>>>> 2) Performance. We've ran many tests to compare performance, and
>>>>>>>> client-side load balancing results in significantly lower throughput than
>>>>>>>> you'd have using haproxy (or nginx). (Specifically, you actually want to
>>>>>>>> use the 'leastconn' load balancing algorithm with HAProxy, instead of round
>>>>>>>> robin).
>>>>>>>>
>>>>>>>> 3) The health check on the client side (so that the java load
>>>>>>>> balancer can tell when a remote node is down) is much less intelligent than
>>>>>>>> a dedicated load balancer would provide. With something like HAProxy, you
>>>>>>>> should be able to take down nodes with no ill effects for the client code.
>>>>>>>>
>>>>>>>> Now, if you load balance on the client side and you take a node
>>>>>>>> down, it's not supposed to stop working completely. (I'm not sure why it's
>>>>>>>> failing for you, we can investigate, but it'll be easier to just use a load
>>>>>>>> balancer). It should throw an error or two, but then start working again
>>>>>>>> (on the retry).
>>>>>>>>
>>>>>>>> Dmitri
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 2:45 PM, Vanessa Williams <
>>>>>>>> vanessa.williams at thoughtwire.ca> wrote:
>>>>>>>>
>>>>>>>>> Hi Dmitri, thanks for the quick reply.
>>>>>>>>>
>>>>>>>>> It was actually our sysadmin who tried the load balancer approach
>>>>>>>>> and had no success, late last evening. However I haven't discussed the gory
>>>>>>>>> details with him yet. The failure he saw was at the application level (i.e.
>>>>>>>>> failure to read a key), but I don't know a) how he set up the LB or b) what
>>>>>>>>> the Java exception was, if any. I'll find that out in an hour or two and
>>>>>>>>> report back.
>>>>>>>>>
>>>>>>>>> I did find this article just now:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/
>>>>>>>>>
>>>>>>>>> So I suppose we'll give those suggestions a try this morning.
>>>>>>>>>
>>>>>>>>> What is the drawback to having the client connect to all 4 nodes
>>>>>>>>> (the cluster client, I assume you mean?) My understanding from reading
>>>>>>>>> articles I've found is that one of the nodes going away causes that client
>>>>>>>>> to fail as well. Is that what you mean, or are there other drawbacks as
>>>>>>>>> well?
>>>>>>>>>
>>>>>>>>> If there's anything else you can recommend, or links other than
>>>>>>>>> the one above you can point me to, it would be much appreciated. We expect
>>>>>>>>> both node failure and deliberate node removal for upgrade, repair,
>>>>>>>>> replacement, etc.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vanessa
>>>>>>>>>
>>>>>>>>> On Wed, Oct 7, 2015 at 8:29 AM, Dmitri Zagidulin <
>>>>>>>>> dzagidulin at basho.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Vanessa,
>>>>>>>>>>
>>>>>>>>>> Riak is definitely meant to run behind a load balancer. (Or, at
>>>>>>>>>> the worst case, to be load-balanced on the client side. That is, all
>>>>>>>>>> clients connect to all 4 nodes).
>>>>>>>>>>
>>>>>>>>>> When you say "we did try putting all 4 Riak nodes behind a
>>>>>>>>>> load-balancer and pointing the clients at it, but it didn't help." -- what
>>>>>>>>>> do you mean exactly, by "it didn't help"? What happened when you tried
>>>>>>>>>> using the load balancer?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 7, 2015 at 1:57 PM, Vanessa Williams <
>>>>>>>>>> vanessa.williams at thoughtwire.ca> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all, we are still (for a while longer) using Riak 1.4 and the
>>>>>>>>>>> matching Java client. The client(s) connect to one node in the cluster
>>>>>>>>>>> (since that's all it can do in this client version). The cluster itself has
>>>>>>>>>>> 4 nodes (sorry, we can't use 5 in this scenario). There are 2 separate
>>>>>>>>>>> clients.
>>>>>>>>>>>
>>>>>>>>>>> We've tried both n_val = 3 and n_val=4. We achieve
>>>>>>>>>>> consistency-by-writes by setting w=all. Therefore, we only require one
>>>>>>>>>>> successful read (r=1).
>>>>>>>>>>>
>>>>>>>>>>> When all nodes are up, everything is fine. If one node fails,
>>>>>>>>>>> the clients can no longer read any keys at all. There's an exception like
>>>>>>>>>>> this:
>>>>>>>>>>>
>>>>>>>>>>> com.basho.riak.client.RiakRetryFailedException:
>>>>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>>>>
>>>>>>>>>>> Now, it isn't possible that Riak can't operate when one node
>>>>>>>>>>> fails, so we're clearly missing something here.
>>>>>>>>>>>
>>>>>>>>>>> Note: we did try putting all 4 Riak nodes behind a load-balancer
>>>>>>>>>>> and pointing the clients at it, but it didn't help.
>>>>>>>>>>>
>>>>>>>>>>> Riak is a high-availability key-value store, so... why are we
>>>>>>>>>>> failing to achieve high-availability? Any suggestions greatly appreciated,
>>>>>>>>>>> and if more info is required I'll do my best to provide it.
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>> Vanessa
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Vanessa Williams
>>>>>>>>>>> ThoughtWire Corporation
>>>>>>>>>>> http://www.thoughtwire.com
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> riak-users mailing list
>>>>>>>>>>> riak-users at lists.basho.com
>>>>>>>>>>>
>>>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20160222/ebfc5aa9/attachment-0002.html>


More information about the riak-users mailing list