Achieving 100% consistency

Russell Brown russelldb at basho.com
Mon Aug 29 09:47:33 EDT 2011


I'm having trouble simply posting a response to this list. This is my 3rd attempt so if anyone is being spammed, I'm sorry, but I'm just not seeing the replies and I keep getting rejection notices from the list manager…

Here we go again:

Hi Lukas,

Sorry it has taken me so long to join this party; I have been away and I'm just catching up.

On 29 Aug 2011, at 09:05, Lukas Schulze wrote:

> A friend of mine found out how it could work: I have to delete the entry first and after storing it in the database I've to check the result.
> It's not the prettiest code I've written before, but without the while-loop it will work for nearly 97% of my tuples. With the while-loop everything works fine.
> 
> ======================================
> RiakObject riakIndex = new RiakObject(attrName, attrValue, indexString.getBytes());
> riakIndex.setContentType("text/plain; charset=UTF-8");
> try{
> 	//have to be deleted because of cache (?)

No, there is no cache.

> 	riakPBCClient.delete(attrName, attrValue);
> 	riakPBCClient.store(riakIndex);
> 	RiakObject[] fetched = riakPBCClient.fetch(attrName, attrValue);
> 	//check whether the entry is correctly stored in the database
> 	while(fetched.length == 0) {//try it until it works...
> 		riakPBCClient.store(riakIndex);
> 		fetched = riakPBCClient.fetch(attrName, attrValue);
> 	}
> 	

I don't understand what you're doing here. If you want the object you stored, why not just set returnBody=true for the store operation?

> 	//fetched entry doesn't match our stored one
> 	if(!riakIndex.getValue().equals(fetched[0].getValue())) {
> 		System.err.println("index match: failed -> " + attrName + "." + attrValue);
> 	}

Do you really need to do this with n_val, r, w and dw all set to 1, allow_mult=false and no other clients? Something is wrong. Please let me know what version of the RJC you are using, and what version of Riak on what Erlang/OS/arch etc. I'd like to try this at home, as storing and retrieving data is what we are all about and this should *just work* (we have integration tests like this and they *DO* work).

Forgive me if I am teaching you to suck eggs here, but right at the start there, when you store the "riakIndex" object, is there a chance it already exists? 
Does the bucket have allow_mult set to false? 
Do you fetch before your store (if you use the new RJC it does a fetch before a store to handle the vector clock for you)? 
Are you attempting to overwrite an existing value but providing a stale vclock? 
If you have allow_mult set to false and try to write your object with a stale vclock, your write will be silently dropped. Likewise, if you provide *no* vclock but a clientId that already has a write on that key, your write will be dropped. I realise this can be confusing at first, but does it explain the behaviour you're seeing?

For fun, try this:

curl -v -X GET http://127.0.0.1:8098/riak/b/k

(404)

curl -v -X PUT http://127.0.0.1:8098/riak/b/k -d"a"

(204)

curl -v -X GET http://127.0.0.1:8098/riak/b/k

(200) a

curl -v -X PUT http://127.0.0.1:8098/riak/b/k -d"b"

curl -v -X GET http://127.0.0.1:8098/riak/b/k

(200) b

BUT do the same thing with a clientId header set:

curl -v -X GET http://127.0.0.1:8098/riak/c/k -H"X-Riak-ClientId: pete"

(404)

curl -v -X PUT http://127.0.0.1:8098/riak/c/k -d"a" -H"X-Riak-ClientId: pete"

(204)

curl -v -X GET http://127.0.0.1:8098/riak/c/k -H"X-Riak-ClientId: pete"

(200) a

curl -v -X PUT http://127.0.0.1:8098/riak/c/k -d"b" -H"X-Riak-ClientId: pete"

curl -v -X GET http://127.0.0.1:8098/riak/c/k

(200) a (HUH?)

Does that explain any of what you see?

A well-behaved client will always fetch before store, and will use the vclock/clientId. The new RJC does this for you behind the scenes. See http://blog.basho.com/2011/07/14/The-All-New-Riak-Java-Client/ and https://github.com/basho/riak-java-client/blob/master/README.org for more details.
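
A minimal sketch of why the second PUT with a clientId in the curl sequence above is dropped, and why fetching first fixes it. This is a toy in-memory model, not Riak and not the RJC: the class ToyVclockStore and its methods are hypothetical names invented for illustration, and the "strictly descends" rule is a simplification of the behaviour described in this mail.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of one key in a store that tracks a vector clock (clientId -> counter).
// Hypothetical illustration only; this is not the Riak or RJC API.
public class ToyVclockStore {
    private String value = null;
    private Map<String, Integer> clock = new HashMap<>();

    // What a fetch would hand back: a copy of the stored vector clock.
    public Map<String, Integer> fetchClock() {
        return new HashMap<>(clock);
    }

    public String fetchValue() {
        return value;
    }

    // context is the clock the client last saw (empty if it never fetched).
    // Returns true if the write is accepted, false if it is dropped.
    public boolean put(String clientId, Map<String, Integer> context, String newValue) {
        Map<String, Integer> proposed = new HashMap<>(context);
        proposed.merge(clientId, 1, Integer::sum); // increment this client's counter
        if (!strictlyDescends(proposed, clock)) {
            return false; // stale or missing vclock: the write is dropped
        }
        value = newValue;
        clock = proposed;
        return true;
    }

    // a strictly descends b if a has seen everything b has, plus something more.
    static boolean strictlyDescends(Map<String, Integer> a, Map<String, Integer> b) {
        for (Map.Entry<String, Integer> e : b.entrySet()) {
            if (a.getOrDefault(e.getKey(), 0) < e.getValue()) {
                return false;
            }
        }
        return !a.equals(b);
    }

    public static void main(String[] args) {
        ToyVclockStore store = new ToyVclockStore();
        System.out.println(store.put("pete", new HashMap<>(), "a"));    // true
        System.out.println(store.put("pete", new HashMap<>(), "b"));    // false: no vclock, same clientId
        System.out.println(store.put("pete", store.fetchClock(), "b")); // true: fetched first
        System.out.println(store.fetchValue());                         // b
    }
}
```

The second call mirrors the PUT of "b" with "X-Riak-ClientId: pete" and no vclock; the third mirrors a client that fetches before it stores.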

What is your use case? When you go into production will you have multiple nodes? Will you have allow_mult set to true? Let me know what I can do to help, 'cos it is great to see someone else using the Java client and I want to make it easy for you to do so. 

It would be far simpler to start defining your strategy for resolving conflicting writes/sibling values than it would be to try to achieve 100% consistency in a distributed, fault-tolerant database.
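
As one possible resolution strategy for the index values described earlier in this thread (lists of user ids), conflicting siblings could be merged by taking their union. A minimal sketch, assuming ids are only ever added to an index and never removed; the class and method names are hypothetical and nothing here calls the Riak client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: merges sibling id lists into one list without duplicates.
public class IndexSiblingMerge {

    // Union of all siblings, keeping the order in which ids are first seen.
    static List<Integer> mergeSiblings(List<List<Integer>> siblings) {
        Set<Integer> merged = new LinkedHashSet<>();
        for (List<Integer> sibling : siblings) {
            merged.addAll(sibling);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Two writers each added different ids to the same stale copy of the index.
        List<Integer> a = Arrays.asList(1, 5, 8, 13);
        List<Integer> b = Arrays.asList(1, 5, 8, 2, 7);
        System.out.println(mergeSiblings(Arrays.asList(a, b))); // [1, 5, 8, 13, 2, 7]
    }
}
```

A union cannot tell a removed id from one that was never added, so an index that supports deletes would need a different strategy.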

Cheers

Russell

> 	
> }
> ======================================
> 
> Best regards
> Lukas
> 
> 
> On Mon, Aug 29, 2011 at 9:20 AM, Lukas Schulze <info at lukas-schulze.de> wrote:
> Hi,
> 
> thank you for your answers.
> I know that Riak is designed for running on distributed servers.
> But what about adding lots of data where every tuple depends on another one?
> I thought that having only 1 node and disabling replication could solve my problem of always getting the latest data from Riak.
> 
> Is there another way to achieve 100% consistency in a riak database after a very short time?
> 
> Best regards
> Lukas
> 
> 
> 
> On Sat, Aug 27, 2011 at 5:43 PM, Ian Plosker <ian at basho.com> wrote:
> Jonathan,
> 
> Excuse me, that last message should have been addressed to you.
> 
> Ian Plosker
> Developer Advocate
> Basho Technologies
> 
> 
> On Aug 27, 2011, at 11:39 AM, Ian Plosker wrote:
> 
>> Lukas,
>> 
>> Yes, even for dev you'd be best advised to develop and test your application with the same or similar number of nodes and n, r, and w settings as you would in production. It's good practice to develop applications in a dev/test environment that mirrors the production environment as much as is reasonable/feasible. You can run a single node cluster, but note that this isn't a configuration you'll see in a production.
>> 
>> Ian Plosker
>> Developer Advocate
>> Basho Technologies
>> 
>> 
>> 
>> On Aug 27, 2011, at 5:33 AM, Jonathan Langevin wrote:
>> 
>>> Even for development-purposes only? Otherwise it seems data would be written n times to the same machine, which is needless in a dev environment with low storage specs...
>>> 
>>> 
>>> Jonathan Langevin
>>> Systems Administrator
>>> Loom Inc.
>>> Wilmington, NC: (910) 241-0433 - jlangevin at loomlearning.com - www.loomlearning.com - Skype: intel352
>>> 
>>> 
>>> 
>>> On Fri, Aug 26, 2011 at 5:01 PM, Ian Plosker <ian at basho.com> wrote:
>>> Lukas,
>>> 
>>> Also, we don't advise that you run single node clusters. Riak is designed to be used in clusters of at least 3 nodes. You can run a multi-node cluster on a single development machine by downloading the Riak source, and running "make devrel". Take a look at the Riak Fast Track (http://wiki.basho.com/The-Riak-Fast-Track.html) for more details.
>>> 
>>> Ian Plosker
>>> Developer Advocate
>>> Basho Technologies
>>> 
>>> On Aug 26, 2011, at 3:17 PM, Lukas Schulze wrote:
>>> 
>>>> I'm doing some simple tests with Riak and tried to build something like an index.
>>>> Therefore I created new buckets for some attributes like "name", "street" and "city".
>>>> One entry in the index-bucket "name" is for example "Mueller" and the value contains all user ids, formatted as a JSON string: "{id:[1,5,8,13,2,7]}"
>>>> The java objects are saved as JSON strings in a separate bucket "users", the keys in this bucket are the user-ids, the values are the JSON strings.
>>>> 
>>>> When I add 200 users via Java and the RiakPBC client, on every loop iteration I fetch the index, add the new user id and store it again in Riak.
>>>> But Java is too fast, so I receive an old version of the index.
>>>> 
>>>> Because I've only one node I set the n-value to 1, r = 1, w = 1 and dw = 1.
>>>> But I have to wait nearly 2 seconds to be reasonably sure of getting the correct response. (the computer isn't a high-end machine ;-) )
>>>> 
>>>> Is it possible to be sure that the data will be saved permanently and I can continue adding users?
>>>> Are there any caching methods I can configure?
>>>> Can I set the default n-value to 1 so that every newly created bucket will have this value?
>>>> Does Riak have any kind of indexes or is it possible to implement it a better way?
>>>> 
>>>> In my first version I saved all users in one bucket and iterated over all of them to find the correct one. But every single request from the Java Service to Riak took nearly 200ms. For a huge number of entries (10,000) this isn't practical. Therefore I tried to implement my own indexes.
>>>> 
>>>> The main focus of my question is getting rid of the inconsistent reads.
>>>> 
>>>> Thank you.
>>>> 
>>>> Best Regards
>>>> Lukas
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>> 
>>> 

