'not found' after join

Greg Nelson grourk at dropcam.com
Mon May 2 23:36:17 EDT 2011


Ok, thank you Ryan. I'm glad that what I'm seeing is expected behavior, although it is a little surprising.

As Kyle asked in a parallel reply on this thread: is it possible for the client to distinguish between these different scenarios where a notfound can be returned? My initial thought is that when a node is being added, clients can do reads with r=1 and retry GETs that 404...?
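
A minimal sketch of the retry idea above, assuming Riak's HTTP interface is reachable at `/riak/<bucket>/<key>` and accepts an `r` query parameter. The function names, retry count, and delay are hypothetical choices for illustration, not anything Riak prescribes. Note that this does not distinguish a transient notfound from a key that really is missing; it only bounds how long the client waits before accepting the 404.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def fetch_once(base_url, bucket, key, r=1, timeout=5.0):
    """Issue a single GET and return (status, body).

    A 404 is returned as a status rather than raised, so the caller
    can decide whether to retry. Other HTTP errors still propagate.
    """
    url = "{}/riak/{}/{}?r={}".format(
        base_url,
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(key, safe=""),
        r,
    )
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return 404, b""
        raise


def get_with_retry(fetch, attempts=5, delay=0.2, sleep=time.sleep):
    """Call fetch() until it stops returning 404 or attempts run out.

    fetch is any zero-argument callable returning (status, body).
    The attempts and delay defaults are arbitrary (assumption).
    """
    status, body = fetch()
    for _ in range(attempts - 1):
        if status != 404:
            break
        sleep(delay)
        status, body = fetch()
    return status, body
```

With hypothetical host, bucket, and key names, a call might look like `get_with_retry(lambda: fetch_once("http://127.0.0.1:8098", "videos", "clip42"))`. Since the first notfound triggers read repair, a short delay before the second attempt gives that repair a chance to land.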

On Monday, May 2, 2011 at 8:14 PM, Ryan Zezeski wrote:
> Greg,
> 
> Your expectations are fair: just because you added a node doesn't mean Riak should return notfounds. Unfortunately, we aren't quite there yet. This is a side effect of how Riak currently implements handoff: it immediately updates and gossips the ring, causing many partitions to hand off immediately. If a request comes in that relies on these partitions, it will get a notfound and perform read repair. Your situation is amplified by the fact that you are going from 3 nodes to 4; more vnode shuffling occurs because of the small cluster size.
> 
> We're well aware of this and have it on our radar for improvement in a future release.
> 
> All this said, your data will be eventually consistent. That is, all your data will eventually be handed off and things will work as normal. It's only during the handoff that you _may_ encounter notfounds. In this case it would be best to add a new node to your cluster at the lowest-load times, and if you can spare additional hardware, starting with a few more nodes is an even easier option.
> 
> -Ryan
> 
> On Mon, May 2, 2011 at 9:48 PM, Greg Nelson <grourk at dropcam.com> wrote:
> > Hello riak users! 
> > 
> > I have a 4 node cluster that started out as 3 nodes. ring_creation_size = 2048, target_n_val is default (4), and all buckets have n_val = 3. 
> > 
> > When I joined the 4th node, for a few minutes some GETs were returning 'not found' for data that was already in riak. Eventually the data was returned, due to read repair I would assume. Is this expected? It seems that 'not found' and read repairs should only happen when something goes wrong, like a node goes down. Not when adding a node to the cluster, which is supposed to be part of normal operation! 
> > 
> > Any help or insight is appreciated!
> > 
> > Greg 