Interesting problem with riaksearch indexes
francisco.treacy at gmail.com
Tue Aug 23 06:53:00 EDT 2011
Hmmm, definitely not great... but thanks, Ryan, for the explanation.
2011/8/23 Ryan Zezeski <rzezeski at basho.com>
> The reason for the nondeterministic behavior is two-fold.
> 1. For performance reasons, Search only ever reads from 1 node (R=1).
> 2. As an attempt to balance load and reduce vnode contention, this node is
> selected randomly.
> This is why it works 50% of the time. Because now, for each index entry, 2
> partitions have the data and 1 does not. So depending on which one you hit
> you'll get the data or not. Furthermore, this behavior will continue until
> you reindex because the index in Search has no form of anti-entropy such as
> read repair or merkle trees.
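
The effect described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name, the three-replica layout and the uniform random pick are made up for illustration and do not model how Search actually chooses a partition. It shows only that a single-replica read (R=1) with no anti-entropy keeps returning a mix of hits and misses for as long as one replica lacks the entry.

```python
import random

def search_hit_rate(replica_has_entry, queries, seed=0):
    """Fraction of queries that find the index entry when each query
    reads from exactly one randomly chosen replica (R=1).

    replica_has_entry: list of booleans, one per replica (hypothetical model).
    """
    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    hits = sum(1 for _ in range(queries) if rng.choice(replica_has_entry))
    return hits / queries

# Assumed layout: two replicas hold the entry, one does not.
replicas = [True, True, False]
rate = search_hit_rate(replicas, 10_000)
print(f"fraction of queries that found the entry: {rate:.2f}")
```

Under this uniform toy model the hit fraction tends toward 2/3 rather than the roughly 50% reported in the thread, so the exact rate clearly depends on details not modeled here. The point is that nothing in the read path ever repairs the missing replica, so the rate stays below 100% until the index is rebuilt.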
> In the future the easiest thing is to replace that lost node as quickly as
> possible. While it's down the other nodes will keep track of the new index
> entries and will transfer them during data handoff when the node comes alive
> again. By removing the node you've changed the ring and your only option is
> to reindex as you are already doing. I realize that bringing that node up
> or replacing it may not have been an option but this is the only way to
> avoid this problem with Search as it stands today.
> I realize this sucks and isn't in line with Riak's more fault tolerant
> behavior. It does suck. I hate the fact that I have to write this email
> basically telling you this part of Search is broken, IMO. I want to see it
> addressed and I'm sure I'm not the only one. Right now our internal ticket
> board is buzzing in anticipation of the new release. After that there is a
> lot of love I want to give Search, this particular issue included. I'd say
> it's only a matter of time.
> On Fri, Aug 19, 2011 at 2:46 PM, Gordon Tillman <gtillman at mezeo.com> wrote:
>> Greetings all,
>> After an extended datacenter power outage, a 3-node Riak cluster shut
>> down. When the power was restored, two of the three nodes came back up.
>> Don't know what is going on with the third node. But in the meantime, have
>> removed the dead node from the ring. The two remaining nodes show a good
>> ringready status.
>> The problem is that the search indexes appear to be in an inconsistent
>> state. For example, I can issue the same solr query on one of the nodes and
>> 50% of the time it returns correct results. The other times it returns an
>> empty result set.
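
One way to confirm a symptom like this is to repeat the same query and tally the distinct result counts. The sketch below is a hedged illustration, not a tool from this thread: the host, port, index name and query in the comments are placeholders, and it assumes the Solr-style endpoint at `/solr/<index>/select` returns JSON with a `response.numFound` field when `wt=json` is requested. The live call is left in comments so the block runs without a cluster; a stub stands in for it.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

def fetch_num_found(base_url, index, query):
    """Run one Solr-style query and return the reported hit count.

    Assumes a JSON body shaped like {"response": {"numFound": N, ...}}.
    """
    params = urllib.parse.urlencode({"q": query, "wt": "json"})
    url = f"{base_url}/solr/{index}/select?{params}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = json.load(resp)
    return body["response"]["numFound"]

def tally_results(fetch, attempts):
    """Call fetch() repeatedly and count how often each result appears."""
    return Counter(fetch() for _ in range(attempts))

# Against a live node (placeholder values, adjust to your setup):
#   tally_results(
#       lambda: fetch_num_found("http://127.0.0.1:8098", "mybucket", "field:value"),
#       20,
#   )

# Offline demonstration with a stub that alternates between 3 hits and none.
stub = iter([3, 0, 3, 0])
counts = tally_results(lambda: next(stub), 4)
print(dict(counts))
```

More than one key in the tally for an unchanged data set suggests that different replicas are answering with different index contents.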
>> I'm in the process of re-indexing the bucket in question (a very
>> time-consuming affair). But I wonder if anyone could shed some light on
>> this situation as to why it occurred in the first place and if there is
>> anything that can be done to keep this from happening again in the future.
>> Many thanks,
>> riak-users mailing list
>> riak-users at lists.basho.com