Yokozuna inconsistent search results

Magnus Kessler mkessler at basho.com
Mon Mar 14 11:20:11 EDT 2016

Hi Oleksiy,

Would you mind sharing the output of 'riak-debug' from all nodes? You can
upload the files to a location of your choice and PM me the details. As far
as we are aware we have fixed all previously existing issues that would
prevent a full YZ AAE tree rebuild from succeeding when non-indexable data
was present. However, the logs may still contain hints that help us
identify the root cause of your issue.

Many Thanks,


On 14 March 2016 at 09:45, Oleksiy Krivoshey <oleksiyk at gmail.com> wrote:

> I would like to continue with this, as it seems to me like a serious
> problem: on a bucket with 700,000 keys the difference in num_found can be
> up to 200,000! And that's a search index that doesn't index, analyse or
> store ANY of the document fields; the schema has only the required _yz_*
> fields and nothing else.
> I have tried deleting the search index (with a PBC call) and tried expiring
> the AAE trees. Nothing helps. I can't get consistent search results from
> Yokozuna.
> Please help.
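
One way to put numbers on the inconsistency described above is to repeat the
same query many times and record how numFound varies. The sketch below is
only an illustration, not something from this thread: it assumes Riak's HTTP
search endpoint is reachable on the default port 8098 and that the index is
named chunks_index; the helper names (num_found, sample, summarize) are made
up for this example.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter


def num_found(base_url, index, query="*:*"):
    """Run one search via Riak's HTTP search endpoint; return Solr's numFound."""
    # rows=0 asks Solr for the count only, without returning documents.
    params = urllib.parse.urlencode({"wt": "json", "q": query, "rows": 0})
    url = f"{base_url}/search/query/{index}?{params}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        body = json.load(resp)
    return body["response"]["numFound"]


def sample(base_url, index, n=20, query="*:*"):
    """Repeat the same query n times and collect the numFound values."""
    return [num_found(base_url, index, query) for _ in range(n)]


def summarize(counts):
    """Summarize numFound values collected from repeated identical queries."""
    histogram = Counter(counts)
    return {
        "min": min(counts),
        "max": max(counts),
        "spread": max(counts) - min(counts),
        "distinct": len(histogram),
        "histogram": dict(histogram),
    }
```

Calling summarize(sample("http://127.0.0.1:8098", "chunks_index")) against a
healthy index should report a spread of 0 while no writes are in flight. A
non-zero spread with only a few distinct values would be consistent with
different coverage plans hitting replicas that disagree.
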
> On 11 March 2016 at 18:18, Oleksiy Krivoshey <oleksiyk at gmail.com> wrote:
>> Hi Fred,
>> This is a production environment, but I can delete the index. However, this
>> index covers ~3500 buckets and there are probably 10,000,000 keys.
>> The index was created after the buckets. The schema for the index is just
>> the basic required fields (_yz_*) and nothing else.
>> Yes, I'm willing to resolve this. When you say to delete chunks_index, do
>> you mean the simple RpbYokozunaIndexDeleteReq, or is something else required?
>> Thanks!
>> On 11 March 2016 at 17:08, Fred Dushin <fdushin at basho.com> wrote:
>>> Hi Oleksiy,
>>> This is definitely pointing to an issue either in the coverage plan
>>> (which determines the distributed query you are seeing) or in the data you
>>> have in Solr.  I am wondering if it is possible that you have some data in
>>> Solr that is causing the rebuild of the YZ AAE tree to incorrectly
>>> represent what is actually stored in Solr.
>>> What you did was to manually expire the YZ (Riak Search) AAE trees,
>>> which caused them to rebuild from the entropy data stored in Solr.  Another
>>> thing we could try (if you are willing) would be to delete the
>>> 'chunks_index' data in Solr (as well as the Yokozuna AAE data), and then
>>> let AAE repair the missing data.  What Riak will essentially do is compare
>>> the KV hash trees with the YZ hash trees (which will be empty), see that the
>>> data is missing in Solr, and add it to Solr as a result.  This would
>>> effectively result in re-indexing all of your data, but we are only talking
>>> about ~30k entries (times 3, presumably, if your n_val is 3), so that
>>> shouldn't take too much time, I wouldn't think.  There is even some
>>> configuration you can use to accelerate this process, if necessary.
>>> Is that something you would be willing to try?  It would result in
>>> downtime for queries.  Is this production data or a test environment?
>>> -Fred
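
The repair step Fred describes can be pictured with a toy model. This is a
deliberately simplified sketch, not Riak's implementation: real AAE exchanges
hash trees segment by segment, whereas here each side is a flat dict mapping
a key to its object hash. The function names are hypothetical, and stale
entries that exist only on the YZ side are not handled.

```python
def keys_to_repair(kv_hashes, yz_hashes):
    """Return keys whose hash is missing from, or differs on, the YZ side."""
    return sorted(
        key for key, obj_hash in kv_hashes.items()
        if yz_hashes.get(key) != obj_hash
    )


def repair(kv_hashes, yz_hashes):
    """Bring the YZ side in line with KV for every divergent key.

    In this toy model "re-indexing" just copies the hash across; in Riak the
    object would be sent to Solr again.
    """
    repaired = keys_to_repair(kv_hashes, yz_hashes)
    for key in repaired:
        yz_hashes[key] = kv_hashes[key]
    return repaired
```

With an empty YZ side, as after deleting the index data and the Yokozuna AAE
data, every KV key is reported as divergent, which is why the procedure
amounts to a full re-index.
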
>>> --
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431