Yokozuna inconsistent search results
mkessler at basho.com
Mon Mar 14 11:20:11 EDT 2016
Would you mind sharing the output of 'riak-debug' from all nodes? You can
upload the files to a location of your choice and PM me the details. As far
as we are aware we have fixed all previously existing issues that would
prevent a full YZ AAE tree rebuild from succeeding when non-indexable data
was present. However, the logs may still contain hints that could help us
identify the root cause of your issue.
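If it helps, a typical collection run looks something like this (the archive name and upload destination are illustrative, not exact):

```shell
# Run on each Riak node; assumes riak-debug is on PATH.
riak-debug

# It writes an archive (named after the node, e.g. *-riak-debug.tar.gz)
# in the current directory; copy it to a host you can share from:
scp *-riak-debug.tar.gz user@files.example.com:/uploads/
```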
On 14 March 2016 at 09:45, Oleksiy Krivoshey <oleksiyk at gmail.com> wrote:
> I would like to continue as this seems to me like a serious problem, on a
> bucket with 700,000 keys the difference in num_found can be up to 200,000!
> And that's a search index that doesn't index, analyse or store ANY of the
> document fields; the schema has only the required _yz_* fields and nothing else.
> I have tried deleting the search index (with PBC call) and tried expiring
> AAE trees. Nothing helps. I can't get consistent search results.
> Please help.
> On 11 March 2016 at 18:18, Oleksiy Krivoshey <oleksiyk at gmail.com> wrote:
>> Hi Fred,
>> This is a production environment, but I can delete the index. However, this
>> index covers ~3500 buckets and there are probably 10,000,000 keys.
>> The index was created after the buckets. The schema for the index is just
>> the basic required fields (_yz_*) and nothing else.
>> Yes, I'm willing to resolve this. When you say to delete chunks_index, do
>> you mean the simple RpbYokozunaIndexDeleteReq, or is something else required?
>> On 11 March 2016 at 17:08, Fred Dushin <fdushin at basho.com> wrote:
>>> Hi Oleksiy,
>>> This is definitely pointing to an issue either in the coverage plan
>>> (which determines the distributed query you are seeing) or in the data you
>>> have in Solr. I am wondering if it is possible that you have some data in
>>> Solr that is causing the rebuild of the YZ AAE tree to incorrectly
>>> represent what is actually stored in Solr.
>>> What you did was to manually expire the YZ (Riak Search) AAE trees,
>>> which caused them to rebuild from the entropy data stored in Solr. Another
>>> thing we could try (if you are willing) would be to delete the
>>> 'chunks_index' data in Solr (as well as the Yokozuna AAE data), and then
>>> let AAE repair the missing data. What Riak will essentially do is compare
>>> the KV hash trees with the YZ hash trees (which will be empty), detect the
>>> data that is missing from Solr, and add it to Solr as a result. This would
>>> effectively result in re-indexing all of your data, but we are only talking
>>> about ~30k entries (times 3, presumably, if your n_val is 3), so that
>>> shouldn't take too much time, I wouldn't think. There is even some
>>> configuration you can use to accelerate this process, if necessary.
>>> Is that something you would be willing to try? It would result in some
>>> query downtime. Is this production data or a test environment?
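The exchange-and-repair cycle Fred describes can be sketched in miniature. This is a hypothetical model (plain dicts standing in for the hash trees, names invented for illustration), not Yokozuna's actual implementation:

```python
import hashlib

def leaf_hash(value: str) -> str:
    """Stand-in for a Merkle-tree leaf hash of an object's value."""
    return hashlib.sha256(value.encode()).hexdigest()

def missing_or_divergent(kv_tree: dict, yz_tree: dict) -> list:
    """Keys whose KV hash is absent from, or differs in, the YZ tree."""
    return [k for k, h in kv_tree.items() if yz_tree.get(k) != h]

# KV side: hashes of what Riak KV actually stores.
objects = {"key%d" % i: "value%d" % i for i in range(5)}
kv_tree = {k: leaf_hash(v) for k, v in objects.items()}

# YZ side: empty, as it would be after deleting the index and its AAE data.
yz_tree = {}
solr_index = {}

# AAE exchange: every key shows up as missing, so each one is re-indexed
# from KV and the YZ tree is updated to match.
for key in missing_or_divergent(kv_tree, yz_tree):
    solr_index[key] = objects[key]
    yz_tree[key] = kv_tree[key]

# After repair the trees agree and the index holds all five entries.
assert missing_or_divergent(kv_tree, yz_tree) == []
assert len(solr_index) == 5
```

Because the YZ tree starts empty, every key is reported missing, which is why this amounts to a full re-index of the data covered by the index.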
Client Services Engineer
Basho Technologies Limited
Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431