Yokozuna inconsistent search results

Magnus Kessler mkessler at basho.com
Thu Mar 24 07:28:11 EDT 2016


Hi Oleksiy,

As a first step, I suggest simply expiring the Yokozuna AAE trees again if
the output of `riak-admin search aae-status` still suggests that no recent
exchanges have taken place. To do this, run `riak attach` on one node and
then enter

riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000).


Exit from the riak console with `Ctrl+G q`.

Depending on your settings and the amount of data, the full index should be
rebuilt within the next 2.5 days (for a cluster with ring size 128 and
default settings). You can monitor the progress with `riak-admin search
aae-status` and also in the logs, which should contain messages along the
lines of

2016-03-24 10:28:25.372 [info]
<0.4647.6477>@yz_exchange_fsm:key_exchange:179 Repaired 83055 keys during
active anti-entropy exchange of partition
1210306043414653979137426502093171875652569137152 for preflist
{1164634117248063262943561351070788031288321245184,3}
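
To follow the repair progress over time, the log messages can be tallied
with a short script. This is only a sketch: the pattern is modelled on the
single message quoted above (assumed to be one line in the actual log), the
function name is made up for illustration, and other Riak versions may word
the message differently.

```python
import re
from collections import defaultdict

# Pattern modelled on the one log message quoted in this mail.
REPAIR_RE = re.compile(
    r"Repaired (\d+) keys during active anti-entropy exchange "
    r"of partition (\d+) for preflist \{(\d+),(\d+)\}"
)


def repaired_per_partition(lines):
    """Sum the repaired-key counts per partition over an iterable of log lines."""
    totals = defaultdict(int)
    for line in lines:
        match = REPAIR_RE.search(line)
        if match:
            keys, partition = int(match.group(1)), int(match.group(2))
            totals[partition] += keys
    return dict(totals)


# The message from above, joined back into a single line.
sample_line = (
    "2016-03-24 10:28:25.372 [info] "
    "<0.4647.6477>@yz_exchange_fsm:key_exchange:179 Repaired 83055 keys during "
    "active anti-entropy exchange of partition "
    "1210306043414653979137426502093171875652569137152 for preflist "
    "{1164634117248063262943561351070788031288321245184,3}"
)
totals = repaired_per_partition([sample_line])
```

In practice you would pass an open log file instead of the sample list; the
location of the console log depends on your installation.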


Re-indexing can put additional strain on the cluster and may cause elevated
latency on a cluster already under heavy load. Please monitor the response
times while the cluster is re-indexing data.

If the cluster load allows it, you can force more rapid re-indexing by
changing a few parameters. Again at the `riak attach` console, run

riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna,
anti_entropy_build_limit, {4, 60000}], 5000).
riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna,
anti_entropy_concurrency, 5], 5000).

The build limit is a {count, period in milliseconds} tuple, so this will
allow up to 4 trees per node to be built/exchanged per 60-second window, with
up to 5 concurrent exchanges throughout the cluster. To return to the
default settings, use

riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna,
anti_entropy_build_limit, {1, 3600000}], 5000).
riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna,
anti_entropy_concurrency, 2], 5000).
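
To sanity-check a build limit before applying it, the tuple can be converted
into a per-hour rate. This is a rough sketch that assumes the tuple is read
as {count, period in milliseconds}; the function name is hypothetical and
not part of Riak.

```python
MS_PER_HOUR = 3_600_000


def builds_per_hour(count, period_ms):
    """Convert a {count, period_ms} build limit into tree builds per hour."""
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    return count * MS_PER_HOUR / period_ms


# One build per 3,600,000 ms is one build per hour; four builds per
# 60,000 ms is considerably more aggressive.
one_per_hour = builds_per_hour(1, 3_600_000)
four_per_minute = builds_per_hour(4, 60_000)
```

Comparing the two rates gives a feel for how much extra load a changed
limit may place on each node.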


If the cluster still doesn't make any progress with automatically
re-indexing data, the next steps are pretty much what you already
suggested: drop the existing index and re-index from scratch. I'm
assuming that losing the indexes temporarily is acceptable to you at this
point.

Using any client API that supports RpbYokozunaIndexDeleteReq, you can drop
the index from all Solr instances, which immediately discards any data
stored there. Next, you'll have to re-create the index. I have tried this
with the Python client, where I deleted the index and re-created it with
the same, already uploaded schema:

from riak import RiakClient

c = RiakClient()
c.delete_search_index('my_index')
c.create_search_index('my_index', 'my_schema')

Note that simply deleting the index does not remove its existing
association with any bucket or bucket type. Any PUT operations on these
buckets will lead to indexing failures being logged until the index has
been recreated. However, this also means that no separate operation in
`riak-admin` is required to associate the newly recreated index with the
buckets again.

After recreating the index, expire the trees as explained previously.

Let us know if this solves your issue.

Kind Regards,

Magnus


On 24 March 2016 at 08:44, Oleksiy Krivoshey <oleksiyk at gmail.com> wrote:

> This is how things are looking after two weeks:
>
> - there are no solr indexing issues for a long period (2 weeks)
> - there are no yokozuna errors at all for 2 weeks
> - there is an index with all empty schema, just _yz_* fields, objects
> stored in a bucket(s) are binary and so are not analysed by yokozuna
> - same yokozuna query repeated gives different number for num_found,
> typically the difference between real number of keys in a bucket and
> num_found is about 25%
> - number of keys repaired by AAE (according to logs) is about 1-2 per few
> hours (number of keys "missing" in index is close to 1,000,000)
>
> Should I now try to delete the index and yokozuna AAE data and wait
> another 2 weeks? If yes - how should I delete the index and AAE data?
> Will RpbYokozunaIndexDeleteReq be enough?
>
>
>
-- 
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431