Different numFound request to riak search

Roma Lakotko roman at lakotko.ru
Fri Mar 13 03:23:27 EDT 2015


Hello.

Good news. After I removed the folder, ran yz_entropy_mgr:init([]), and
then re-saved the objects, it started returning the correct numFound (tested
on 2 buckets, about 6000 requests). This was tested on the dev instance.
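
For anyone who wants to repeat this kind of check, here is a minimal sketch
in Python of issuing the same query many times and tallying the numFound
values. The URL is the one from the original report; the helper names
(fetch_num_found, count_num_found, is_consistent) are made up for this
example and are not part of any Riak client.

```python
import json
from urllib.request import urlopen

# URL from the original report; host, port, and index name are specific
# to that setup and will differ elsewhere.
QUERY_URL = ("http://localhost:8098/search/query/assets"
             "?wt=json&q=type:*&sort=_yz_rk%20asc")


def fetch_num_found(url=QUERY_URL):
    """Run one search query and return the numFound field of the response."""
    with urlopen(url) as resp:
        body = json.load(resp)
    return body["response"]["numFound"]


def count_num_found(fetch, attempts):
    """Call fetch() `attempts` times and tally how often each value appears."""
    tally = {}
    for _ in range(attempts):
        value = fetch()
        tally[value] = tally.get(value, 0) + 1
    return tally


def is_consistent(tally):
    """True when every request reported the same numFound."""
    return len(tally) <= 1
```

Calling is_consistent(count_num_found(fetch_num_found, 100)) against a
running node should return True once the index is healthy, provided nothing
is writing to the bucket during the run.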

So, as I understand it, I need to perform the same operation on each node
of the production cluster?
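
The re-save step can be sketched roughly as below. This is only an
illustration, assuming the objects live in the default bucket type and are
reachable over Riak's HTTP interface on localhost:8098. The names http_call
and resave_bucket are hypothetical helpers. Listing all keys is expensive on
a large cluster, and objects with siblings are not handled here.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE = "http://localhost:8098"  # assumed address of one Riak node


def http_call(method, url, headers=None, body=None):
    """Small urllib wrapper returning (lower-cased headers, body bytes)."""
    req = Request(url, data=body, headers=headers or {}, method=method)
    with urlopen(req) as resp:
        return {k.lower(): v for k, v in resp.headers.items()}, resp.read()


def resave_bucket(bucket, call=http_call, base=BASE):
    """Read every object in `bucket` and write it back unchanged.

    Returns the number of keys that were re-saved.
    """
    bucket_url = f"{base}/buckets/{quote(bucket, safe='')}/keys"
    _, listing = call("GET", bucket_url + "?keys=true")
    keys = json.loads(listing)["keys"]
    for key in keys:
        url = f"{bucket_url}/{quote(key, safe='')}"
        headers, value = call("GET", url)
        put_headers = {
            "Content-Type": headers.get("content-type",
                                        "application/octet-stream"),
        }
        # Pass the vector clock back so the write is not treated as a
        # conflicting sibling.
        if "x-riak-vclock" in headers:
            put_headers["X-Riak-Vclock"] = headers["x-riak-vclock"]
        call("PUT", url, put_headers, value)
    return len(keys)
```

Because the HTTP call is passed in as a parameter, the loop can be tried
against a stub before pointing it at a real node.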

And is this bug already patched? Or could I hit the same problem again in
the future after adding/removing nodes?

Thanks,
Roman Lakotko.


2015-03-12 18:56 GMT+03:00 Roma Lakotko <roman at lakotko.ru>:

> No, it's a simple Riak Search HTTP request.
> On 12 March 2015 at 18:54, "Zeeshan Lakhani" <zlakhani at basho.com>
> wrote:
>
> Are you running mapreduce with Solr queries?
>>
>>
>> On Mar 12, 2015, at 11:50 AM, Roma Lakotko <roman at lakotko.ru> wrote:
>>
>> I don't see any Solr errors. But every 10-20 minutes on prod and once a
>> day on dev I see strange errors:
>>
>> 2015-03-11 09:18:10.668 [error] <0.234.0> Supervisor
>> riak_pipe_fitting_sup had child undefined started with
>> riak_pipe_fitting:start_link() at <0.12060.2> exit with reason noproc in
>> context shutdown_error
>> 2015-03-12 13:12:05.200 [error] <0.379.0> Supervisor riak_kv_mrc_sink_sup
>> had child undefined started with riak_kv_mrc_sink:start_link() at
>> <0.6601.1> exit with reason noproc in context shutdown_error
>>
>> For both the prod and dev instances, the values are:
>>
>> anti_entropy_build_limit -> {ok,{1,3600000}}
>> anti_entropy_concurrency -> {ok,2}
>> anti_entropy_tick -> undefined
>>
>> I deleted the data folder and ran the init method; I'll report the results
>> after it rebuilds the trees.
>>
>> 2015-03-12 18:22 GMT+03:00 Zeeshan Lakhani <zlakhani at basho.com>:
>>
>>> Are you noticing any Solr errors in the logs?
>>>
>>> For your container instance, you can attempt to clear the AAE trees and
>>> force a rebuild by removing the entropy directories in `./data/yz_anti_entropy`
>>> and running `yz_entropy_mgr:init([])` via `riak attach`.  Or, you can
>>> let AAE occur naturally (after removing the entropy data) and up the
>>> concurrency/build_limit/tick (using set_env). You can see what your
>>> current settings are by calling...
>>>
>>> ```
>>> riak_core_util:rpc_every_member_ann(application, get_env, [riak_kv,
>>> anti_entropy_build_limit],infinity).
>>> riak_core_util:rpc_every_member_ann(application, get_env, [riak_kv,
>>> anti_entropy_concurrency],infinity).
>>> riak_core_util:rpc_every_member_ann(application, get_env, [yokozuna,
>>> anti_entropy_tick],infinity).
>>> ```
>>>
>>> … on any of the nodes.  Query coverage is R=1, but the values should be
>>> replicated across.
>>>
>>> Thanks.
>>>
>>>
>>> On Mar 12, 2015, at 9:51 AM, Roma Lakotko <roman at lakotko.ru> wrote:
>>>
>>> Hello Zeeshan.
>>>
>>> While I run queries, no objects are being deleted.
>>>
>>> Stats on the production and development nodes output something like this:
>>> https://gist.github.com/romulka/d0254aa193a9dbb52b67
>>>
>>> On dev container:
>>>
>>> /etc/riak# grep anti_entropy *
>>> riak.conf:anti_entropy = active
>>> riak.conf.dpkg-dist:anti_entropy = active
>>>
>>> ll -h /var/lib/riak/yz_anti_entropy/
>>> total 264K
>>> drwxrwxr-x 66 riak riak 4.0K Sep 25 12:08 ./
>>> drwxr-xr-x 12 riak riak 4.0K Dec  9 12:19 ../
>>> drwxr-xr-x  9 riak riak 4.0K Mar 12 12:01 0/
>>> drwxr-xr-x  9 riak riak 4.0K Mar 12 12:01
>>> 1004782375664995756265033322492444576013453623296/
>>> drwxr-xr-x  9 riak riak 4.0K Mar 12 12:01
>>> 1027618338748291114361965898003636498195577569280/
>>> ....
>>>
>>> On prod:
>>>
>>> grep anti_entropy * /etc/riak/ -> empty
>>>
>>> root at riak-21:/var/lib/riak/yz_anti_entropy# ll -h
>>> total 64K
>>> drwxrwxr-x 16 riak riak 4.0K Dec  4 03:44 ./
>>> drwxr-xr-x 14 riak riak 4.0K Dec  9 12:10 ../
>>> drwxr-xr-x  9 riak riak 4.0K Dec  4 03:44 0/
>>> drwxr-xr-x  9 riak riak 4.0K Mar 12 12:57
>>> 1027618338748291114361965898003636498195577569280/
>>> ....
>>>
>>> I already tried re-saving all keys; it doesn't help.
>>>
>>> The production cluster has 7 nodes and started from 3. So yes, nodes
>>> were added/removed at times.
>>>
>>> On dev, I have 1 instance in a Docker container, never added to a cluster.
>>> But the data in that Riak was imported from the production cluster a
>>> while ago.
>>>
>>> I can give you a copy of the container if you need it.
>>>
>>> Thanks,
>>> Roman Lakotko
>>>
>>>
>>>
>>> 2015-03-12 16:36 GMT+03:00 Zeeshan Lakhani <zlakhani at basho.com>:
>>>
>>>> Hello Roma,
>>>>
>>>> Have you deleted this object at some point in your runs? Please make
>>>> sure AAE is running by checking search’s AAE status, `riak-admin search
>>>> aae-status`, and that data exists in the correct directory,
>>>> `./data/yz_anti_entropy` (
>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/search/). You
>>>> may just need to perform a read-repair by performing a fetch of the object
>>>> itself first, before performing search queries again.
>>>>
>>>> Also, have you left or added nodes? I’m guessing that even your 1 node
>>>> instance is still running a cluster on that one node, right?
>>>>
>>>> Thanks.
>>>>
>>>> Zeeshan Lakhani
>>>> programmer |
>>>> software engineer at @basho |
>>>> org. member/founder of @papers_we_love | paperswelove.org
>>>> twitter => @zeeshanlakhani
>>>>
>>>> On Mar 12, 2015, at 5:59 AM, Roma Lakotko <roman at lakotko.ru> wrote:
>>>>
>>>> Each request to Riak Search returns different results: numFound
>>>> differs from one request to the next.
>>>>
>>>> I use a request like this:
>>>>
>>>>
>>>> http://localhost:8098/search/query/assets?wt=json&q=type:*&sort=_yz_rk%20asc
>>>>
>>>> If I add a start offset, it can return:
>>>>
>>>>
>>>> http://localhost:8098/search/query/assets?wt=json&q=type:*&sort=_yz_rk%20asc&start=1247
>>>>
>>>> "response": {
>>>>         "numFound": 1248,
>>>>         "start": 1247,
>>>>         "docs": [
>>>>             {
>>>>                 "_yz_id": "1*default*assets*fff63ecf-a0c4-4ecf-b24d-c493ca3a302f*44",
>>>>                 "_yz_rk": "fff63ecf-a0c4-4ecf-b24d-c493ca3a302f",
>>>>                 "_yz_rt": "default",
>>>>                 "_yz_rb": "assets"
>>>>             }
>>>>         ]
>>>>     }
>>>>
>>>>
>>>> On the next request it returns something like this:
>>>>
>>>>
>>>> "numFound": 1224,
>>>>         "start": 1247,
>>>>         "docs": []
>>>>
>>>>
>>>> I have a 1-node installation, and no process writes to Riak.
>>>>
>>>> I have the same problem on the production cluster with 7 nodes.
>>>>
>>>>
>>>> Schema for the documents:
>>>>
>>>>
>>>> <?xml version="1.0" encoding="UTF-8" ?>
>>>> <schema name="schedule" version="1.5">
>>>>  <fields>
>>>>    <field name="objectId"     type="string_ci"   indexed="true" stored="false" />
>>>>    <field name="type"     type="string_ci"   indexed="true" stored="false" />
>>>>    <field name="objectType"     type="string_ci"   indexed="true" stored="false" />
>>>>
>>>>    <field name="contentType"     type="string_ci"   indexed="true" stored="false" />
>>>>    <field name="properties"    type="string_ci"   indexed="true" stored="false" multiValued="true" />
>>>>    <field name="tag"     type="string_ci"   indexed="true" stored="false" />
>>>>    <field name="isUploaded"    type="boolean"     indexed="true" stored="false" />
>>>>    <field name="published"    type="boolean"     indexed="true" stored="false" />
>>>>    <field name="drm"    type="boolean"     indexed="true" stored="false" />
>>>>    <field name="dateCreated" type="date" indexed="true" stored="false" />
>>>>
>>>>    <!-- All of these fields are required by Riak Search -->
>>>>    <field name="_yz_id"   type="_yz_str" indexed="true" stored="true"  multiValued="false" required="true"/>
>>>>    <field name="_yz_ed"   type="_yz_str" indexed="true" stored="false" multiValued="false"/>
>>>>    <field name="_yz_pn"   type="_yz_str" indexed="true" stored="false" multiValued="false"/>
>>>>    <field name="_yz_fpn"  type="_yz_str" indexed="true" stored="false" multiValued="false"/>
>>>>    <field name="_yz_vtag" type="_yz_str" indexed="true" stored="false" multiValued="false"/>
>>>>    <field name="_yz_rk"   type="_yz_str" indexed="true" stored="true"  multiValued="false"/>
>>>>    <field name="_yz_rt"   type="_yz_str" indexed="true" stored="true"  multiValued="false"/>
>>>>    <field name="_yz_rb"   type="_yz_str" indexed="true" stored="true"  multiValued="false"/>
>>>>    <field name="_yz_err"  type="_yz_str" indexed="true" stored="false" multiValued="false"/>
>>>>
>>>>    <dynamicField name="*" type="ignored"/>
>>>>  </fields>
>>>>
>>>>  <uniqueKey>_yz_id</uniqueKey>
>>>>
>>>>  <types>
>>>>   <!-- YZ String: Used for non-analyzed fields text_ru -->
>>>>   <fieldType name="date" class="solr.TrieDateField" sortMissingLast="true" omitNorms="true"/>
>>>>   <fieldType name="double" class="solr.TrieDoubleField" sortMissingLast="true" omitNorms="true"/>
>>>>   <fieldType name="int" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
>>>>
>>>>   <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
>>>>   <fieldType name="_yz_str" class="solr.StrField" sortMissingLast="true" />
>>>>   <fieldtype name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />
>>>>   <fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
>>>>         <analyzer>
>>>>             <tokenizer class="solr.StandardTokenizerFactory"/>
>>>>             <filter class="solr.LowerCaseFilterFactory" />
>>>>             <filter class='solr.PatternReplaceFilterFactory' pattern='ё' replacement='е' replace='all'/>
>>>>         </analyzer>
>>>>     </fieldType>
>>>>   </types>
>>>>
>>>> </schema>
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Roman
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>
>>>>
>>>>
>>>
>>>
>>
>>