Yokozuna's inconsistent query problem

Fred Dushin fdushin at basho.com
Thu Feb 23 22:43:59 EST 2017


Hello Witeman,

What you are seeing with your two queries is the result of two different coverage plans querying different parts of the cluster.  Riak Search translates coverage plans into Solr sharded queries, and it periodically changes the coverage plan so as to distribute queries more evenly across the cluster.  So what you are seeing is effectively two different sharded queries, hitting different Solr instances in your cluster.  The inconsistent search results suggest there is a discrepancy between what is stored in Solr and what is stored in Riak.  Generally speaking, AAE should detect these discrepancies and repair them over time.
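
If you want to see exactly which partitions each plan covered, here is a quick sketch against the two JSON responses you posted (this is just shell on your side, not anything built into Riak; the file names found.json and notfound.json are hypothetical, so save each response body to a file first):

# Pull the _yz_pn partition filters out of each response and diff the two sets.
grep -o '_yz_pn:[0-9]*' found.json    | sort -u > /tmp/found_pns
grep -o '_yz_pn:[0-9]*' notfound.json | sort -u > /tmp/notfound_pns
# Lines unique to one file are partitions only that coverage plan queried.
diff /tmp/found_pns /tmp/notfound_pns

The partitions that show up only in the NOTFOUND plan point at the Solr replicas that appear to be missing the document.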

Can you please send the output of the following two commands:

riak-admin aae-status
riak-admin search aae-status

That will tell you something about the behavior of the underlying KV and Yokozuna AAE subsystems.

Out of curiosity, can you read the "2772439" key in the "data_201702" bucket in Riak with a quorum equal to all?  If you do that read, does it affect the behavior of the query you posted?  (I am wondering whether triggering read repair will also repair the entry in Solr.)
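
Over the HTTP interface that read would look something like the sketch below (this assumes the default HTTP port 8098 and the default bucket type; substitute the host of any of your nodes):

# Read the object with r=all so every replica must answer; this can trigger
# read repair on a divergent KV replica.
curl -i "http://10.100.205.71:8098/types/default/buckets/data_201702/keys/2772439?r=all"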

Thanks,
-Fred

> On Feb 23, 2017, at 10:11 PM, Witeman Zheng <witeman.g at gmail.com> wrote:
> 
> Hi,
> 
> 
> I am running a 10-node Riak KV 2.2.0 cluster with Riak Search (Yokozuna) turned on.  There are about 3 million records in one indexed bucket, and every record is about 1 KB in size.
> 
> When a Yokozuna query for one specific id is triggered, it sometimes returns the record and sometimes returns NOT FOUND, which is very weird.
> 
> FOUND case:
> wt=json&q=*:*&rows=1000&start=0&sort=collect_id_l%20asc&fq=agent_uid_i:1191&fq=id_l:2772439&indent=true
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":57,
>     "params":{
>       "10.100.205.80:8093":"_yz_pn:230 OR _yz_pn:200 OR _yz_pn:170 OR _yz_pn:140 OR _yz_pn:110 OR _yz_pn:80 OR _yz_pn:50 OR _yz_pn:20",
>       "10.100.205.79:8093":"_yz_pn:239 OR _yz_pn:209 OR _yz_pn:179 OR _yz_pn:149 OR _yz_pn:119 OR _yz_pn:89 OR _yz_pn:59 OR _yz_pn:29",
>       "indent":"true",
>       "10.100.205.73:8093":"_yz_pn:233 OR _yz_pn:203 OR _yz_pn:173 OR _yz_pn:143 OR _yz_pn:113 OR _yz_pn:83 OR _yz_pn:53 OR _yz_pn:23",
>       "start":"0",
>       "sort":"collect_id_l asc",
>       "fq":["agent_uid_i:1191",
>         "id_l:2772439"],
>       "rows":"1000",
>       "10.100.205.76:8093":"_yz_pn:236 OR _yz_pn:206 OR _yz_pn:176 OR _yz_pn:146 OR _yz_pn:116 OR _yz_pn:86 OR _yz_pn:56 OR _yz_pn:26",
>       "q":"*:*",
>       "shards":"10.100.205.71:8093/internal_solr/game_data_records_index,10.100.205.72:8093/internal_solr/game_data_records_index,10.100.205.73:8093/internal_solr/game_data_records_index,10.100.205.74:8093/internal_solr/game_data_records_index,10.100.205.75:8093/internal_solr/game_data_records_index,10.100.205.76:8093/internal_solr/game_data_records_index,10.100.205.77:8093/internal_solr/game_data_records_index,10.100.205.78:8093/internal_solr/game_data_records_index,10.100.205.79:8093/internal_solr/game_data_records_index,10.100.205.80:8093/internal_solr/game_data_records_index",
>       "10.100.205.71:8093":"_yz_pn:251 OR _yz_pn:221 OR _yz_pn:191 OR _yz_pn:161 OR _yz_pn:131 OR _yz_pn:101 OR _yz_pn:71 OR _yz_pn:41 OR _yz_pn:11",
>       "10.100.205.74:8093":"_yz_pn:224 OR _yz_pn:194 OR _yz_pn:164 OR _yz_pn:134 OR _yz_pn:104 OR _yz_pn:74 OR _yz_pn:44 OR _yz_pn:14",
>       "10.100.205.77:8093":"_yz_pn:227 OR _yz_pn:197 OR _yz_pn:167 OR _yz_pn:137 OR _yz_pn:107 OR _yz_pn:77 OR _yz_pn:47 OR _yz_pn:17",
>       "10.100.205.78:8093":"_yz_pn:248 OR _yz_pn:218 OR _yz_pn:188 OR _yz_pn:158 OR _yz_pn:128 OR _yz_pn:98 OR _yz_pn:68 OR _yz_pn:38 OR _yz_pn:8",
>       "10.100.205.75:8093":"_yz_pn:255 OR _yz_pn:245 OR _yz_pn:215 OR _yz_pn:185 OR _yz_pn:155 OR _yz_pn:125 OR _yz_pn:95 OR _yz_pn:65 OR _yz_pn:35 OR _yz_pn:5",
>       "wt":"json",
>       "10.100.205.72:8093":"(_yz_pn:252 AND (_yz_fpn:252)) OR _yz_pn:242 OR _yz_pn:212 OR _yz_pn:182 OR _yz_pn:152 OR _yz_pn:122 OR _yz_pn:92 OR _yz_pn:62 OR _yz_pn:32 OR _yz_pn:2"}},
>   "response":{"numFound":1,"start":0,"docs":[
>       {
>         "some_field_1_l":0,
>         "some_field_2_l":0,
>         "some_field_3_s":"[]",
>         "some_field_4_s":"[]",
>         "some_field_5_l":0,
>         "some_field_6_l":0,
>         "some_field_7_s":"[]",
>         "some_field_8_s":"[[1,0],[2,0],[3,0],[4,0],[5,0],[6,0],[7,0],[8,0],[9,0],[10,0],[11,0],[12,0],[13,0],[14,0],[15,0],[16,0],[17,0],[18,0],[19,0],[20,0]]",
>         "collect_id_l":2765608,
>         "some_field_9_i":1191,
>         "some_field_10_i":1191,
>         "some_field_11_i":1,
>         "some_field_12_l":0,
>         "some_field_13_l":2000,
>         "some_field_14_l":764846,
>         "some_field_15_l":766846,
>         "some_field_16_i":57,
>         "some_field_17_i":1,
>         "some_field_18_s":"UTC_-4",
>         "some_field_19_i":1487822270,
>         "some_field_20_s":"61.221.181.7",
>         "some_field_21_l":2869104,
>         "some_field_22_s":"[1,4,10,4,5,3,5,6,8,4,3,2,2,1,10]",
>         "some_field_23_i":20,
>         "some_field_24_i":1,
>         "some_field_25_i":1,
>         "agent_uid_i":1191,
>         "some_field_26_l":2772439,
>         "some_field_27_i":100,
>         "_yz_id":"1*default*data_201702*2772439*203",
>         "_yz_rk":"2772439",
>         "_yz_rt":"default",
>         "_yz_rb":"data_201702"}]
> 
>   }}
> 
> 
> 
> 
> NOTFOUND case:
> 
> wt=json&q=*:*&rows=1000&start=0&sort=collect_id_l%20asc&fq=agent_uid_i:1191&fq=id_l:2772439&indent=true
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":62,
>     "params":{
>       "10.100.205.80:8093":"_yz_pn:240 OR _yz_pn:210 OR _yz_pn:180 OR _yz_pn:150 OR _yz_pn:120 OR _yz_pn:90 OR _yz_pn:60 OR _yz_pn:30",
>       "10.100.205.79:8093":"_yz_pn:249 OR _yz_pn:219 OR _yz_pn:189 OR _yz_pn:159 OR _yz_pn:129 OR _yz_pn:99 OR _yz_pn:69 OR _yz_pn:39 OR _yz_pn:9",
>       "indent":"true",
>       "10.100.205.73:8093":"(_yz_pn:253 AND (_yz_fpn:253)) OR _yz_pn:243 OR _yz_pn:213 OR _yz_pn:183 OR _yz_pn:153 OR _yz_pn:123 OR _yz_pn:93 OR _yz_pn:63 OR _yz_pn:33 OR _yz_pn:3",
>       "start":"0",
>       "sort":"collect_id_l asc",
>       "fq":["agent_uid_i:1191",
>         "id_l:2772439"],
>       "rows":"1000",
>       "10.100.205.76:8093":"_yz_pn:256 OR _yz_pn:246 OR _yz_pn:216 OR _yz_pn:186 OR _yz_pn:156 OR _yz_pn:126 OR _yz_pn:96 OR _yz_pn:66 OR _yz_pn:36 OR _yz_pn:6",
>       "q":"*:*",
>       "shards":"10.100.205.71:8093/internal_solr/game_data_records_index,10.100.205.72:8093/internal_solr/game_data_records_index,10.100.205.73:8093/internal_solr/game_data_records_index,10.100.205.74:8093/internal_solr/game_data_records_index,10.100.205.75:8093/internal_solr/game_data_records_index,10.100.205.76:8093/internal_solr/game_data_records_index,10.100.205.77:8093/internal_solr/game_data_records_index,10.100.205.78:8093/internal_solr/game_data_records_index,10.100.205.79:8093/internal_solr/game_data_records_index,10.100.205.80:8093/internal_solr/game_data_records_index",
>       "10.100.205.71:8093":"_yz_pn:231 OR _yz_pn:201 OR _yz_pn:171 OR _yz_pn:141 OR _yz_pn:111 OR _yz_pn:81 OR _yz_pn:51 OR _yz_pn:21",
>       "10.100.205.74:8093":"_yz_pn:234 OR _yz_pn:204 OR _yz_pn:174 OR _yz_pn:144 OR _yz_pn:114 OR _yz_pn:84 OR _yz_pn:54 OR _yz_pn:24",
>       "10.100.205.77:8093":"_yz_pn:237 OR _yz_pn:207 OR _yz_pn:177 OR _yz_pn:147 OR _yz_pn:117 OR _yz_pn:87 OR _yz_pn:57 OR _yz_pn:27",
>       "10.100.205.78:8093":"_yz_pn:228 OR _yz_pn:198 OR _yz_pn:168 OR _yz_pn:138 OR _yz_pn:108 OR _yz_pn:78 OR _yz_pn:48 OR _yz_pn:18",
>       "10.100.205.75:8093":"_yz_pn:225 OR _yz_pn:195 OR _yz_pn:165 OR _yz_pn:135 OR _yz_pn:105 OR _yz_pn:75 OR _yz_pn:45 OR _yz_pn:15",
>       "wt":"json",
>       "10.100.205.72:8093":"_yz_pn:252 OR _yz_pn:222 OR _yz_pn:192 OR _yz_pn:162 OR _yz_pn:132 OR _yz_pn:102 OR _yz_pn:72 OR _yz_pn:42 OR _yz_pn:12"}},
>   "response":{"numFound":0,"start":0,"docs":[]
>   }}
> 
> 
> I can select this record from the Solr web app on some of these 10 nodes, so the record should be indexed by Solr.  My guess is that the cause is related to Yokozuna's distributed sharding: Yokozuna fans one query out to every Solr instance, then collects all the returns and reduces them to a final result, something like a map-reduce mechanism.
> 
> So the problem may lie in either the map phase or the reduce phase.
> 
> The default Riak Search configuration is applied on all 10 nodes.
> 
> Would anyone have some insight on how to fix this?  Perhaps by modifying the search configuration?
> 
> 
> Best regards,
> Witeman
> 
