Yokozuna queries slow

Steve Garon steve.garon at gmail.com
Tue Apr 21 17:41:48 EDT 2015


Zeeshan,

For that specific case, you guys should add *{!cache=false}* in front of
your query plan. That way, queries on a large index won't be slowed down.
I'd really like to see some of the solrconfig.xml settings exposed as
Riak bucket properties. The caching flag could be a property on the
bucket. Same for soft commit timeouts: we had to increase ours from the
1 sec default to 10 sec.
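
As a rough illustration of the flag, here is a minimal sketch that builds the
same kind of request Jason ran directly against Solr, with the partition
filter marked as non-cacheable. The helper name build_solr_params and its
arguments are hypothetical; only the {!cache=false} local param and the
_yz_pn field come from this thread. YZ builds its own filter internally, so
this only applies when querying Solr by hand.

```python
from urllib.parse import urlencode


def build_solr_params(query, partitions, cache_filter=False):
    """Build Solr select parameters with a partition filter query.

    Hypothetical helper for illustration; not part of Riak or Yokozuna.
    """
    # Same shape as the filter shown in the thread: _yz_pn:55 OR _yz_pn:40 ...
    fq = " OR ".join(f"_yz_pn:{pn}" for pn in partitions)
    if not cache_filter:
        # The {!cache=false} local param asks Solr not to store this
        # filter's result set in the filterCache.
        fq = "{!cache=false}" + fq
    return {"q": query, "fq": fq, "rows": "0", "wt": "json"}


params = build_solr_params(
    "timestamp:[1429579919010 TO 1429579921010]", [55, 40, 25, 10]
)
query_string = urlencode(params)  # URL-encoded, ready to append to a select URL
print(params["fq"])
# prints: {!cache=false}_yz_pn:55 OR _yz_pn:40 OR _yz_pn:25 OR _yz_pn:10
```

Whether this removes the slowdown on a given index is something to measure
by comparing QTime with and without the flag.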


Steve

On 21 April 2015 at 16:02, Zeeshan Lakhani <zlakhani at basho.com> wrote:

> Nice Steve.
>
> Zeeshan Lakhani
> programmer |
> software engineer at @basho |
> org. member/founder of @papers_we_love | paperswelove.org
> twitter => @zeeshanlakhani
>
> On Apr 21, 2015, at 3:57 PM, Steve Garon <steve.garon at gmail.com> wrote:
>
> Jason,
>
> Comment out the <filterCache .../> section in the bucket's solrconfig.xml
> and restart Riak. Now your queries will be fast again :-)
>
>
> Steve
>
> On 21 April 2015 at 04:24, Zeeshan Lakhani <zlakhani at basho.com> wrote:
>
>> No real workaround other than what you described or looking into
>> config/fq-no-cache settings as mentioned in
>> http://lucidworks.com/blog/advanced-filter-caching-in-solr/ and playing
>> around with those.
>>
>> Riak is now at 2.1.0. I hope that one of the next few point releases will
>> see the fix.
>>
>> On Apr 21, 2015, at 4:11 AM, Jason Campbell <xiaclo at xiaclo.net> wrote:
>>
>> Thanks Zeeshan for the info.
>>
>> Is there a workaround in the meantime, or is the only option to handle
>> queries to the individual nodes ourselves?
>>
>> Is there a planned timeframe for the 2.0.1 release?
>>
>> Thanks,
>> Jason
>>
>> On 21 Apr 2015, at 16:13, Zeeshan Lakhani <zlakhani at basho.com> wrote:
>>
>> Hey Jason,
>>
>> We’re working on performance issues with YZ filter queries, e.g.
>> https://github.com/basho/yokozuna/issues/392, and coverage plan
>> generation/caching, and our CliServ team has started doing a ton of
>> benchmarks as well.
>>
>> You can bypass YZ, but then you’d have to create a way to generate your
>> own coverage plans and other things involving distributed solr that YZ
>> gives you. Nonetheless, we’re actively working on improving these issues
>> you’ve encountered.
>>
>> On Apr 21, 2015, at 1:06 AM, Jason Campbell <xiaclo at xiaclo.net> wrote:
>>
>> Hello,
>>
>> I'm currently trying to debug slow YZ queries. I've narrowed down the
>> issue, but I'm not sure how to solve it.
>>
>> First off, we have about 80 million records in Riak (and YZ), but the
>> queries return relatively few (a thousand or so at most).  Our query times
>> are anywhere from 800ms to 1.5s.
>>
>> I have been experimenting with queries directly on the Solr node, and it
>> seems to be a problem with YZ and the way it does vnode filters.
>>
>> Here is the same query, emulating YZ first:
>>
>> {
>> "responseHeader":{
>>   "status":0,
>>   "QTime":958,
>>   "params":{
>>     "q":"timestamp:[1429579919010 TO 1429579921010]",
>>     "indent":"true",
>>     "fq":"_yz_pn:55 OR _yz_pn:40 OR _yz_pn:25 OR _yz_pn:10",
>>     "rows":"0",
>>     "wt":"json"}},
>> "response":{"numFound":80,"start":0,"docs":[]
>> }}
>>
>> And the same query, but including the vnode filter in the main body
>> instead of using a filter query:
>>
>> {
>> "responseHeader":{
>>   "status":0,
>>   "QTime":1,
>>   "params":{
>>     "q":"timestamp:[1429579919010 TO 1429579921010] AND (_yz_pn:55 OR _yz_pn:40 OR _yz_pn:25 OR _yz_pn:10)",
>>     "indent":"true",
>>     "rows":"0",
>>     "wt":"json"}},
>> "response":{"numFound":80,"start":0,"docs":[]
>> }}
>>
>> I understand there is a caching benefit to using filter queries, but a
>> performance difference of 100x or greater doesn't seem worth it, especially
>> with a constant data stream.
>>
>> Is there a way to make YZ do this, or is the only way to query Solr
>> directly, bypassing YZ?  Does anyone have any other suggestions of how to
>> make this faster?
>>
>> The timestamp field is a solr.TrieLongField with default settings, if
>> anyone is curious.
>>
>> Thanks,
>> Jason
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>