riak search and solr/lucene

Rusty Klophaus rusty at basho.com
Mon Nov 8 08:10:54 EST 2010


Hi Joseph,

Ah okay, I see what you are saying now. That's a good idea, and a nice
simplification to the code. Thanks for the suggestion!

This is now tracked here: https://issues.basho.com/show_bug.cgi?id=870
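
As a rough illustration of the suggestion quoted below (sort and slice as two reduce phases applied to the search output), here is a minimal Python sketch. It is not the Riak Search implementation: the function names, the list-of-dicts result shape, and the "score" field are all assumptions made for the example.

```python
from typing import Any, Dict, List

# Hypothetical result shape: one dict per search hit.
Result = Dict[str, Any]


def reduce_sort(results: List[Result], field: str,
                descending: bool = False) -> List[Result]:
    """Reduce phase 1 (assumed name): order the hits by one field."""
    return sorted(results, key=lambda hit: hit[field], reverse=descending)


def reduce_slice(results: List[Result], start: int, count: int) -> List[Result]:
    """Reduce phase 2 (assumed name): keep only the requested page."""
    return results[start:start + count]


def paged_search(results: List[Result], field: str, start: int, count: int,
                 descending: bool = False) -> List[Result]:
    """Chain the two phases, as a query with sort/start/count might."""
    ordered = reduce_sort(results, field, descending)
    return reduce_slice(ordered, start, count)


if __name__ == "__main__":
    hits = [
        {"key": "a", "score": 1},
        {"key": "b", "score": 3},
        {"key": "c", "score": 2},
    ]
    # Second and third hits when ordered by descending score.
    print(paged_search(hits, "score", start=1, count=2, descending=True))
```

The slice has to run after the sort, since paging over unsorted hits would return an arbitrary page.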

Best,
Rusty

On Thu, Nov 4, 2010 at 8:02 PM, Joseph Lambert
<joseph.g.lambert at gmail.com> wrote:

> Disregard that last message. What I meant was: in a Solr query, all the
> results are returned, then sorted, and then the chunk requested by the
> start and count parameters is taken. Why not instead make the results of
> the search() function the input of a MapReduce job and, if the user adds
> sorting or start and count parameters, add two reduce phases, one to sort
> and one to slice? Would that not improve the Solr search? Or am I
> misunderstanding how it works?
>
>
> - Joe Lambert
>
> joseph.g.lambert at gmail.com
> +86 13656213284
>
>
> On Fri, Nov 5, 2010 at 7:34 AM, Joseph Lambert
> <joseph.g.lambert at gmail.com> wrote:
>
>> Rusty,
>>
>> Sorry, I meant Lucene search. Solr can be passed start and count, Lucene
>> search can't be, but they share functions in the Erlang code.
>>
>> - Joe Lambert
>>
>> joseph.g.lambert at gmail.com
>> +86 13656213284
>>
>>
>> On Fri, Nov 5, 2010 at 2:15 AM, Rusty Klophaus <rusty at basho.com> wrote:
>>
>>> Hi Joseph,
>>>
>>> Answers inline below.
>>>
>>> On Thu, Nov 4, 2010 at 12:49 AM, Joseph Lambert <
>>> joseph.g.lambert at gmail.com> wrote:
>>>
>>>> I am using the PHP library for a project and was looking through the
>>>> code to see what differentiates the Solr HTTP interface query from the
>>>> Lucene search (besides the syntax, the interface, etc.), as paging is
>>>> very useful for my code. From the PHP library I can run a Lucene search,
>>>> then a reduce phase to sort, then another reduce phase to slice the
>>>> results. With Solr, we can just make a cURL request with the parameters
>>>> to do the same thing.
>>>>
>>>> I scanned the Erlang code, and in the end, both call stream_search(),
>>>> but the Lucene query will pass the results back to luke for possibly another
>>>> MR phase, and the Solr query simply sorts and truncates the list. So:
>>>>
>>>> 1. Does anyone have a general idea of the point at which the Solr query
>>>> will start to get really slow, in terms of the number of keys in a bucket
>>>> and other factors? I know this depends on many things; I am just looking
>>>> for a rough idea of when it's a bad idea to use the Solr interface.
>>>>
>>>
>>> The Solr interface works by running the query to find your list of keys
>>> (limited based on the "start" and "rows" parameters) and then looking up the
>>> keys in Riak KV. So if you execute a Solr request with "rows=100", your
>>> request will take a certain amount of time to execute the query, plus
>>> however long it takes to retrieve 100 objects in your cluster.
>>>
>>>
>>>> 2. Also, I see that Riak will cache the map phase of a map reduce, so
>>>> will it cache the initial search? Or does it use some other mechanism I'm
>>>> not seeing to cache search results?
>>>>
>>>
>>> The system does not cache Search results, though the operating system's
>>> disk caching does make repeated searches execute more quickly.
>>>
>>>
>>>> 3. Finally, for the Solr query, why not automatically add a sort and/or
>>>> slice phase if the user passes in sort, start or count parameters in the
>>>> Solr query?
>>>>
>>>
>>> Not sure I understand the question here. Can you clarify or elaborate?
>>> The system does support sort and slice parameters. (
>>> https://wiki.basho.com/display/RIAK/Riak+Search+-+Querying)
>>>
>>>
>>>>
>>>> Please correct me if any of the assumptions I made are wrong, as usually
>>>> when I ask these questions I end up with my foot in my mouth.
>>>>
>>>> - Joe Lambert
>>>>
>>>> joseph.g.lambert at gmail.com
>>>> +86 13656213284
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>
>>>>
>>>
>>
>