riak search and solr/lucene

Joseph Lambert joseph.g.lambert at gmail.com
Thu Nov 4 19:34:55 EDT 2010


Sorry, I meant Lucene search. The Solr interface can be passed start and
count, while Lucene search can't be, even though the two share functions in
the Erlang code.
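
To make the difference concrete, here is a rough sketch of the sort-then-slice
step that has to happen on the client side (or in extra reduce phases) when the
search itself takes no paging parameters. It is plain Python, not the Erlang
code in Riak, and the function name and the (key, score) tuples are made up
purely for illustration.

```python
def sort_and_slice(results, start=0, rows=10, key=None, reverse=False):
    """Sort a full result list, then keep only one page of it.

    `results` is assumed to be an in-memory list of hits; `start` and
    `rows` mirror the paging parameters discussed in this thread.
    """
    ordered = sorted(results, key=key, reverse=reverse)
    return ordered[start:start + rows]


# Hypothetical hits as (key, score) pairs.
hits = [("k1", 0.2), ("k2", 0.9), ("k3", 0.5), ("k4", 0.7)]

# Second page of size 2, highest score first.
page = sort_and_slice(hits, start=1, rows=2, key=lambda h: h[1], reverse=True)
print(page)  # [('k4', 0.7), ('k3', 0.5)]
```

Note that the whole result list has to be fetched and sorted before the slice
is taken, which is why paging this way gets more expensive as the number of
matches grows.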

- Joe Lambert

joseph.g.lambert at gmail.com
+86 13656213284

On Fri, Nov 5, 2010 at 2:15 AM, Rusty Klophaus <rusty at basho.com> wrote:

> Hi Joseph,
> Answers inline below.
> On Thu, Nov 4, 2010 at 12:49 AM, Joseph Lambert <
> joseph.g.lambert at gmail.com> wrote:
>> I am using the PHP library for a project and was looking through the code
>> to see what differentiates the Solr HTTP interface query versus the Lucene
>> search (besides the syntax and the interface, etc.), as paging is very useful
>> for my code. From the PHP library I can do a Lucene search, then a reduce
>> job to sort, then another reduce to slice the results. With Solr, we can
>> just make a cURL request with the parameters to do the same thing.
>> I scanned the Erlang code, and in the end, both call stream_search(), but
>> the Lucene query will pass the results back to luke for possibly another MR
>> phase, and the Solr query simply sorts and truncates the list. So:
>> 1. Does anyone have a general idea at what point the Solr query will start
>> to get really slow as far as number of keys in a bucket and other factors? I
>> know this is dependent on many things, just looking for a rough idea of when
>> it's a bad idea to use the Solr interface.
> The Solr interface works by running the query to find your list of keys
> (limited based on the "start" and "rows" parameters) and then looking up the
> keys in Riak KV. So if you execute a Solr request with "rows=100", your
> request will take a certain amount of time to execute the query, plus
> however long it takes to retrieve 100 objects in your cluster.
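
As an illustration of the "start" and "rows" parameters described above, the
following Python sketch builds paged Solr-style request URLs. The host, port,
index name, and the /solr/<index>/select path are assumptions made for the
example and should be checked against the Riak Search documentation; the helper
names are hypothetical.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, index, query, start=0, rows=10):
    """Build a Solr-style select URL carrying the paging parameters.

    The /solr/<index>/select path layout is an assumption for this sketch.
    """
    params = urlencode({"q": query, "start": start, "rows": rows})
    return f"{base_url}/solr/{index}/select?{params}"


def page_urls(base_url, index, query, rows, pages):
    """Return one URL per page, advancing `start` by `rows` each time."""
    return [
        solr_select_url(base_url, index, query, start=i * rows, rows=rows)
        for i in range(pages)
    ]


# Assumed local node and a made-up index name.
for url in page_urls("http://localhost:8098", "books", "title:riak", rows=100, pages=3):
    print(url)
```

Each request built this way asks for at most 100 keys, so its cost is the query
time plus the time to fetch up to 100 objects, as explained in the answer above.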
>> 2. Also, I see that Riak will cache the map phase of a map reduce, so will
>> it cache the initial search? Or does it use some other mechanism I'm not
>> seeing to cache search results?
> The system does not cache Search results, though the operating system's
> disk caching does make repeated searches execute more quickly.
>> 3. Finally, for the Solr query, why not automatically add a sort and/or
>> slice phase if the user passes in sort, start or count parameters in the
>> Solr query?
> Not sure I understand the question here, can you clarify/elaborate? The
> system does support sort and slice parameters. (
> https://wiki.basho.com/display/RIAK/Riak+Search+-+Querying)
>> Please correct me if any of the assumptions I made are wrong, as usually
>> when I ask these questions I end up with my foot in my mouth.
>> - Joe Lambert
>> joseph.g.lambert at gmail.com
>> +86 13656213284