Riak Search and Solr/Lucene

Joseph Lambert joseph.g.lambert at gmail.com
Thu Nov 4 03:49:51 EDT 2010


I am using the PHP library for a project, and I was looking through the code
to see what differentiates a query through the Solr HTTP interface from a
Lucene search (besides the syntax, the interface, etc.), as paging is very
useful for my code. With the PHP library I can run a Lucene search, then a
reduce phase to sort, then another reduce phase to slice the results. With
Solr, we can just make a cURL request with the right parameters to do the
same thing.
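
To make the comparison concrete, here is a rough Python sketch of the two
paging steps done by hand, next to the kind of query string a Solr-style
request would carry. It is not taken from the PHP library or the Riak source:
the helper names, the `title` field, the `books` index, the host and port,
and the `/solr/<index>/select` path are all assumptions for illustration.

```python
from urllib.parse import urlencode


def sort_results(results, field):
    """Stand-in for the sort reduce phase: order results by one field."""
    return sorted(results, key=lambda r: r[field])


def slice_results(results, start, count):
    """Stand-in for the slice reduce phase: keep one page of results."""
    return results[start:start + count]


def solr_select_url(base, index, query, start, rows, sort):
    """Build a Solr-style select URL carrying the same paging parameters."""
    params = urlencode({"q": query, "start": start, "rows": rows, "sort": sort})
    return f"{base}/solr/{index}/select?{params}"


# Hypothetical search results, as a list of documents.
results = [
    {"id": "b", "title": "beta"},
    {"id": "a", "title": "alpha"},
    {"id": "c", "title": "gamma"},
]

# Lucene-style path: search, then sort, then slice.
page = slice_results(sort_results(results, "title"), start=1, count=2)

# Solr-style path: one request that asks the server to sort and page.
url = solr_select_url("http://localhost:8098", "books", "title:riak", 1, 2, "title")
```

Both paths ask for the same page; the difference is only where the sorting
and truncation happen.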

I scanned the Erlang code and, in the end, both call stream_search(). The
Lucene query passes the results back to luke for possibly another MR phase,
while the Solr query simply sorts and truncates the list. So:

1. Does anyone have a general idea of the point at which the Solr query
starts to get really slow, in terms of the number of keys in a bucket and
other factors? I know this depends on many things; I am just looking for a
rough idea of when it's a bad idea to use the Solr interface.
2. Also, I see that Riak will cache the map phase of a MapReduce job, so will
it cache the initial search? Or does it use some other mechanism I'm not
seeing to cache search results?
3. Finally, for the Solr query, why not automatically add a sort and/or
slice phase if the user passes in sort, start or count parameters in the
Solr query?
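
The idea in question 3 could be sketched roughly as follows. This is
illustrative Python, not Riak code: the function name, the phase names
`reduce_sort` and `reduce_slice`, and the tuple layout are all hypothetical.

```python
def extra_phases(params):
    """Return the reduce phases implied by Solr-style paging parameters.

    `params` is a dict of query-string values (strings), e.g.
    {"sort": "title", "start": "10", "count": "5"}.
    """
    phases = []
    if "sort" in params:
        # Sort first, so the slice below pages over ordered results.
        phases.append(("reduce_sort", params["sort"]))
    if "start" in params or "count" in params:
        start = int(params.get("start", 0))
        count = params.get("count")
        # No count means "everything from start onward".
        end = None if count is None else start + int(count)
        phases.append(("reduce_slice", start, end))
    return phases


phases = extra_phases({"sort": "title", "start": "10", "count": "5"})
```

A query with none of these parameters would get no extra phases, so plain
searches would behave as they do now.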

Please correct me if any of the assumptions I made are wrong, as usually
when I ask these questions I end up with my foot in my mouth.

- Joe Lambert

joseph.g.lambert at gmail.com
+86 13656213284