riaksearch performance, row limit, sorting not necessary

Gordon Tillman gtillman at mezeo.com
Thu Apr 14 10:53:18 EDT 2011

Daniel, the max_search_results setting only applies to searches done via the Solr interface.  From http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-January/002974.html:

- System now aborts queries that would queue up too many documents in
  a result set. This is controlled by a 'max_search_results' setting
  in riak_search. Note that this only affects the Solr
  interface. Searches through the Riak Client API that feed into a
  Map/Reduce job are still allowed to execute because the system
  streams those results.

So if you run a map-reduce operation with the search phase providing the inputs, the results are streamed, the max_search_results limit does not apply, and you should be OK.
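
For illustration, here is a rough sketch of what such a job could look like when submitted over HTTP. It assumes the riak_search:mapred_search input form described in the Riak Search documentation of this era and the default /mapred endpoint on port 8098. The index name "docs", the query "title:foo", and the helper names are hypothetical, so adjust them to your setup and check the input format against your Riak version.

```python
import json
import urllib.request


def build_search_mapreduce(index, query):
    """Build a MapReduce job whose inputs come from a Riak Search query.

    The "inputs" form (module/function/arg) is an assumption based on the
    Riak Search documentation; verify it for your release.
    """
    return {
        "inputs": {
            "module": "riak_search",
            "function": "mapred_search",
            "arg": [index, query],
        },
        "query": [
            # Built-in JavaScript map function that returns each object's
            # value parsed as JSON; keep=True returns this phase's output.
            {"map": {"language": "javascript",
                     "name": "Riak.mapValuesJson",
                     "keep": True}},
        ],
    }


def run_job(job, url="http://localhost:8098/mapred"):
    """POST the job to a Riak node (URL is an assumed default) and decode the reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical index and query; requires a running Riak Search node.
    job = build_search_mapreduce("docs", "title:foo")
    print(json.dumps(job, indent=2))
    # results = run_job(job)
```

Because the search feeds the map phase directly, the result set is streamed instead of being queued up in full, which is why the release note says these queries are still allowed to execute.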


On Apr 14, 2011, at 04:49 , Daniel Rathbone wrote:

Hi list,

I'm wondering how riaksearch performance will degrade as I add documents.

For my purposes I limit rows to 1k, and sorting is not necessary.  I have a single-node cluster for development.  I know I can increase performance if I add nodes, but I'd like to understand this before I do.

My documents are small, about 200 bytes each.  With an index of 30k documents and rows limited to 1k, I had no problems.  I then added 100k documents and hit the too_many_results error.  Since my row limit is still set at 1k, this indicates that the query does not return as soon as it finds the first 1k hits.  Is there a way to short-circuit my queries so that they don't have to scan the whole index?

I got around too_many_results by increasing my max_search_results (I read https://help.basho.com/entries/480664-i-get-the-error-too-many-results).  I wonder, though, whether I'll keep running into memory limits as I add a few million docs to my index.


