riaksearch performance, row limit, sorting not necessary
gtillman at mezeo.com
Thu Apr 14 10:53:18 EDT 2011
Daniel, the max_search_results setting only applies to searches done via the Solr interface. From http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-January/002974.html:
- System now aborts queries that would queue up too many documents in
a result set. This is controlled by a 'max_search_results' setting
in riak_search. Note that this only affects the Solr
interface. Searches through the Riak Client API that feed into a
Map/Reduce job are still allowed to execute because the system
streams those results.
So you can use a map-reduce operation (with the search phase providing the inputs) and you should be OK.
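As a rough sketch of what such a job could look like, the Python below builds a MapReduce request body whose inputs come from a search query. The bucket name "docs", the query string "title:riak", and the helper name build_search_mapred_job are made up for illustration. The riak_search/mapred_search input form and the riak_kv_mapreduce:map_object_value map phase are what I believe Riak Search releases of this era accept, so check them against the documentation for your version.

```python
import json


def build_search_mapred_job(bucket, query):
    """Build a MapReduce job body that takes its inputs from a search query.

    Assumption: the search-as-input form is
    {"module": "riak_search", "function": "mapred_search", "arg": [bucket, query]}.
    """
    return {
        "inputs": {
            "module": "riak_search",
            "function": "mapred_search",
            "arg": [bucket, query],
        },
        "query": [
            # Single map phase that returns each matching object's stored value.
            {
                "map": {
                    "language": "erlang",
                    "module": "riak_kv_mapreduce",
                    "function": "map_object_value",
                }
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and query, for illustration only.
    print(json.dumps(build_search_mapred_job("docs", "title:riak"), indent=2))
```

The resulting JSON would be sent as a POST to the node's /mapred HTTP endpoint (port 8098 in a default install) with Content-Type: application/json.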
On Apr 14, 2011, at 04:49, Daniel Rathbone wrote:
I'm wondering how riaksearch performance will degrade as I add documents.
For my purposes I limit rows to 1k, and sorting is not necessary. I have a single-node cluster for development. I know I can increase performance by adding nodes, but I'd like to understand this before I do.
My documents are small, ~200 bytes. With an index of 30k documents and rows limited to 1k, I had no problems. I added 100k documents, and then I hit the too_many_results error. Since I still have my row limit set at 1k, this indicates that the query does not return as soon as it finds the first 1k hits. Is there a way to short-circuit my queries so that they don't have to scan the whole index?
I got around too_many_results by increasing my max_search_results (I read https://help.basho.com/entries/480664-i-get-the-error-too-many-results). I wonder, though, if I'll keep bumping memory boundaries as I add a few million docs to my index.