riaksearch performance, row limit, sorting not necessary

Daniel Rathbone dan.rathbone at gmail.com
Thu Apr 14 13:18:21 EDT 2011

To be clear, I'm only talking about the Solr interface.  I'm wondering
whether my query time will stay constant (since results are capped at
rows=1000) as I add several million docs to the index.
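
For reference, the kind of capped query meant here looks roughly like the
sketch below.  The host, port, index name ("docs") and field name ("body")
are placeholders for illustration, not values from my setup; the only
parameters that matter for this question are q and rows.

```python
from urllib.parse import quote, urlencode


def build_select_url(index, query, rows=1000, host="localhost", port=8098):
    """Build a Solr-style select URL with an explicit row limit.

    host, port and index are assumed placeholder values.
    """
    params = urlencode({"q": query, "rows": rows})
    return f"http://{host}:{port}/solr/{quote(index, safe='')}/select?{params}"


if __name__ == "__main__":
    # Hypothetical index and field names, used only for illustration.
    print(build_select_url("docs", "body:riak"))
```

Note that rows only controls how many hits come back in the response; it
says nothing about how many hits are gathered before the response is built.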

If I use my search as an input into Map/Reduce, won't my response time grow
with my index?  My search query would queue up a very large result set, and
I expect performance to suffer if I trim this down in a reduce phase.

It would seem that I can prevent that slowdown by limiting the rows in the
search (with rows=1000).  Despite that limit, though, I hit the
too_many_results error, which suggests that the search queues up a very
large result set before it applies the row limit.  Is there something I'm
missing here?
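
To illustrate the behaviour I'm describing, here is a toy model in Python.
It is not how Riak Search is implemented; the function names, the
TooManyResults exception and the cap of 100,000 are all assumptions made up
for the sketch.  It contrasts "queue every match, then trim to rows" with
"stop as soon as rows hits have been found".

```python
from itertools import islice


class TooManyResults(Exception):
    """Hypothetical stand-in for the too_many_results error."""


def counting(iterable, counter):
    """Yield items unchanged while counting how many were examined."""
    for item in iterable:
        counter[0] += 1
        yield item


def collect_then_trim(matches, rows, max_results):
    """Queue every match, enforce the cap, and only then apply the row limit."""
    queued = []
    for doc_id in matches:
        queued.append(doc_id)
        if len(queued) > max_results:
            raise TooManyResults(f"more than {max_results} matches queued")
    return queued[:rows]


def stream_with_limit(matches, rows):
    """Stop consuming matches as soon as `rows` hits have been seen."""
    return list(islice(matches, rows))


if __name__ == "__main__":
    total_matches = 130_000  # 30k original docs plus the 100k added later

    seen = [0]
    try:
        collect_then_trim(counting(range(total_matches), seen),
                          rows=1000, max_results=100_000)
    except TooManyResults as exc:
        print("collect_then_trim failed:", exc, "| examined", seen[0])

    seen = [0]
    hits = stream_with_limit(counting(range(total_matches), seen), rows=1000)
    print("stream_with_limit returned", len(hits), "| examined", seen[0])
```

In the first variant the work (and the risk of hitting the cap) grows with
the number of matching documents regardless of rows; in the second it is
bounded by rows.  What I'm seeing looks like the first variant.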


On Thu, Apr 14, 2011 at 7:53 AM, Gordon Tillman <gtillman at mezeo.com> wrote:

> Daniel, the max_search_results setting only applies to searches done via
> the Solr interface.  From
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-January/002974.html
> :
> - System now aborts queries that would queue up too many documents in
>   a result set. This is controlled by a 'max_search_results' setting
>   in riak_search. Note that this only affects the Solr
>   interface. Searches through the Riak Client API that feed into a
>   Map/Reduce job are still allowed to execute because the system
>   streams those results.
> So you can use a map-reduce operation (with the search phase providing the
> inputs) and you should be OK.
> --gordon
> On Apr 14, 2011, at 04:49, Daniel Rathbone wrote:
> Hi list,
> I'm wondering how riaksearch performance will degrade as I add documents.
> For my purpose I limit rows to 1k and sorting is not necessary.  I have a
> single-node cluster for development.  I know I can increase performance if I
> add nodes, but I'd like to understand this before I do.
> My documents are small (~200 bytes).  With an index of 30k documents and
> rows limited to 1k, I had no problems.  I added 100k documents, and then I
> hit the too_many_results error.  Since I still have my row limit set at 1k,
> this indicates that the query does not return as soon as it finds the first
> 1k hits.  Is there a way to short-circuit my queries so that they don't have
> to scan the whole index?
> I got around too_many_results by increasing my max_search_results (see
> https://help.basho.com/entries/480664-i-get-the-error-too-many-results).
> I wonder, though, whether I'll keep hitting memory limits as I add a few
> million docs to my index.
> Thanks,
> Daniel