riaksearch sort, start, and rows

Gary William Flake gary at flake.org
Mon Feb 28 01:25:48 EST 2011


Ack.  I know what's happening with this.  Riaksearch is sorting by
relevance, segmenting the results according to start and rows, and THEN
sorting by the sort key.

Is this the intended behavior?  If so, I am kind of surprised because the
post-sort is easy for the caller to do without Riaksearch's help.  What's
hard is the pre-sort.
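
To make the difference between the two orderings concrete, here is a minimal
Python sketch. It is not Riaksearch code. The titles and ctime values come
from the output quoted below; the relevance scores, the field name "score",
and both function names are hypothetical, chosen only so that the simulated
behavior reproduces what I observed.

```python
# Titles and ctime values are from the quoted output; the "score" values
# are HYPOTHETICAL relevance scores picked to reproduce that output.
DOCS = [
    {"title": "weather flagler beach - Google Search",
     "ctime": 98701733411258, "score": 0.9},
    {"title": "Amazon.com: knives",
     "ctime": 98701733349673, "score": 0.8},
    {"title": "Bartholdi on spacefilling curves",
     "ctime": 98701733465867, "score": 0.7},
    {"title": "HTTP cookie - Wikipedia, the free encyclopedia",
     "ctime": 98701587317266, "score": 0.1},
]


def slice_then_sort(docs, start, rows, key):
    """Apparent behavior: rank by relevance, cut the page, then sort the page."""
    by_relevance = sorted(docs, key=lambda d: d["score"], reverse=True)
    page = by_relevance[start:start + rows]
    return sorted(page, key=lambda d: d[key])


def sort_then_slice(docs, start, rows, key):
    """Expected behavior: sort everything by the key, then cut the page."""
    return sorted(docs, key=lambda d: d[key])[start:start + rows]


if __name__ == "__main__":
    for rows in (1, 3, 1000):
        seen = slice_then_sort(DOCS, 0, rows, "ctime")[0]["title"]
        want = sort_then_slice(DOCS, 0, rows, "ctime")[0]["title"]
        print(rows, "| observed top:", seen, "| expected top:", want)
```

With slice_then_sort the first result changes as rows grows, which is the
symptom in my original message; with sort_then_slice it stays the same.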

I suppose the workaround is to get all of the results, then send them to
map/reduce to handle the pre-sort correctly, then filter by range.  However,
before I do this, does anyone know (a) whether this is considered broken and
(b) if so, whether a fix is scheduled?
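
A client-side variant of this workaround (sorting in the caller rather than
in map/reduce) might look like the sketch below. The function names, the
default of rows=1000, and the assumption that each document has already been
reduced to a flat dict of fields are mine; the URL shape is copied from the
requests in my original message. It only works if rows is at least as large
as the number of matching documents.

```python
from urllib.parse import quote, urlencode


def build_select_url(index, query, rows=1000, sort="ctime"):
    """Build a select URL that asks for every match in a single page.

    The parameter names (q, start, rows, wt, sort) are the ones used in the
    quoted requests; rows=1000 is an arbitrary "large enough" assumption.
    """
    params = {"q": query, "start": 0, "rows": rows, "wt": "json", "sort": sort}
    # quote_via=quote encodes spaces as %20, matching the quoted requests.
    return "/solr/%s/select?%s" % (index, urlencode(params, quote_via=quote))


def page_sorted(docs, sort_key, start, rows):
    """Sort the full result set by sort_key first, then cut out one page.

    Assumes docs is a list of flat dicts and that the sort field holds an
    integer (possibly serialized as a string), as ctime does here.
    """
    ordered = sorted(docs, key=lambda d: int(d[sort_key]))
    return ordered[start:start + rows]
```

For example, build_select_url("clips", "user:<id> AND private:1") gives the
path to fetch once, and page_sorted(docs, "ctime", 20, 10) then yields the
third page of ten from the decoded documents.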

Thanks,
-- GWF



On Sun, Feb 27, 2011 at 10:17 PM, Gary William Flake <gary at flake.org> wrote:

> While coding up a front-end to paginate over a set of search results, I
> think I found a bug whereby the result set is incorrectly segmented as a
> function of the sort key, the start index, and the row count.  In the output
> that follows, I am using the field ctime for sort order, and the value of
> delta is simply the difference between subsequent ctime values (which I was
> checking in order to debug what was happening).
>
> First, let's ask for just the first result:
>
> GET /solr/clips/select?q=user%3Ad33af3cca29a43e63e8f6a52dfdd99a61f7b7906%20AND%20private%3A1&start=0&rows=1&wt=json&sort=ctime
>
> title: weather flagler beach - Google Search
> ctime: 98701733411258
> delta: 0
> ----------
>
>
>
> Now, let's do the same query, but get the top three instead:
>
> GET /solr/clips/select?q=user%3Ad33af3cca29a43e63e8f6a52dfdd99a61f7b7906%20AND%20private%3A1&start=0&rows=3&wt=json&sort=ctime
>
> title: Amazon.com: knives
> ctime: 98701733349673
> delta: 0
> ----------
> title: weather flagler beach - Google Search
> ctime: 98701733411258
> delta: 61585
> ----------
> title: Bartholdi on spacefilling curves
> ctime: 98701733465867
> delta: 54609
> ----------
>
>
>
> Notice that we have a new 'top' result.  Finally, let's get all of the
> results by setting rows to something large, but I'll only show the first
> result because that's all you need to see:
>
> GET /solr/clips/select?q=user%3Ad33af3cca29a43e63e8f6a52dfdd99a61f7b7906%20AND%20private%3A1&start=0&rows=1000&wt=json&sort=ctime
>
> title: HTTP cookie - Wikipedia, the free encyclopedia
> ctime: 98701587317266
> delta: 0
> ----------
>
>
>
> We now get a record that we haven't seen yet.
>
>
> I can confirm that when I get all of the results, they are in properly
> sorted order.  I also believe that each of my smaller result sets is in
> proper sort order.  However, it also appears that when I ask for a specific
> number of rows with a non-zero start value, the start index is not
> handled correctly.  FWIW, these results don't have anything to do with the
> client library because I reproduced them over the REST interface.
>
> Any ideas?
>
> -- GWF
>