riaksearch sort, start, and rows

Gary William Flake gary at flake.org
Mon Feb 28 11:41:28 EST 2011


FWIW, here's my workaround written for riak-js:

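var done = false;
db.addSearch('clips', q, opts)
    .map('Riak.mapValuesJson')
    // Sort all matches by ctime first, then cut out the requested page.
    .reduce('Riak.reduceSort', 'function(a,b){return a.ctime-b.ctime;}')
    // Riak.reduceSlice takes [start, end], so the end index is start + rows.
    .reduce('Riak.reduceSlice', [start, start + rows])
    .run(function(err, x) {
        if (err) {
            console.error(err);
        } else {
            for (var i = 0, n = x.length; i < n; i++) {
                console.log(i, x[i].title);
            }
        }
        done = true;
    });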
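
To make the difference concrete, here is a minimal sketch in plain JavaScript (no Riak involved) that contrasts the behavior described further down the thread (rank by relevance, cut the page, then sort the page) with the sort-then-slice order the workaround aims for. The ctime values and titles are taken from the thread; the score field, its values, and the helper names are assumptions invented for illustration only.

```javascript
// Illustrative records: ctime and title come from the thread,
// the relevance scores are made up for this example.
var docs = [
  { title: 'weather flagler beach - Google Search', ctime: 98701733411258, score: 0.9 },
  { title: 'Amazon.com: knives', ctime: 98701733349673, score: 0.8 },
  { title: 'Bartholdi on spacefilling curves', ctime: 98701733465867, score: 0.7 },
  { title: 'HTTP cookie - Wikipedia, the free encyclopedia', ctime: 98701587317266, score: 0.1 }
];

function byCtime(a, b) { return a.ctime - b.ctime; }
function byScoreDesc(a, b) { return b.score - a.score; }

// Behavior reported in the thread: rank by relevance, cut the page,
// and only then sort the page by the sort key.
function sliceThenSort(all, start, rows) {
  return all.slice().sort(byScoreDesc).slice(start, start + rows).sort(byCtime);
}

// Behavior a paginating caller wants: sort everything, then cut the page.
function sortThenSlice(all, start, rows) {
  return all.slice().sort(byCtime).slice(start, start + rows);
}

function titles(list) {
  return list.map(function (d) { return d.title; });
}

console.log(titles(sliceThenSort(docs, 0, 1)));
console.log(titles(sliceThenSort(docs, 0, 3)));
console.log(titles(sortThenSlice(docs, 0, 1)));
```

With these made-up scores, sliceThenSort returns a different first record depending on rows, which is the symptom shown in the quoted output below, while sortThenSlice returns the oldest record for any page size.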




On Mon, Feb 28, 2011 at 5:35 AM, Rusty Klophaus <rusty at basho.com> wrote:

> Hi Gary,
>
> Yes, unfortunately, this is a bug. (Tracked here:
> https://issues.basho.com/show_bug.cgi?id=867) We have not yet scheduled
> the fix for this.
>
> Best,
> Rusty
>
>
> On Mon, Feb 28, 2011 at 1:25 AM, Gary William Flake <gary at flake.org> wrote:
>
>> Ack.  I know what's happening with this.  Riaksearch is sorting by
>> relevance, segmenting the results according to start and rows, and THEN
>> sorting by the sort key.
>>
>> Is this the intended behavior?  If so, I am kind of surprised because the
>> post sort is easy for the caller to do without Riaksearch's help.  What's
>> hard is the pre-sort.
>>
>> I suppose the workaround is to get all of the results, then send the
>> result to map/reduce to correctly handle pre-sort, then filter by range.
>> However, before I do this, does anyone know if (a) this is considered
>> broken? and (b) if so, is this scheduled to be fixed?
>>
>> Thanks,
>> -- GWF
>>
>>
>>
>> On Sun, Feb 27, 2011 at 10:17 PM, Gary William Flake <gary at flake.org> wrote:
>>
>>> While coding up a front-end to paginate over a set of search results, I
>>> think I found a bug whereby the result set is incorrectly segmented as a
>>> function of the sort key, the start index, and the row count.  In the output
>>> that follows, I am using the field ctime for sort order, and the value of
>>> delta is simply the difference between subsequent ctime values (which I was
>>> checking in order to debug what was happening).
>>>
>>> First, let's ask for just the first result:
>>>
>>> GET /solr/clips/select?q=user%3Ad33af3cca29a43e63e8f6a52dfdd99a61f7b7906%20AND%20private%3A1&start=0&rows=1&wt=json&sort=ctime
>>>
>>> title: weather flagler beach - Google Search
>>> ctime: 98701733411258
>>> delta: 0
>>> ----------
>>>
>>>
>>> Now, let's do the same query, but get the top three instead:
>>>
>>> GET /solr/clips/select?q=user%3Ad33af3cca29a43e63e8f6a52dfdd99a61f7b7906%20AND%20private%3A1&start=0&rows=3&wt=json&sort=ctime
>>>
>>> title: Amazon.com: knives
>>> ctime: 98701733349673
>>> delta: 0
>>> ----------
>>> title: weather flagler beach - Google Search
>>> ctime: 98701733411258
>>> delta: 61585
>>> ----------
>>> title: Bartholdi on spacefilling curves
>>> ctime: 98701733465867
>>> delta: 54609
>>> ----------
>>>
>>>
>>> Notice that we have a new 'top' result.  Finally, let's get all of the
>>> results by setting rows to something large, but I'll only show the first
>>> result because that's all you need to see:
>>>
>>> GET /solr/clips/select?q=user%3Ad33af3cca29a43e63e8f6a52dfdd99a61f7b7906%20AND%20private%3A1&start=0&rows=1000&wt=json&sort=ctime
>>>
>>> title: HTTP cookie - Wikipedia, the free encyclopedia
>>> ctime: 98701587317266
>>> delta: 0
>>> ----------
>>>
>>>
>>> We now get a record that we haven't seen yet.
>>>
>>>
>>> I can confirm that when I get all of the results, they are in the
>>> properly sorted order.  I also believe that any of my smaller result sets
>>> are also in proper sort order.  However, it also appears that when I ask for
>>> a specific number of rows with a non-zero start value, then the start
>>> index is not handled correctly.  FWIW, these results don't have anything to
>>> do with the client library because I reproduced them over the REST
>>> interface.
>>>
>>> Any ideas?
>>>
>>> -- GWF
>>>
>>
>>
>