Ryan Zezeski rzezeski at basho.com
Tue Apr 24 16:37:40 EDT 2012

On Tue, Apr 24, 2012 at 1:29 PM, Elias Levy <fearsome.lucidity at gmail.com> wrote:

> On Mon, Apr 23, 2012 at 6:26 PM, <riak-users-request at lists.basho.com> wrote:
>> I'm also doing work on this to make conjunction queries safer, do less
>> work, and have better latencies.  A query that produces a "large" result
>> set is still problematic but a conjunction of small and large result sets
>> will be much, much better.  I saw very significant improvement on a
>> particular work load but I need to do more benchmarking on "real"
>> hardware.
>> https://github.com/rzezeski/riak_search/tree/rz-conjunction
>> https://github.com/rzezeski/riak_search/tree/rz-no-unnecessary-work
> I saw some code in there about term frequency.  Does that mean that some
> day we may see a faceting API?
> That sort of thing would be incredibly useful for generating statistics
> without having to compute them on our own.

I would love to see faceting support in Search.  It would be good for
generating statistics, as you say, and dynamic topologies.  The term
frequency in Search is currently an estimate (sometimes it is exact).  I
did toy around with a pure Erlang term-frequency index but haven't done
much with it.  It's my intention to explore this more in the future as I
think an efficient term-freq index is very important.  I'd also like to
look at any C search libs out there that could augment or replace merge
index.  I haven't spent much time thinking about how to implement
facets.  I imagine you also need an efficient field->unique terms mapping,
and since Riak Search is distributed, a covering query is needed to collate
everything.  Like everything else, it's just a matter of throwing together
some code ;)
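
To make the collation idea concrete, here is a rough sketch in Python. It is
not how Riak Search or merge index works internally: the per-partition
layout ({field: {term: set of doc ids}}) and the names local_facet_counts
and facet are made up for illustration, and it assumes the covering set
sees each posting exactly once (no replicas counted twice).

```python
from collections import Counter


def local_facet_counts(partition_index, field):
    """Count documents per term for one field on a single partition.

    partition_index is a hypothetical in-memory stand-in for a
    field -> unique terms mapping: {field: {term: set_of_doc_ids}}.
    """
    terms = partition_index.get(field, {})
    return Counter({term: len(doc_ids) for term, doc_ids in terms.items()})


def facet(partitions, field, top_n=None):
    """Merge per-partition counts, the way a covering query would collate them.

    Assumes each posting appears in exactly one of the given partitions.
    """
    total = Counter()
    for partition_index in partitions:
        # Counter.update adds counts together instead of overwriting them.
        total.update(local_facet_counts(partition_index, field))
    return total.most_common(top_n)


if __name__ == "__main__":
    partitions = [
        {"color": {"red": {1, 2}, "blue": {3}}},
        {"color": {"red": {4}, "green": {5}}},
    ]
    print(facet(partitions, "color"))
```

The expensive part in a real system would be the field->unique terms lookup
on each partition, which is exactly why an efficient term-freq index matters.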
