Secondary Indexes - Feedback?

Elias Levy fearsome.lucidity at
Wed Nov 30 18:05:20 EST 2011

On Wed, Nov 30, 2011 at 1:32 PM, <riak-users-request at> wrote:

> Here at Clipboard, we make very heavy use of Riak Search and a couple of
> manual indices here and there. I've wanted to use 2i a few times but have
> decided against it for a few reasons:
> 1) Apprehension about the coverage set query, as Matt articulated.

That's a concern, but you gain parallelism, compared to Search's single
term index.

> 2) Lack of ordering of returned results. Generally I just want the top 10
> or so, and the ordering information is in the primary key. I can accomplish
> this with search via the presort parameter.

Not sure why this would be a concern.  Search's presort option must have
the full result set before it can fully sort it, no?  There is no reason
why sorting the results of a 2i query should be any slower.  In addition,
2i is stored in leveldb, and leveldb, like merge_index if I recall
correctly, stores keys and values sorted. Thus, the result set is already
partially ordered.
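
To illustrate the partial-ordering point: if each partition hands back its
matching keys already sorted, getting the top 10 is a k-way merge rather
than a full sort. A minimal Python sketch, assuming made-up per-partition
result lists (top_n and the sample keys are hypothetical, and this is not
a claim about how Riak itself assembles 2i results):

```python
import heapq
from itertools import islice


def top_n(partition_results, n=10):
    """Return the n smallest keys across per-partition sorted key lists."""
    # heapq.merge lazily merges inputs that are each already sorted, so
    # the merged stream is only consumed as far as the first n items.
    return list(islice(heapq.merge(*partition_results), n))


# Hypothetical sorted keys returned by three partitions.
partitions = [
    ["k01", "k04", "k09"],
    ["k02", "k03", "k10"],
    ["k05", "k06", "k07"],
]

print(top_n(partitions, n=4))
```

If the ordering information is embedded in the primary key, as in the
quoted use case, the same merge works on the keys directly.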

> To me, the implementations of search and 2I are backwards. Search has
> scalability issues because term-based partitioning optimizes for
> single-term queries, but creates huge hotspots making many AND queries
> prohibitively expensive. 2I's document-based partitioning makes single-term
> queries more expensive (coverage set) but should allow AND queries to
> scale. But 2i only supports single-term queries!

While single-term queries are more expensive in the sense that they
require more nodes to participate, they split the load between those
nodes. Overall, the total work should be about the same, and unless the
nodes are busy with other work, the query should complete sooner, since
each node has less work to do.
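
A toy model of that argument, for a single-term lookup. This is only a
sketch under simplifying assumptions: the function names, the 4-node
cluster, and the hashing scheme are all hypothetical and are not Riak's
actual partitioning code. It counts how many index entries each node has
to read under the two layouts:

```python
import zlib
from collections import defaultdict


def build_term_partitioned(docs, n_nodes):
    """All entries for a term live on one node, chosen by hashing the term."""
    nodes = [defaultdict(list) for _ in range(n_nodes)]
    for doc_id, terms in docs.items():
        for term in terms:
            owner = zlib.crc32(term.encode()) % n_nodes
            nodes[owner][term].append(doc_id)
    return nodes


def build_doc_partitioned(docs, n_nodes):
    """Each node indexes only the documents it stores."""
    nodes = [defaultdict(list) for _ in range(n_nodes)]
    for doc_id, terms in docs.items():
        owner = doc_id % n_nodes
        for term in terms:
            nodes[owner][term].append(doc_id)
    return nodes


def entries_read(nodes, term):
    """Index entries each node reads to answer a single-term query."""
    return [len(node.get(term, [])) for node in nodes]


# Eight hypothetical documents that all carry the term "red".
docs = {doc_id: ["red"] for doc_id in range(8)}

by_term = entries_read(build_term_partitioned(docs, 4), "red")
by_doc = entries_read(build_doc_partitioned(docs, 4), "red")

print("term-partitioned:", by_term, "total", sum(by_term))
print("doc-partitioned: ", by_doc, "total", sum(by_doc))
```

The totals match, but in the document-partitioned layout the largest
per-node share is a quarter of the term-partitioned one. The model
ignores the coordination and network overhead of contacting every node.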