Secondary Indexes - Feedback?

Greg Pascale greg at clipboard.com
Wed Nov 30 13:09:47 EST 2011


Here at Clipboard, we make very heavy use of Riak Search and a couple of manual indices here and there. I've wanted to use 2i a few times but have decided against it for a few reasons:

1) Apprehension about the coverage set query, as Matt articulated.

2) Lack of ordering of returned results. Generally I just want the top 10 or so, and the ordering information is in the primary key. I can accomplish this with search via the presort parameter. 
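
A rough sketch of the client-side workaround for point 2, assuming the unordered key list has already been fetched from a 2i query. The key format here (a zero-padded timestamp prefix followed by an id) is hypothetical and only stands in for "the ordering information is in the primary key":

```python
import heapq

def top_n_keys(keys, n=10):
    """Return the n largest keys, largest first.

    Assumes keys sort lexicographically in the desired order
    (hypothetical format: zero-padded timestamp, then an id).
    """
    return heapq.nlargest(n, keys)

# Unordered keys, as a 2i query might return them.
unordered = ["0000000300-c", "0000000100-a", "0000000200-b", "0000000400-d"]
print(top_n_keys(unordered, n=2))
```

This still pulls every matching key back to the client before truncating, which is exactly the cost a server-side presort avoids.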

To me, the implementations of search and 2i are backwards. Search has scalability issues because term-based partitioning optimizes for single-term queries but creates huge hotspots, making many AND queries prohibitively expensive. 2i's document-based partitioning makes single-term queries more expensive (coverage set) but should allow AND queries to scale. But 2i only supports single-term queries!
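
To make the trade-off concrete, here is a toy model in Python. It is not how Riak Search or 2i is actually implemented; the partition functions, the "shipped" cost measure (number of document ids sent to the coordinator) and the data set are all assumptions for illustration:

```python
import zlib
from collections import defaultdict

def bucket(name, n_partitions):
    """Deterministic hash of a string onto a partition number."""
    return zlib.crc32(name.encode()) % n_partitions

def build_term_partitioned(docs, n_partitions):
    """Each term's whole posting list lives on the partition owning the term."""
    parts = [defaultdict(set) for _ in range(n_partitions)]
    for doc_id, terms in docs.items():
        for term in terms:
            parts[bucket(term, n_partitions)][term].add(doc_id)
    return parts

def build_doc_partitioned(docs, n_partitions):
    """Each partition indexes only the documents it stores."""
    parts = [defaultdict(set) for _ in range(n_partitions)]
    for doc_id, terms in docs.items():
        local_index = parts[bucket(doc_id, n_partitions)]
        for term in terms:
            local_index[term].add(doc_id)
    return parts

def and_query_term_partitioned(parts, a, b):
    """Ship both full posting lists to a coordinator, then intersect."""
    n = len(parts)
    list_a = parts[bucket(a, n)].get(a, set())
    list_b = parts[bucket(b, n)].get(b, set())
    return list_a & list_b, len(list_a) + len(list_b)

def and_query_doc_partitioned(parts, a, b):
    """Intersect locally on every partition; ship only the matches."""
    result, shipped = set(), 0
    for local_index in parts:
        local = local_index.get(a, set()) & local_index.get(b, set())
        shipped += len(local)
        result |= local
    return result, shipped

# 1000 documents all carry "common"; every 100th also carries "rare".
docs = {f"doc{i}": {"common"} | ({"rare"} if i % 100 == 0 else set())
        for i in range(1000)}

term_result, term_shipped = and_query_term_partitioned(
    build_term_partitioned(docs, 8), "common", "rare")
doc_result, doc_shipped = and_query_doc_partitioned(
    build_doc_partitioned(docs, 8), "common", "rare")
print(term_shipped, doc_shipped)
```

In this model the term-partitioned AND moves the whole posting list for the common term off a single hot partition, while the document-partitioned AND touches every partition but only moves the matching ids.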

-- 
Greg
Clipboard

On Monday, November 21, 2011 at 10:18 PM, Fyodor Yarochkin wrote:

> > Have you tried Secondary Indexes?
> > Does the feature help solve your problems? If not, why not? Any concerns?
> > What is your wish list for the future of Secondary Indexes?
> 
> Yup. I think secondary indexes are probably one of the most-wanted
> features in this release. They have a big impact on how you can model
> your data. We discussed data modeling patterns internally
> here, and the cool thing about secondary indexes is that not only
> are queries possible, but the secondary index name can also be
> thought of as a dynamic variable. Thus, as long as you can predict the
> secondary index name, you can pretty much use it like an indexed field in
> an SQL data model. One thing we have not tested yet, though: whether there
> is a limit on the number of secondary indexes for a single object, and how
> the system behaves if the number of secondary indexes for a
> particular object is huge.
> 
> Another limitation (or would-be-good-to-have :-)) that we have
> noticed is that there is no straightforward way to query data by
> multiple secondary indexes at once. You can either do key filtering,
> or run one query, feed it to a map job, and then reduce by removing
> entries that do not match the second criterion, but you cannot query by
> secondaryAval_int/2 and secondaryBval_int/4 together. That said, I haven't
> really looked into the inner workings of the secondary index implementation,
> so I am simply commenting on this from a user perspective.
> 
> Other than this, it would be interesting to hear some comparisons of the
> performance of secondary index queries vs. SOLR indexes (riak_search).
> In our experience secondary indexes perform much faster on large volumes
> of data, but this could be just my impression.
> 
> cheers,
> -Fyodor
