riak search - creating many indexes for one inserted object

Ryan Zezeski rzezeski at basho.com
Fri Oct 19 12:21:48 EDT 2012


Pawel,

On Tue, Oct 9, 2012 at 5:21 PM, kamiseq <kamiseq at gmail.com> wrote:

> hi all,
>
> right now we are using solr as search index and we are inserting data
> manually. so there is nothing to stop us from creating many indexes
> (sort of views) on same entity, aggregate data and so on.
> can something like that be achieved with riak search??
>

Just to be sure I understand you: when you say "many indexes", do you mean
something like writing to multiple Solr cores?  If so, no, Riak Search
cannot do that.  It writes to a single index, named after the bucket the
hook is installed on.


> I think that commit hooks are a good point to start with, but as I read,
> the search index is kept in a different format than bucket data and I
> would love to still use a solr-like api to search the index.
>

Yes, Riak Search stores index data in a backend called Merged Index.  Riak
Search has a Solr _like_ interface but it lacks many features, and doesn't
have the same semantics or performance characteristics.

There is a new project underway called Yokozuna which tightly integrates
Riak and Solr.  If you like Solr then keep an eye on this.  I'm looking for
people who want to prototype on it so if that interests you please email me
directly.

https://github.com/rzezeski/yokozuna


> example
>
> I have two entities, cars and parking_lots; each car references the
> parking lot it belongs to.
> when I create/update/delete a car object I would like to not only update
> the car index (so I can search by car type, name, number plates, etc) but
> also update the parking index to easily check how many cars a given lot
> has (plus search lots by cars, or search cars with a given property).
>

Why have a separate index at all?  Is it not good enough to have just the
car index?  Each doc would have a 'parking_lot_s' field.

"How many cars a given lot has" -- would be numFound on q=parking_lot_s:foo.

"Search lots by cars" -- I'm guessing you mean something like "tell me what
lots have cars like this", sounds like a facet on 'parking_lot_s', right?

"Search cars with a given property" -- like the last query but no facet.
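
As a rough sketch, the three queries above could be expressed as Solr-style
request parameters like this.  The 'parking_lot_s' field and the value 'foo'
come from the examples above; the helper names and the 'type_s' field are
hypothetical.  The 'rows', 'facet' and 'facet.field' parameters are standard
Solr parameters, and faceting in particular assumes a real Solr behind the
interface, since Riak Search's Solr-like interface lacks many features.

```python
from urllib.parse import urlencode


def count_cars_in_lot(lot):
    # rows=0: only numFound is of interest, not the matching docs.
    return urlencode({"q": f"parking_lot_s:{lot}", "rows": 0})


def lots_with_cars_like(query):
    # Facet on parking_lot_s to get, per lot, the number of matching cars.
    return urlencode({
        "q": query,
        "rows": 0,
        "facet": "true",
        "facet.field": "parking_lot_s",
    })


def cars_with_property(query):
    # Same kind of query as above, but returning the car docs, no facet.
    return urlencode({"q": query})


if __name__ == "__main__":
    print(count_cars_in_lot("foo"))
    print(lots_with_cars_like("type_s:sedan"))
    print(cars_with_property("type_s:sedan"))
```

Each helper only builds the query string; how it is sent (host, port, index
path) depends on the setup and is left out on purpose.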


> probably all this can be achieved in many other ways. I can imagine
> storing an array of direct references in the parking object and updating
> this object whenever a car object changes. but this way I need to issue
> two asynchronous write requests with no guarantees that both will be
> persisted.
>

Yes.  This is a problem with two Solr cores as well.  I'm not sure if this
is a toy example but I don't see the need for 2 indexes.  I potentially see
2 buckets: 'cars' and 'lots'.  But that doesn't mean it has to be two
indexes.  Does that make sense?
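
To illustrate why one index can be enough, here is a toy in-memory model,
not Riak Search or Solr code.  Every car doc carries a 'parking_lot_s'
field, so a single car write is all that has to succeed and the per-lot
answers are derived at query time.  All names other than 'parking_lot_s'
are hypothetical.

```python
from collections import Counter

# Toy stand-in for the single car index: key -> car document.
cars = {}


def put_car(key, doc):
    # One write per car change; there is no second lot index to keep in sync.
    cars[key] = doc


def delete_car(key):
    cars.pop(key, None)


def num_found(lot):
    # "How many cars a given lot has": count docs whose parking_lot_s matches.
    return sum(1 for doc in cars.values() if doc["parking_lot_s"] == lot)


def lots_by_cars(predicate):
    # "Search lots by cars": facet-like counts per lot over matching cars.
    return Counter(
        doc["parking_lot_s"] for doc in cars.values() if predicate(doc)
    )


if __name__ == "__main__":
    put_car("c1", {"type_s": "sedan", "parking_lot_s": "foo"})
    put_car("c2", {"type_s": "van", "parking_lot_s": "foo"})
    print(num_found("foo"))
    print(lots_by_cars(lambda doc: doc["type_s"] == "sedan"))
```

Deleting or moving a car changes the lot counts automatically, because
nothing about a lot is stored outside the car docs.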

-Z