Experimental branch - 2i query improvements

Olav Frengstad olav at fwt.no
Wed Apr 17 06:06:38 EDT 2013


The features sound very promising. From an ease-of-use perspective,
features #3 and #4 definitely have great value. Being able to do
multi-conditional 2i queries based on composite keys is something
I'm personally excited about.

Out of curiosity, have you thought about adding further flexibility
to the query iterator by allowing "pluggable" match functions?
I'm thinking especially of querying ISO 8601 dates that match
certain criteria (e.g. get all data between 08:00 and 16:00 in August for
all years). At the very least, making room for more advanced matching
functions could be a good idea.
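
To make the idea concrete, here is a rough sketch in Python (not Erlang,
and not based on the branch's actual code). It assumes index terms are
ISO 8601 strings and models the index as a plain list of (term, key)
pairs. The names range_query, regex_matcher and time_window_matcher are
all hypothetical and exist only for illustration.

```python
import re
from datetime import datetime, time
from typing import Callable, Iterable, Iterator, Optional, Tuple

Entry = Tuple[str, str]          # (index term, object key)
Matcher = Callable[[str], bool]  # hypothetical "pluggable" match function


def range_query(entries: Iterable[Entry], start: str, end: str,
                match: Optional[Matcher] = None) -> Iterator[Entry]:
    """Yield (term, key) pairs with start <= term <= end that pass `match`."""
    for term, key in sorted(entries):
        if term < start:
            continue
        if term > end:
            break  # entries are sorted, so nothing further can be in range
        if match is None or match(term):
            yield term, key


def regex_matcher(pattern: str) -> Matcher:
    """Build a matcher from a regular expression (in the spirit of feature #4)."""
    compiled = re.compile(pattern)
    return lambda term: compiled.search(term) is not None


def time_window_matcher(month: int, opens: time, closes: time) -> Matcher:
    """Build a matcher for timestamps in `month`, between `opens` and `closes`."""
    def match(term: str) -> bool:
        try:
            stamp = datetime.fromisoformat(term)
        except ValueError:
            return False  # term is not a timestamp, so it cannot match
        return stamp.month == month and opens <= stamp.time() <= closes
    return match


# Made-up sample data: ISO 8601 timestamps as terms, arbitrary keys.
INDEX = [
    ("2011-08-03T09:15:00", "k1"),
    ("2011-08-03T17:40:00", "k2"),
    ("2012-07-21T10:00:00", "k3"),
    ("2012-08-14T15:59:59", "k4"),
    ("2013-08-01T08:00:00", "k5"),
]

# "All data between 08:00 and 16:00 in August, for all years."
august_office_hours = time_window_matcher(8, time(8, 0), time(16, 0))
hits = list(range_query(INDEX, "2011", "2014", august_office_hours))
```

A regular expression can approximate the same filter, for example
regex_matcher(r"-08-\d{2}T(0[89]|1[0-5])"), but it cannot easily include
16:00:00 exactly, and it gets unwieldy for anything like weekdays or
time zones. That is the kind of case where a match function would help.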

Olav

2013/4/16 Martin Sumner <martin.sumner at adaptip.co.uk>:
> I've been working on an experimental branch to offer some improvements to
> the functionality and performance of 2i queries in Riak:
> https://github.com/martinsumner/riak_kv
>
> Explanation:
> https://github.com/martinsumner/riak_kv/blob/master/docs/index_speedup.md
>
> There are four basic features that are included:
> 1. The ability to pin particular 2i indexes into memory (without loss of
> consistency on restart of a node)
> 2. The ability to set partition-level static bloom filters for particular 2i
> indexes to greatly reduce the disk overheads of exact-term queries with
> small result sets (e.g. for queries by a secondary identifier such as email
> address)
> 3. The ability to return index terms, not just keys, as results of a query -
> so that those terms can be overloaded with additional information which can
> then be filtered by the application without requiring a M/R stage (note this
> is already available via Russell Brown's branch -
> https://github.com/basho/riak_kv/tree/pt34-index-values)
> 4. The ability to pass a regular expression to the query iterator - so that
> range queries will be filtered based on matches to that regular expression
> (for example allowing for non-trailing wildcards) before returning the keys
> and terms
>
> Testing is slight at the moment, both functionally and non-functionally.
> This is still very much an experiment. We're hoping to do some full-scale
> volume testing on the branch in the next couple of weeks.
>
> The branch has been developed to solve some problems we have with edge cases
> in our implementation for the NHS in England - where we have to support
> tracing across an 80M record demographic database.  I'd be interested if
> people thought it had value in other environments.
>
> Regards
>
> Martin



-- 
Kind regards
Olav Frengstad

Systems Developer // FWT
+47 920 42 090



