Geospatial advice?

Sean Cribbs sean at basho.com
Tue May 1 13:16:08 EDT 2012


In contrast to Alexander's assessment, I'd say "it depends". I have built
some geospatial indexes on top of Riak using a geohashing scheme based on
the Hilbert space-filling curve. However, I had to choose specific levels
of "zoom" and precompute them. Now that we have secondary indexes, you
could perhaps bypass the precomputation step. In general, if you know the
geometry of the space you want to query, you can fairly trivially compute
the names of the geohashes you need to look up and then either fetch
individual keys for those (if you precompute them), or use MapReduce to
fetch a range of them. It's not automatic, for sure, but the greatest
complexity will be in deciding which granularities of index to support.
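
To make the "compute the names you need to look up" step concrete, here is a rough Python sketch. It is not the code I used. It assumes a plain equirectangular grid of 2^order by 2^order cells over the whole globe, uses the textbook Hilbert-curve index for each cell, and assumes the bounding box does not cross the antimeridian. The function names (hilbert_index, cell_for_point, covering_indexes, collapse_ranges) are made up for illustration and nothing here talks to Riak.

```python
def hilbert_index(order, x, y):
    """Distance along a Hilbert curve of the given order for grid cell (x, y)."""
    n = 1 << order          # grid is n x n cells
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def cell_for_point(order, lat, lon):
    """Grid cell containing a point (assumed mapping: simple lat/lon grid)."""
    n = 1 << order
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return x, y


def covering_indexes(order, south, west, north, east):
    """Sorted Hilbert indexes of every cell touching the bounding box.

    Assumes south <= north and west <= east (no antimeridian crossing).
    """
    x0, y0 = cell_for_point(order, south, west)
    x1, y1 = cell_for_point(order, north, east)
    return sorted(
        hilbert_index(order, x, y)
        for x in range(x0, x1 + 1)
        for y in range(y0, y1 + 1)
    )


def collapse_ranges(indexes):
    """Merge runs of consecutive indexes into inclusive (start, end) ranges."""
    ranges = []
    for d in indexes:
        if ranges and d == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], d)
        else:
            ranges.append((d, d))
    return ranges


if __name__ == "__main__":
    # Roughly a box around Toronto at zoom level ("order") 8.
    cells = covering_indexes(8, 43.5, -79.7, 43.9, -79.1)
    print(collapse_ranges(cells))
```

Each (start, end) pair would become either a set of individual key fetches, if the cells are precomputed, or one range lookup. A higher order gives smaller cells and more ranges per query, which is the granularity trade-off mentioned above.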

On Tue, May 1, 2012 at 12:44 PM, Alexander Sicular <siculars at gmail.com> wrote:

> My advice is to not use Riak. Check mongo or Postgres.
>
>
> @siculars on twitter
> http://siculars.posterous.com
>
> Sent from my iRotaryPhone
>
> On May 1, 2012, at 9:18, Mark Rose <markrose at markrose.ca> wrote:
>
> > Hello everyone!
> >
> > I'm going to be implementing Riak as a storage engine for geographic
> data. Research has led me to using geohashing as a useful way to filter
> out results outside of a region of interest. However, I've run into some
> stumbling blocks and I'm looking for advice on the best way to proceed.
> >
> > Querying efficiently by geohash involves querying several regions around
> a point. From what I can tell, Riak offers no way to query a secondary
> index with multiple ranges. Having to query several ranges, merge them in
> the application layer, then pass them off to mapreduce seems rather silly
> (and could mean passing GBs of data). Alternatively, I could start straight
> with mapreduce, but key filtering seems to work only with the primary key,
> which would force me into using the geohashed location as the primary key
> (which would lead to collisions if two things existed at the same point).
> I'd also like to avoid using the geohash as the primary key, since if the item
> moves I'd have to change all the references to it. Lastly, I could do a
> less efficient mapreduce over a less precise geohash, but this doesn't
> solve the issue of the equator (anything near the equator would require
> mapreducing the entire dataset).
> >
> > Is there any way to query multiple ranges with a secondary index and
> pass that off to mapreduce? Or should I just stick with the less efficient
> mapreduce, and when near the equator, run two queries and later merge them?
> Or am I going about this the wrong way?
> >
> > In any case, the final stage of my queries will involve mapreduce as
> I'll need to further filter the items found in a region.
> >
> > Thank you,
> > Mark
> > _______________________________________________
> > riak-users mailing list
> > riak-users at lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



-- 
Sean Cribbs <sean at basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/