Geospatial advice?

Mark Rose markrose at markrose.ca
Tue May 1 09:18:34 EDT 2012


Hello everyone!

I'm going to be implementing Riak as a storage engine for geographic data.
Research has led me to using geohashing as a useful way to filter out
results outside of a region of interest. However, I've run into some
stumbling blocks and I'm looking for advice on the best way to proceed.

Querying efficiently by geohash involves querying several regions around a
point. From what I can tell, Riak offers no way to query a secondary index
with multiple ranges. Having to query several ranges, merge them in the
application layer, then pass them off to mapreduce seems rather silly (and
could mean passing GBs of data). Alternatively, I could start straight with
mapreduce, but key filtering seems to work only with the primary key, which
would force me into using the geohashed location as the primary key (which
would lead to collisions if two things existed at the same point). I'd also
like to avoid using the geohash as the primary key, because if the item
moves I'd have to change all the references to it. Lastly, I could do a less
efficient mapreduce over a less precise geohash, but this doesn't solve the
issue of the equator: cells on opposite sides of it share no common prefix,
so anything near the equator would require mapreducing the entire dataset.
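
To make "several regions around a point" concrete, here is a minimal
sketch of what I have in mind. It encodes a point with the standard
geohash alphabet, finds the cell containing the point plus its (up to)
eight neighbours, and turns each cell prefix into a start/end pair for a
range query. The function names and the stored precision of 12 characters
are my own assumptions for illustration; nothing here talks to Riak.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    use_lon = True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        use_lon = not use_lon
    chars = []
    for i in range(0, len(bits), 5):
        n = 0
        for b in bits[i:i + 5]:
            n = (n << 1) | b
        chars.append(BASE32[n])
    return "".join(chars)


def cell_size(precision):
    """Return (height, width) in degrees of one cell at this precision."""
    bits = precision * 5
    lon_bits = (bits + 1) // 2  # longitude gets the extra bit when odd
    lat_bits = bits // 2
    return 180.0 / (1 << lat_bits), 360.0 / (1 << lon_bits)


def covering_prefixes(lat, lon, precision=6):
    """Geohash of the point's cell and of its neighbouring cells."""
    height, width = cell_size(precision)
    prefixes = set()
    for dlat in (-1, 0, 1):
        for dlon in (-1, 0, 1):
            nlat = lat + dlat * height
            if not -90.0 <= nlat <= 90.0:
                continue  # no cell beyond the poles
            # wrap longitude across the antimeridian
            nlon = (lon + dlon * width + 180.0) % 360.0 - 180.0
            prefixes.add(geohash_encode(nlat, nlon, precision))
    return sorted(prefixes)


def index_ranges(prefixes, stored_precision=12):
    """Turn each prefix into a (start, end) pair for a range query.

    Assumes the index holds full geohashes of length stored_precision;
    '0' and 'z' are the smallest and largest characters of the alphabet.
    """
    return [(p.ljust(stored_precision, "0"), p.ljust(stored_precision, "z"))
            for p in prefixes]
```

For a point just north of the equator, covering_prefixes() returns cells
whose hashes start with different characters on either side of it, which
is why a single shorter prefix cannot cover the neighbourhood.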

Is there any way to query multiple ranges with a secondary index and pass
that off to mapreduce? Or should I just stick with the less efficient
mapreduce, and when near the equator, run two queries and later merge them?
Or am I going about this the wrong way?
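
One shape I have been considering, sketched below, is to submit one
mapreduce job per range and merge only the already-filtered results in
the application. This assumes the mapreduce HTTP interface accepts a
single secondary-index range as its "inputs" (bucket, index, start, end);
the bucket name "places", the index name "geohash_bin" and the map
function are placeholders of mine. The code only builds the JSON job
descriptions and merges result lists; it does not contact a server.

```python
import json


def build_jobs(bucket, index, ranges, map_source):
    """Build one mapreduce job description per (start, end) index range."""
    jobs = []
    for start, end in ranges:
        jobs.append({
            # assumed input form: a single secondary-index range
            "inputs": {"bucket": bucket, "index": index,
                       "start": start, "end": end},
            "query": [{"map": {"language": "javascript",
                               "source": map_source,
                               "keep": True}}],
        })
    return jobs


def merge_results(result_lists):
    """Concatenate per-job results, dropping duplicates, keeping order."""
    seen = set()
    merged = []
    for results in result_lists:
        for item in results:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


if __name__ == "__main__":
    ranges = [("u4pruy000000", "u4pruyzzzzzz"),
              ("u4pruz000000", "u4pruzzzzzzz")]
    # placeholder map phase: emit the key of every matched object
    jobs = build_jobs("places", "geohash_bin", ranges,
                      "function(v) { return [v.key]; }")
    for job in jobs:
        print(json.dumps(job))  # each body would be sent as its own job
```

The further filtering would happen inside the map function, so only the
surviving items from each range cross the wire before the merge.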

In any case, the final stage of my queries will involve mapreduce as I'll
need to further filter the items found in a region.

Thank you,
Mark