Geospatial advice?

Mark Rose markrose at markrose.ca
Tue May 1 14:09:41 EDT 2012


Well, I'd be indexing items over the entire globe. I'd be looking at
resolutions from an entire world view down to a city block. I'm thinking of
using geohashes as an index to restrict the result set, then further
filtering and sorting by mapreducing the remaining items. So I only need
enough granularity to reduce the number of items to a reasonable amount. At
the world view level, I'd filter out most results using mapreduce, but the
local-level queries would be far more common so an index would be highly
advantageous. The geometry I'd want to query would be a window that
arbitrarily overlaps one or more geohash regions. Basically, think of
plotting items in, say, Google Maps.
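
To make the "window overlapping several geohash regions" idea concrete, here
is a minimal sketch, assuming an integer geohash built by interleaving
longitude and latitude bits (longitude first). The function names `interleave`
and `cover` are hypothetical, and the window is assumed not to cross the
antimeridian. It lists the cells a query window touches and merges adjacent
hash values into ranges.

```python
def interleave(col, row, bits):
    """Build an integer geohash from a grid column (longitude) and row (latitude)."""
    lon_bits = (bits + 1) // 2  # longitude gets the extra bit when bits is odd
    lat_bits = bits // 2
    h = 0
    for i in range(bits):
        if i % 2 == 0:          # even positions carry longitude bits
            lon_bits -= 1
            h = (h << 1) | ((col >> lon_bits) & 1)
        else:                   # odd positions carry latitude bits
            lat_bits -= 1
            h = (h << 1) | ((row >> lat_bits) & 1)
    return h


def cover(south, west, north, east, bits):
    """Return sorted (start, end) hash ranges for all cells touching the window."""
    n_cols = 1 << ((bits + 1) // 2)
    n_rows = 1 << (bits // 2)

    def col_of(lon):
        return min(n_cols - 1, int((lon + 180.0) / 360.0 * n_cols))

    def row_of(lat):
        return min(n_rows - 1, int((lat + 90.0) / 180.0 * n_rows))

    cells = sorted(
        interleave(c, r, bits)
        for c in range(col_of(west), col_of(east) + 1)
        for r in range(row_of(south), row_of(north) + 1)
    )
    ranges = []
    for h in cells:
        if ranges and h == ranges[-1][1] + 1:
            ranges[-1][1] = h       # extend the current contiguous range
        else:
            ranges.append([h, h])   # start a new range
    return [tuple(r) for r in ranges]


# A small window straddling the equator and prime meridian, at 4 bits:
print(cover(-10, -10, 10, 10, 4))
```

A window that straddles the equator and the prime meridian comes back as four
disjoint single-cell ranges, which is why a single index range is not enough
there.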

Can you use a secondary index inside mapreduce? I haven't seen any examples
of it. I have only seen a secondary index being used to feed a mapreduce. I
am new to Riak.
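
For reference, the pattern of a secondary index feeding a mapreduce can be
sketched as one job per index range. This assumes the JSON job format that
Riak's HTTP /mapred endpoint accepted around this time; the bucket name
"items", the index name "geohash_int", and the helper `index_range_jobs` are
hypothetical. Nothing is sent over the network here, the sketch only builds
the payloads.

```python
import json


def index_range_jobs(bucket, index, ranges):
    """Build one MapReduce job description per (start, end) index range."""
    jobs = []
    for start, end in ranges:
        jobs.append({
            # A secondary-index range query supplies the inputs.
            "inputs": {"bucket": bucket, "index": index,
                       "start": start, "end": end},
            # A single map phase that returns each object's JSON value.
            "query": [{"map": {"language": "javascript",
                               "name": "Riak.mapValuesJson",
                               "keep": True}}],
        })
    return jobs


jobs = index_range_jobs("items", "geohash_int", [(3, 3), (6, 6)])
print(json.dumps(jobs[0], indent=2))
```

Each job covers a single range, so the results of several jobs would still
have to be merged by the client.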

I imagine my number of points would be at most 100 items per square km, but
typically less than 1 per square km. A 1 km resolution would be sufficient.
A 32-bit geohash would cover that fine. Vast regions of the Earth would
contain no points at all.
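
The resolution estimate can be checked with a little arithmetic. This is a
rough sketch, assuming a spherical Earth with a circumference of about
40,075 km and a geohash that splits its bits evenly between longitude and
latitude; `cell_size_km` is a hypothetical helper.

```python
import math

EARTH_CIRCUMFERENCE_KM = 40075.0  # approximate equatorial circumference


def cell_size_km(bits, lat=0.0):
    """Approximate (width, height) in km of one geohash cell at a latitude."""
    lon_bits = (bits + 1) // 2
    lat_bits = bits // 2
    # Cells get narrower east-west as you move away from the equator.
    width = EARTH_CIRCUMFERENCE_KM / (1 << lon_bits) * math.cos(math.radians(lat))
    # North-south extent is the same everywhere (half the circumference spans 180 degrees).
    height = (EARTH_CIRCUMFERENCE_KM / 2) / (1 << lat_bits)
    return width, height


width, height = cell_size_km(32)
print(f"{width:.2f} km x {height:.2f} km")
```

With 32 bits, a cell at the equator comes out to roughly 0.6 km wide and
0.3 km tall, so a 1 km resolution fits comfortably.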

-Mark

On Tue, May 1, 2012 at 1:16 PM, Sean Cribbs <sean at basho.com> wrote:

> In contrast to Alexander's assessment, I'd say "it depends". I have built
> some geospatial indexes on top of Riak using a geohashing scheme based on
> the Hilbert space-filling curve. However, I had to choose specific levels
> of "zoom" and precompute them. Now that we have secondary indexes, you
> could perhaps bypass the precomputation step. In general, if you know the
> geometry of the space you want to query, you can fairly trivially compute
> the names of the geohashes you need to look up and then either fetch
> individual keys for those (if you precompute them), or use MapReduce to
> fetch a range of them. It's not automatic, for sure, but the greatest
> complexity will be in deciding which granularities of index to support.
>
> On Tue, May 1, 2012 at 12:44 PM, Alexander Sicular <siculars at gmail.com> wrote:
>
>> My advice is to not use Riak. Check mongo or Postgres.
>>
>>
>> @siculars on twitter
>> http://siculars.posterous.com
>>
>> Sent from my iRotaryPhone
>>
>> On May 1, 2012, at 9:18, Mark Rose <markrose at markrose.ca> wrote:
>>
>> > Hello everyone!
>> >
>> > I'm going to be implementing Riak as a storage engine for geographic
>> > data. Research has led me to using geohashing as a useful way to filter
>> > out results outside of a region of interest. However, I've run into some
>> > stumbling blocks and I'm looking for advice on the best way to proceed.
>> >
>> > Querying efficiently by geohash involves querying several regions
>> > around a point. From what I can tell, Riak offers no way to query a
>> > secondary index with multiple ranges. Having to query several ranges,
>> > merge them in the application layer, then pass them off to mapreduce
>> > seems rather silly (and could mean passing GBs of data). Alternatively,
>> > I could start straight with mapreduce, but key filtering seems to work
>> > only with the primary key, which would force me into using the geohashed
>> > location as the primary key (which would lead to collisions if two
>> > things existed at the same point). I'd also like to avoid using the
>> > geohash as the primary key because if the item moves I'd have to change
>> > all the references to it. Lastly, I could do a less efficient mapreduce
>> > over a less precise geohash, but this doesn't solve the issue of the
>> > equator (anything near the equator would require mapreducing the entire
>> > dataset).
>> >
>> > Is there any way to query multiple ranges with a secondary index and
>> > pass that off to mapreduce? Or should I just stick with the less
>> > efficient mapreduce, and when near the equator, run two queries and
>> > later merge them? Or am I going about this the wrong way?
>> >
>> > In any case, the final stage of my queries will involve mapreduce as
>> > I'll need to further filter the items found in a region.
>> >
>> > Thank you,
>> > Mark
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users at lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>
>
> --
> Sean Cribbs <sean at basho.com>
> Software Engineer
> Basho Technologies, Inc.
> http://basho.com/
>
>