Geospatial advice?

Mark Rose markrose at
Tue May 1 15:02:28 EDT 2012

In general I've been shying away from datastores that aren't
highly available. In a world of zero-downtime expectations, single-box
solutions are out. Galera is nice on the SQL side but doesn't scale beyond
a few boxes. I am also looking for a tool that offers mapreduce, which
eliminates any SQL tool I know of. MongoDB might have sharding and
mapreduce, but suffers from a global insert write lock and doesn't
guarantee data persistence. The best comparison I've found of the different
datastores is at .
Riak appeals to me for its high scalability, plus the ability to add new
nodes/CPU easily.


On Tue, May 1, 2012 at 1:32 PM, Will Moss <wmoss at> wrote:

> I remember someone once going on a rant about how there's no silver bullet
> database (if you have not read this, do so), so I'm, of course, going to
> agree with Sean.
> If you're going to need to run this on more than one machine, then going
> with something like Riak makes more sense. Postgres has no built-in sharding
> functionality, and it's not clear to me that MongoDB's 2d indexes work in a
> sharded configuration.
> Will
> On Tue, May 1, 2012 at 10:16 AM, Sean Cribbs <sean at> wrote:
>> In contrast to Alexander's assessment, I'd say "it depends". I have built
>> some geospatial indexes on top of Riak using a geohashing scheme based on
>> the Hilbert space-filling curve. However, I had to choose specific levels
>> of "zoom" and precompute them. Now that we have secondary indexes, you
>> could perhaps bypass the precomputation step. In general, if you know the
>> geometry of the space you want to query, you can fairly trivially compute
>> the names of the geohashes you need to look up and then either fetch
>> individual keys for those (if you precompute them), or use MapReduce to
>> fetch a range of them. It's not automatic, for sure, but the greatest
>> complexity will be in deciding which granularities of index to support.
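
A minimal sketch of the lookup described above, under two assumptions: it
uses the standard base-32 geohash rather than the Hilbert-curve scheme, and
`covering_cells` is a hypothetical helper name, not part of any Riak client.
The `precision` argument plays the role of the "zoom" level.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat, lon, precision):
    """Encode a point as a standard base-32 geohash of `precision` characters."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, value = [], 0
    for i in range(precision * 5):
        if i % 2 == 0:  # even bits halve the longitude interval
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:  # odd bits halve the latitude interval
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        if i % 5 == 4:  # every five bits become one character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


def covering_cells(south, west, north, east, precision):
    """Hypothetical helper: geohashes of every cell touching the box.

    Assumes west <= east (no antimeridian wrap).
    """
    lon_bits = (precision * 5 + 1) // 2
    lat_bits = (precision * 5) // 2
    width = 360.0 / (1 << lon_bits)
    height = 180.0 / (1 << lat_bits)

    def index(value, origin, size, count):
        # Clamp so that lat=90 or lon=180 falls in the last cell.
        return min(int(math.floor((value - origin) / size)), count - 1)

    x0 = index(west, -180.0, width, 1 << lon_bits)
    x1 = index(east, -180.0, width, 1 << lon_bits)
    y0 = index(south, -90.0, height, 1 << lat_bits)
    y1 = index(north, -90.0, height, 1 << lat_bits)
    cells = []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            # Encode the cell center so rounding cannot push us over an edge.
            cells.append(encode(-90.0 + (y + 0.5) * height,
                                -180.0 + (x + 0.5) * width, precision))
    return cells
```

For example, `covering_cells(57.64, 10.40, 57.66, 10.41, 3)` yields the two
cells `u4p` and `u4r`, because the box straddles a cell edge. Each name could
then be fetched as a precomputed key or used as an index prefix.
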
>> On Tue, May 1, 2012 at 12:44 PM, Alexander Sicular <siculars at> wrote:
>>> My advice is to not use Riak. Check mongo or Postgres.
>>> @siculars on twitter
>>> Sent from my iRotaryPhone
>>> On May 1, 2012, at 9:18, Mark Rose <markrose at> wrote:
>>> > Hello everyone!
>>> >
>>> > I'm going to be implementing Riak as a storage engine for geographic
>>> data. Research has led me to using geohashing as a useful way to filter
>>> out results outside of a region of interest. However, I've run into some
>>> stumbling blocks and I'm looking for advice on the best way to proceed.
>>> >
>>> > Querying efficiently by geohash involves querying several regions
>>> around a point. From what I can tell, Riak offers no way to query a
>>> secondary index with multiple ranges. Having to query several ranges,
>>> merge them in the application layer, then pass them off to mapreduce seems
>>> rather silly (and could mean passing GBs of data). Alternatively, I could
>>> start straight with mapreduce, but key filtering seems to work only with
>>> the primary key, which would force me into using the geohashed location as
>>> the primary key (which would lead to collisions if two things existed at
>>> the same point). I'd also like to avoid using the geohash as the primary
>>> key, because if the item moves I'd have to change all the references to it.
>>> Lastly, I could do a less efficient mapreduce over a less precise geohash,
>>> but this doesn't solve the issue of the equator (anything near the equator
>>> would require mapreducing the entire dataset).
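
To make the equator problem concrete, here is a small sketch. It assumes the
standard base-32 geohash (a Hilbert-curve scheme would place its seams
differently), and `shared_prefix` is a hypothetical helper name.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat, lon, precision):
    """Encode a point as a standard base-32 geohash of `precision` characters."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, value = [], 0
    for i in range(precision * 5):
        if i % 2 == 0:  # even bits halve the longitude interval
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:  # odd bits halve the latitude interval
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        if i % 5 == 4:  # every five bits become one character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


def shared_prefix(a, b):
    """Longest common leading substring of two geohashes."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


# Two points a couple of hundred meters apart, on either side of the equator.
north = encode(0.001, 10.0, 6)
south = encode(-0.001, 10.0, 6)
print(north, south, repr(shared_prefix(north, south)))
```

The two hashes start with `s` and `k` respectively, so they share no prefix
at all. Shortening the geohash therefore never produces a single range that
covers both points, which is why a coarser prefix does not help at the seam.
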
>>> >
>>> > Is there any way to query multiple ranges with a secondary index and
>>> pass that off to mapreduce? Or should I just stick with the less efficient
>>> mapreduce, and when near the equator, run two queries and later merge them?
>>> Or am I going about this the wrong way?
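
For reference, the application-layer merge described above would look
roughly like this. `range_query` is a hypothetical stand-in for whatever
secondary-index range call the client exposes (it is not a real Riak API),
`INDEX` is made-up sample data, and the `~` upper bound assumes index values
are plain geohash strings compared bytewise.

```python
def prefix_range(prefix):
    """Inclusive (lo, hi) bounds matching every geohash that starts with prefix."""
    # "~" (0x7E) sorts after every character in the geohash alphabet.
    return prefix, prefix + "~"


def query_cells(prefixes, range_query):
    """Run one range query per cell prefix and union the keys that come back."""
    seen = set()
    merged = []
    for prefix in sorted(set(prefixes)):
        lo, hi = prefix_range(prefix)
        for key in range_query(lo, hi):
            if key not in seen:  # a key may be returned by more than one range
                seen.add(key)
                merged.append(key)
    return merged


# Stand-in for a secondary index: (geohash, primary key) pairs.
INDEX = [
    ("u4pruy", "buoy-1"),
    ("u4prvz", "buoy-2"),
    ("u4r2k0", "ship-7"),
    ("u4xbcd", "far-away"),
]


def fake_range_query(lo, hi):
    """Hypothetical range lookup over the in-memory INDEX above."""
    return [key for geohash, key in sorted(INDEX) if lo <= geohash <= hi]


print(query_cells(["u4p", "u4r"], fake_range_query))
```

Only the merged keys would then need to be handed to mapreduce as inputs,
rather than the objects themselves, though that still costs one round trip
per range.
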
>>> >
>>> > In any case, the final stage of my queries will involve mapreduce as
>>> I'll need to further filter the items found in a region.
>>> >
>>> > Thank you,
>>> > Mark
>>> > _______________________________________________
>>> > riak-users mailing list
>>> > riak-users at
>>> >
>> --
>> Sean Cribbs <sean at>
>> Software Engineer
>> Basho Technologies, Inc.