Web doc buglet

Tin Le tin at le.org
Wed Oct 31 18:50:20 EDT 2012

For some reason, I did not get the original reply from Mark.  Only saw it
when it was included in Guido's email.

> HA proxy + Riak + ElasticSearch are your friends. Solr lacks
> documentation (way outdated), and it is hard to find working examples
> and samples, so if you have your cluster well set up and you mean to do
> only key-value retrieval with the assistance of text index search using
> ElasticSearch, you are good.
> *Note:* We have Solr for GeoSpatial functionality and it is amazingly
> fast, but there isn't much we can do with it; if you need complex
> polygon features it gets complicated in Solr. Except for some incubation
> projects that will be brought into Solr, it is kind of hard to do
> anything.

We are not using Solr at the moment, and I don't want to add yet another
piece into our infrastructure unless I really, really have to.  If it is
the best option for our needs, I'll use it.

> Hope that helps,
> Guido.

> On 31/10/12 17:15, Mark Phillips wrote:

>>> One of the things we've found missing and really need is the geospatial
>>> indexing mongodb has.  We've just pushed our updated app to both iTunes
>>> and Google Playstore that uses this as an intrinsic part of our app.  We
>>> decided to stay with mongo as there was no time to code up an equivalent
>>> for riak.
>> So Riak isn't really a great fit for geospatial right now. You might
>> be able to fake it to some extent using secondary indexes and doing
>> range queries on lat/longs (stored as ints) but it might not be too
>> performant. The other thing worth noting is that Ryan Zezeski has been
>> hacking on a revised implementation of Riak Search that ties Riak and
>> Solr together quite nicely (and thus supports geospatial). It's called
>> Yokozuna [1], and it's still alpha, but it's worth looking at and
>> testing (as this code is getting more stable pretty quickly).
>> What are the specifics of the use case?
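
A minimal sketch of the range-query idea Mark describes, in Python.  It
only shows how lat/longs could be stored as ints and turned into ranges;
it makes no Riak calls.  The microdegree scale, the km-per-degree
constant and all function names are assumptions for illustration, and the
box math ignores the poles and the antimeridian.

```python
import math

SCALE = 1_000_000        # assumed fixed-point scale: degrees -> microdegrees
KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def to_int(deg):
    """Encode a coordinate in degrees as an integer (hypothetical helper)."""
    return int(round(deg * SCALE))


def bounding_box(lat, lon, radius_km):
    """Return integer (min, max) ranges for latitude and longitude
    covering a square box around the point."""
    dlat = radius_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = radius_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return {
        "lat": (to_int(lat - dlat), to_int(lat + dlat)),
        "lon": (to_int(lon - dlon), to_int(lon + dlon)),
    }


def in_box(box, lat_int, lon_int):
    """Filter step: check an already-encoded point against both ranges."""
    lat_lo, lat_hi = box["lat"]
    lon_lo, lon_hi = box["lon"]
    return lat_lo <= lat_int <= lat_hi and lon_lo <= lon_int <= lon_hi
```

One plausible use would be to run a range query on one integer index
(say latitude) with box["lat"], then apply in_box() to the results to
discard points outside the longitude range.  That second pass is part of
why this approach might not be too performant.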

Our app is a popular music recognition app for iOS, Android and other
mobile OSes (100+ million users).  We just pushed out a feature that
allows users to discover songs being "identified" by nearby users on a
map.  It works globally.  You can see what songs are being ID'ed around

As users ID a song, the song and user location are added to mongo.  This
info can be queried and displayed on a proximity map.

Since we were more familiar with mongo, and were under time pressure to
get this out, recoding would not work for us.

We will see how mongo scales for us over time for this particular feature.
No rush to go to riak now.

I have new HW on order for intended production usage.  It will be a
5-node cluster.  Each node has 16 cores, 64GB RAM (upgradeable to 512GB),
and 2TB in a RAID10 config.  If riak tests out, we will go with this HW.
Otherwise, wipe and put mongo on it.

The test bed originally used 64 partitions with 3 nodes.  But as I was
adding 2 more nodes, I read somewhere in the docs that I should use 256 or
512 with more than 3 nodes.  So I wiped it and upped it to 256 partitions
on a 5-node test bed.
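
For a rough feel of what those numbers mean per node, here is a small
Python sketch of the arithmetic.  It assumes partitions are spread as
evenly as possible across nodes, which is an idealization (the real
assignment is up to Riak), and the function names are made up for
illustration.

```python
def is_power_of_two(n):
    """Riak ring sizes are powers of two (64, 128, 256, 512, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def partitions_per_node(ring_size, nodes):
    """Idealized even split of ring partitions across nodes."""
    base, extra = divmod(ring_size, nodes)
    # The first `extra` nodes each take one leftover partition.
    return [base + 1 if i < extra else base for i in range(nodes)]


for ring_size, nodes in [(64, 3), (64, 5), (256, 5)]:
    print(ring_size, nodes, partitions_per_node(ring_size, nodes))
```

Under that assumption, 64 partitions on 5 nodes leaves only 12 or 13
partitions per node, while 256 gives 51 or 52 each.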

Regarding the docs: I would like to see a Best Practices section with
sizing HOWTOs and other configuration recommendations.  Perhaps a list of
usage scenarios and HW/SW configuration recs.  An explanation of why to
pick a certain config and what needs to be adjusted is helpful for
Current docs are fine for getting feet wet, just not enough for long
term, production usage.

I have more experience with mongo in a production environment than I have
with riak.  So anything that helps make it easier for me to get up to
speed is definitely good.

Right now, I am the sole person championing riak at $WORK$.  So it's
tough, heh :)


More information about the riak-users mailing list