Is Riak appropriate for website metrics?
aphyr at aphyr.com
Mon Nov 28 17:24:52 EST 2011
For limited mapreduce (where you know the keys in advance), Riak would be
a fine choice. 500 million keys with n_val 3 is readily achievable on
commodity hardware; say, four nodes with 128GB SSDs.
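A back-of-the-envelope check of those numbers, as a rough sketch only. It assumes decimal gigabytes, an even spread of replicas across nodes, and no storage-engine overhead or free-space headroom, none of which is stated above. The growth figures (25 million records, taking 4.5 million per month as the midpoint of 4-5 million) come from the quoted message below; the function names are made up for illustration.

```python
import json

# Figures from this thread.
KEYS = 500_000_000
N_VAL = 3                      # replicas stored per key
NODES = 4
DISK_PER_NODE = 128 * 10**9    # assumption: 128 GB means decimal gigabytes


def bytes_per_replica(keys, n_val, nodes, disk_per_node):
    """Raw disk available per stored replica, ignoring all overhead."""
    return (nodes * disk_per_node) / (keys * n_val)


def months_until(target_keys, current_keys, keys_per_month):
    """Months of steady growth needed to go from current_keys to target_keys."""
    return (target_keys - current_keys) / keys_per_month


# The sample record from the quoted message, used to estimate value size.
sample = {
    "id": "c4473dc5cfc5da53831d47c4c016d1c7de0a31e4fd94229e47ade569ef011a7b",
    "type": "Photo::Click",
    "user_id": 2640,
    "photo_id": 255,
    "ip": "100.101.102.103",
    "created_at": "2011/04/08 17:09:40 -0700",
}

budget = bytes_per_replica(KEYS, N_VAL, NODES, DISK_PER_NODE)
months = months_until(KEYS, 25_000_000, 4_500_000)
sample_size = len(json.dumps(sample).encode("utf-8"))

print(f"disk budget per replica: {budget:.0f} bytes")
print(f"sample record as JSON:   {sample_size} bytes")
print(f"months to reach {KEYS:,} keys: {months:.1f}")
```

Under those assumptions the budget is about 341 bytes per replica, and at the stated growth rate 500 million keys is close to nine years away, so record size matters more than key count in the near term.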
If large-scale mapreduce (more than a few hundred thousand keys) is
important, or listing keys is critical, you might consider HBase.
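For contrast, the limited case where the keys are known in advance looks roughly like this. It is a sketch of a job description for Riak's HTTP MapReduce endpoint (`/mapred`) with explicit bucket/key pairs as inputs, so no key listing is needed. The bucket name `metrics`, the second key, and the helper `build_job` are hypothetical; `Riak.mapValuesJson` is one of Riak's built-in JavaScript map functions. The code only builds the JSON body and does not talk to a cluster.

```python
import json


def build_job(bucket, keys):
    """Build a MapReduce job body whose inputs are explicit [bucket, key] pairs."""
    return {
        "inputs": [[bucket, key] for key in keys],
        "query": [
            # Single map phase: parse each stored value as JSON and return it.
            {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}},
        ],
    }


# Hypothetical keys; in practice these would come from your own index
# (for example, ids you already track per user or per photo).
known_keys = [
    "c4473dc5cfc5da53831d47c4c016d1c7de0a31e4fd94229e47ade569ef011a7b",
    "another-known-key",
]

job = build_job("metrics", known_keys)
body = json.dumps(job)
print(body)
```

The resulting string would be sent as the body of a POST with `Content-Type: application/json`. The cost of a job like this scales with the number of inputs, which is why a few hundred thousand keys is about where it stops being comfortable.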
If you start hitting latency/write bottlenecks, it may be worth
accumulating metrics in Redis before flushing them to disk.
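One way to read "accumulate, then flush" is sketched below. This is an illustration, not a description of any existing setup: a plain in-memory `Counter` stands in for Redis, the `sink` callable stands in for whatever writes a batch to the durable store, and the class name, method names, and `max_pending` threshold are all invented for the example. It aggregates counts per (event type, photo id) instead of keeping raw events, which is only one of several reasonable choices.

```python
from collections import Counter


class MetricsBuffer:
    """Accumulate event counts in memory and hand them to a sink in batches."""

    def __init__(self, sink, max_pending=1000):
        self.sink = sink                # callable that persists one batch
        self.max_pending = max_pending  # flush after this many recorded events
        self.counts = Counter()
        self.pending = 0

    def record(self, event_type, photo_id):
        """Count one event; flush automatically once the threshold is reached."""
        self.counts[(event_type, photo_id)] += 1
        self.pending += 1
        if self.pending >= self.max_pending:
            self.flush()

    def flush(self):
        """Send the accumulated counts to the sink and reset the buffer."""
        if not self.counts:
            return
        batch = dict(self.counts)
        self.counts.clear()
        self.pending = 0
        self.sink(batch)


# Usage: a list stands in for the durable store.
flushed = []
buf = MetricsBuffer(flushed.append, max_pending=3)
buf.record("Photo::Click", 255)
buf.record("Photo::Click", 255)
buf.record("Photo::Click", 256)  # third event reaches the threshold and flushes
print(flushed)
```

The durable store then sees one write per batch instead of one per event, at the cost of losing whatever is still buffered if the process dies before a flush.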
At Showyou, we're also building a custom backend called Mecha which
integrates Riak and Solr, specifically for this kind of analytics over
billions of keys. We haven't packaged it for open-source release yet,
but it might be worth talking about off-list.
On 11/28/2011 02:07 PM, Michael Dungan wrote:
> Sorry if this has been asked before - I couldn't find a searchable
> archive of this list.
> I was told to ask this list whether or not Riak would be appropriate for
> tracking our site's metrics. We are currently using Redis for this but
> are at the point where we need both clustering and m/r capability, and
> on the surface, Riak looks to fit this bill (we already use Erlang
> elsewhere in our app, so that's an additional plus).
> The records are pretty small and can be represented easily in JSON. An
> example:
> {
>   "id": "c4473dc5cfc5da53831d47c4c016d1c7de0a31e4fd94229e47ade569ef011a7b",
>   "type": "Photo::Click",
>   "user_id": 2640,
>   "photo_id": 255,
>   "ip": "100.101.102.103",
>   "created_at": "2011/04/08 17:09:40 -0700"
> }
> We currently have around 25 million records similar to this one, and are
> adding 4-5 million more each month.
> Is Riak appropriate for this use case? Are there any gotchas I need to
> be aware of?
> thank you,