Is Riak appropriate for website metrics?

Aphyr aphyr at aphyr.com
Mon Nov 28 18:18:20 EST 2011


Sure.

To clarify, Riak mapreduce is decent. We store hundreds of millions of 
objects without trouble, and run mapreduce over a few hundred keys per 
request with decent (50-500ms) latencies.
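To make the "known keys" case concrete, here's a minimal sketch of the job document you'd POST to Riak's /mapred HTTP endpoint. The bucket and key names are made up; the two phase functions (Riak.mapValuesJson, Riak.reduceSum) are Riak's built-in JavaScript helpers.

```python
import json

def build_mapred_job(bucket, keys):
    # Inputs are explicit bucket/key pairs -- this is the fast case,
    # because Riak never has to list the whole bucket to find them.
    return {
        "inputs": [[bucket, k] for k in keys],
        "query": [
            {"map": {"language": "javascript",
                     "name": "Riak.mapValuesJson"}},
            {"reduce": {"language": "javascript",
                        "name": "Riak.reduceSum"}},
        ],
    }

# Hypothetical daily-counter keys; the resulting JSON would be POSTed
# to http://<any-node>:8098/mapred with Content-Type: application/json.
job = build_mapred_job("metrics", ["2011-11-01", "2011-11-02"])
print(json.dumps(job, indent=2))
```

The point is that the inputs list names every key up front; latency stays in the tens-to-hundreds-of-milliseconds range because no key listing is involved.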

It's just not the best fit for a job over millions of keys; it will take 
much longer than a comparable job implemented in, say, Hadoop. It's also 
difficult to debug MR in Riak--but it's difficult to debug Hadoop as 
well. If either *could* work, the answer probably comes down to "do you 
have the man-hours and expertise necessary to keep Hadoop happy".

Riak can also collapse in horrible ways when asked to list huge numbers 
of keys. Some people say it just gets slow on their large installations. 
We've actually seen it hang the cluster altogether. Try it and find out! 
Basho understands this and is aiming to address it, but I've heard no 
specific timetable or plans. Meanwhile we pull keys out of the 
underlying storage directly, and cache them in Redis. That may be a 
viable solution for you.
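The caching pattern is roughly this (sketched with a tiny in-memory stand-in for a real Redis client; with redis-py you'd issue the same sadd/smembers calls against a live server, and the "riak-keys:" prefix is just a naming convention I made up):

```python
class FakeRedis:
    """In-memory stand-in with the same sadd/smembers shape as a
    real Redis client (e.g. redis-py). Illustration only."""
    def __init__(self):
        self._sets = {}

    def sadd(self, name, *values):
        self._sets.setdefault(name, set()).update(values)

    def smembers(self, name):
        return self._sets.get(name, set())

def cache_keys(client, bucket, keys):
    # One Redis set per bucket: keys scraped from the storage backend
    # go into Redis, so "list keys" never has to touch Riak at all.
    client.sadd("riak-keys:" + bucket, *keys)

def list_keys(client, bucket):
    return sorted(client.smembers("riak-keys:" + bucket))

r = FakeRedis()
cache_keys(r, "metrics", ["2011-11-01", "2011-11-02"])
print(list_keys(r, "metrics"))  # ['2011-11-01', '2011-11-02']
```

The design choice is simply to pay the cost of enumerating keys once, out-of-band, rather than on every listing request against the cluster.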

Mecha is something experimental that John Mullerleile is working on.

http://www.slideshare.net/jmuellerleile/scaling-with-riak-at-showyou

Basically, it's a new backend for Riak (if you weren't aware, Riak has 
pluggable storage backends). You still read and write to Riak as normal, 
but underneath the hood, it stores the data in leveldb (one per 
partition per vnode), and *also* indexes specially named fields in a 
local solr core on each node. Using the coverage code in Riak 1.0, we 
can then issue a solr query to some subset of nodes and receive a 
response for all the values stored in Riak. You can filter, count, 
facet, etc by text, numbers, multivalued texts, geolocation, etc. I 
would describe it as "scary fast".
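For a sense of what those queries look like, here's a sketch of a faceted Solr request over the metrics records discussed below in this thread. The core name and field names are hypothetical and just mirror that record's shape; only the Solr parameters (q, rows, facet, facet.field, wt) are standard.

```python
from urllib.parse import urlencode

def facet_clicks_by_photo(user_id):
    # Count one user's Photo::Click events, bucketed per photo_id,
    # without fetching any documents (rows=0: facet counts only).
    params = {
        "q": 'type:"Photo::Click" AND user_id:%d' % user_id,
        "rows": 0,
        "facet": "true",
        "facet.field": "photo_id",
        "wt": "json",
    }
    return "/solr/metrics/select?" + urlencode(params)

print(facet_clicks_by_photo(2640))
```

In the Mecha setup described above, the coverage code would fan a query like this out to a covering subset of nodes and merge the per-node results.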

The downside is that it's also experimental, and glues together a lot of 
different technologies. All those moving parts mean we haven't had time 
to package it up and open-source it yet, but sometime in December or 
January we're hoping to focus on polish and release.

--Kyle

On 11/28/2011 02:59 PM, Michael Dungan wrote:
> Thank you for getting back to me. It does look like we'll be needing to
> go big, as we're already at 5m new records/month, so just dealing with
> monthly numbers is already beyond the few hundred thousand keys you
> mentioned, unless I'm thinking about this wrong.
>
> I would love to hear more about Mecha if you're willing to share. Feel
> free to contact me off-list.
>
> thanks again,
>
> -mike
>
>
> On 11/28/11 2:24 PM, Aphyr wrote:
>> For limited mapreduce (where you know the keys in advance) Riak would be
>> a fine choice. 500 million keys, n_val 3 is readily achievable on
>> commodity hardware; say four nodes with 128GB SSDs.
>>
>> If large-scale mapreduce (more than a few hundred thousand keys) is
>> important, or listing keys is critical, you might consider HBase.
>>
>> If you start hitting latency/write bottlenecks, it may be worth
>> accumulating metrics in Redis before flushing them to disk.
>>
>> At Showyou, we're also building a custom backend called Mecha which
>> integrates Riak and SOLR, specifically for this kind of analytics over
>> billions of keys. We haven't packaged it for open-source release yet,
>> but it might be worth talking about off-list.
>>
>> --Kyle
>>
>> On 11/28/2011 02:07 PM, Michael Dungan wrote:
>>> Hi,
>>>
>>> Sorry if this has been asked before - I couldn't find a searchable
>>> archive of this list.
>>>
>>> I was told to ask this list whether or not Riak would be appropriate for
>>> tracking our site's metrics. We are currently using Redis for this but
>>> are at the point where we need both clustering and m/r capability, and
>>> on the surface, Riak looks to fit this bill (we already use Erlang
>>> elsewhere in our app, so that's an additional plus).
>>>
>>> The records are pretty small and can be represented easily in JSON. An
>>> example:
>>>
>>> {
>>> "id": "c4473dc5cfc5da53831d47c4c016d1c7de0a31e4fd94229e47ade569ef011a7b",
>>> "type": "Photo::Click",
>>> "user_id": 2640,
>>> "photo_id": 255,
>>> "ip": "100.101.102.103",
>>> "created_at": "2011/04/08 17:09:40 -0700"
>>> }
>>>
>>> We currently have around 25 million records similar to this one, and are
>>> adding 4-5 million more each month.
>>>
>>> Is Riak appropriate for this use case? Are there any gotchas I need to
>>> be aware of?
>>>
>>> thank you,
>>>
>>> -mike
>

More information about the riak-users mailing list