Riak scenarios

Jeremiah Peschka jeremiah.peschka at gmail.com
Thu Aug 4 13:48:51 EDT 2011


This is just me theorizing, so someone can feel free to correct me:

Riak KV isn't a good fit for time series data because keys are spread across the cluster by hashing, so writes land in effectively random places; there's no sequential access order.
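To make the "writes are random" point concrete, here is a minimal sketch of how hashing scatters consecutive timestamp keys across partitions. It is a simplification under stated assumptions: the ring size of 64, the "bucket/key" string encoding, and the function names are all made up for illustration, and Riak's real key encoding differs.

```python
import hashlib

RING_SIZE = 64  # assumed number of partitions, for illustration only


def partition_for(bucket: str, key: str, ring_size: int = RING_SIZE) -> int:
    """Map a bucket/key pair to a partition number by hashing it.

    Hypothetical helper: hashes a "bucket/key" string with SHA-1 and
    scales the 160-bit result down to a partition number.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")  # point on a 2**160 ring
    return (position * ring_size) >> 160      # 0 .. ring_size - 1


def partitions_for_range(bucket: str, start_ts: int, count: int) -> list:
    """Partitions hit by `count` consecutive one-second timestamp keys."""
    return [partition_for(bucket, str(start_ts + i)) for i in range(count)]


if __name__ == "__main__":
    # One minute of consecutive timestamps, starting at the date of this mail.
    hits = partitions_for_range("metrics", 1312480131, 60)
    print(hits)
    print("distinct partitions touched:", len(set(hits)))
```

Keys that are neighbors in time are not neighbors on the ring, so reading back a time range means touching many partitions rather than one sequential run.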

Secondary indices speed up queries in any database because they impose a predictable order and structure that makes lookups quick. If Riak used some kind of monolithic (non-distributed) index, I could see this feature being useful for time series applications.

However, it looks like secondary indices are data local, meaning each node indexes only the data it stores (see slide 53 of http://www.slideshare.net/rklophaus/querying-riak-just-got-easier-introducing-secondary-indices). So you'll still run into many of the same locality problems you would hit running a MapReduce operation to query only 10% of your data: even with key filters you still have to scan all keys in memory.
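A rough sketch of what data-local indexes imply for a range query, in the same theorizing spirit. Everything here (the Partition and Cluster classes, 8 partitions, integer timestamps) is hypothetical and ignores replication; it is not Riak's API. Each partition keeps its own sorted index, so the lookup within a partition is fast, but the query still has to ask every partition because matching keys could be anywhere.

```python
import bisect
import hashlib


class Partition:
    """One partition with a local index over only the data it stores."""

    def __init__(self):
        self.index = []  # sorted list of (timestamp, key) pairs

    def put(self, key, timestamp):
        bisect.insort(self.index, (timestamp, key))

    def range_query(self, lo, hi):
        # Binary search within this partition's own index (integer timestamps).
        start = bisect.bisect_left(self.index, (lo, ""))
        end = bisect.bisect_left(self.index, (hi + 1, ""))
        return [key for _, key in self.index[start:end]]


class Cluster:
    """Hypothetical cluster that places keys on partitions by hash."""

    def __init__(self, n_partitions=8):
        self.partitions = [Partition() for _ in range(n_partitions)]

    def _owner(self, key):
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.partitions[int.from_bytes(digest, "big") % len(self.partitions)]

    def put(self, key, timestamp):
        self._owner(key).put(key, timestamp)

    def range_query(self, lo, hi):
        # Scatter-gather: no partition can be skipped, since each one
        # only knows about its own keys.
        contacted = 0
        results = []
        for partition in self.partitions:
            contacted += 1
            results.extend(partition.range_query(lo, hi))
        return sorted(results), contacted


if __name__ == "__main__":
    cluster = Cluster()
    for t in range(100):
        cluster.put(f"event-{t}", t)
    keys, contacted = cluster.range_query(10, 19)  # 10% of the data
    print(len(keys), "keys found,", contacted, "partitions contacted")
```

The index makes each partition's share of the work cheap, but the fan-out stays the same no matter how small the requested range is.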

TL;DR - indexes are hard, let's go shopping.

---
Jeremiah Peschka
Founder, Brent Ozar PLF, LLC

On Aug 4, 2011, at 8:50 AM, Les Mikesell wrote:

> On 8/4/2011 10:35 AM, Jeremiah Peschka wrote:
>> When I asked phark for such a document, he said:
>> 
>>    So, here is what we generally caution *against* using Riak for:
>> 
>>    1) It's not a graph database
>>    2) Time Series apps (it's doable but not optimal)
>>    3) Stuff that is analytics-heavy or requires a lot of ad hoc queries.
> 
> Will the upcoming index support make it suitable for time series data with range queries?
> 
> -- 
>  Les Mikesell
>   lesmikesell at gmail.com
> 




