how to think?

etnt kruskakli at
Mon Sep 7 08:51:40 EDT 2009

Thanks for the reply! It has given me some food for thought.
There has lately been a lot of buzz around new DB implementations
using CAP-derived approaches, but I've seen nothing about how you
should migrate an existing DB (or even to structure a new non-naive 
DB from scratch). So I really appreciate your thoughts on this matter.

Cheers, Tobbe 

On Thu, 2009-09-03 at 23:29 -0400, Bryan Fink wrote:
> > On Fri, Aug 28, 2009 at 3:27 AM, etnt <etnt at> wrote:
> > Being used to Mnesia tables and transactions, I'm having trouble
> > getting my head around how I should structure my DB if I were
> > to migrate it to something like Riak.
> >
> > I guess other people must have been in the same situation, so it
> > would be great if anyone could share some thoughts on this.
>
> I'm doing the same translation on BeerRiot.  BR is currently running
> on Mnesia: one table for beers, one for breweries, one for users...  I
> haven't finished the translation yet, but so far I've kept the object
> divisions the same: one Riak bucket for beers, one for breweries...
>
> The biggest difference between Mnesia tables and Riak buckets I see
> (ignoring transactions for the moment) is that Mnesia provides a
> dedicated method for retrieving all objects in a table matching some
> specification, while directly translating the same query to Riak would
> require code that more explicitly examined each object in a bucket.
> I've adapted not by direct translation, but instead by converting the
> concepts described by many of my Mnesia match specs into "links" among
> Riak objects.
>
> For instance, finding "approved" Comments on a Beer involves asking
> Mnesia for something like #comment{beer_id=BeerId, state=approved}.
> In Riak, though, it works better to keep a list of Comment IDs in the
> Beers they relate to, tagged with their state, and ask for a mapreduce
> from {beer, BeerId} to {link, comment, approved, ...}.
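For comparison, the two sides of that query might be sketched as below. The Mnesia half uses real mnesia calls (with an illustrative record, not BeerRiot's actual schema); on the Riak side, the `mapred/2` call and the phase-spec tuples are assumptions about the native-client API of that era, so check them against your Riak version:

```erlang
%% Mnesia side: match all approved comments for one beer.
%% (Record fields are illustrative, not BeerRiot's real schema.)
-record(comment, {id, beer_id, state, text}).

approved_comments_mnesia(BeerId) ->
    {atomic, Comments} =
        mnesia:transaction(
          fun() ->
                  mnesia:match_object(
                    #comment{beer_id = BeerId, state = approved, _ = '_'})
          end),
    Comments.

%% Riak side (hypothetical client API): follow links tagged "approved",
%% then map each linked comment object to its stored value.
approved_comments_riak(Client, BeerId) ->
    Inputs = [{<<"beer">>, BeerId}],
    Query  = [{link, <<"comment">>, <<"approved">>, false},
              {map, {modfun, riak_kv_mapreduce, map_object_value},
               none, true}],
    {ok, Comments} = Client:mapred(Inputs, Query),
    Comments.
```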
>
> The starting points can take some work.  For some objects it's obvious
> - start at the object for the user making the request, and walk from
> there to get objects that user has touched.  But, for other queries,
> like "site home" (or "all beers" in BeerRiot's case), some additional
> tracking system will be needed (I'll be using a combination of
> well-known-named "index" objects and temporary ets tables on BR).
>
> (Aside: actually, I was doing some of this linking stuff in Mnesia
> already anyway.  It's quite slick to run a listcomp full of {Table,
> Id} tuples through an mnesia:read.  Taking that intermediate step can
> be a good way to experiment with existing Mnesia data.)
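That intermediate step — resolving a list of `{Table, Id}` links inside a single transaction — could look something like this (table names and the flat link list are illustrative):

```erlang
%% Follow a list of {Table, Id} "links" inside one Mnesia transaction.
%% mnesia:read/2 returns a (possibly empty) list of records, so the
%% second generator both flattens results and skips dangling links.
follow_links(Links) ->
    {atomic, Objects} =
        mnesia:transaction(
          fun() ->
                  [Obj || {Table, Id} <- Links,
                          Obj <- mnesia:read(Table, Id)]
          end),
    Objects.
```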
>
> Now, about transactions -- my shortest answer: forget about
> transactions, and start thinking about "merging changesets".
> The approach I'm taking in BeerRiot is to "just write the
> modification", without worrying about whether things have changed
> between now and when I last fetched the object.  The reason I can do
> this is Riak's vclocks: they allow Riak to compare two versions of an
> object and determine whether one "descended" from the other.  That is,
> whether one was a subsequent modification to the other.
>
> If I store an object that descends from one already stored, Riak
> simply drops the old version.  But, when Riak finds that I've stored
> two different versions of an object, and that neither is a descendant
> of the other, it keeps them both.  The next time I ask for that
> object, Riak will hand me both versions.  At this point, I can decide
> what the proper "merged" object should look like.
>
> Merging is an interesting problem.  It can be simple (random choice,
> latest timestamp, lower user-id preferred, etc.) or difficult (think
> reimplementation of darcs), or anything in between - the application
> gets to decide.  I've had success with storing a small bit of metadata
> about what modifications were made to "this" version of the object, in
> terms of set-field, add-list-element, and remove-list-element.  For
> example, if I see that v1a set 'name' to 'foo' and v1b set 'size' to
> 'large', then v1merge should have both name=foo and size=large
> (conflicting changes obviously have a different strategy).  For simple
> data structures, and in cases where the Riak cluster is not
> *seriously* degraded (lots of nodes down, extreme network partition,
> etc.), this should be plenty to deal with the occasional concurrent
> modification, in my use cases.
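One way to sketch that field-level merge idea (a reconstruction of the approach described above, not BeerRiot's actual code): record each modification as a small op, then replay both siblings' ops over the common ancestor, with conflicting set-field ops falling through to a simple tie-break:

```erlang
%% Sketch: merge two sibling versions by replaying their recorded ops
%% over the common ancestor.  Ops are {set_field, F, V},
%% {add_elem, F, E}, and {del_elem, F, E}; objects are proplists.
apply_op({set_field, F, V}, Obj) ->
    lists:keystore(F, 1, Obj, {F, V});
apply_op({add_elem, F, E}, Obj) ->
    {F, L} = lists:keyfind(F, 1, Obj),
    lists:keystore(F, 1, Obj, {F, [E | L]});
apply_op({del_elem, F, E}, Obj) ->
    {F, L} = lists:keyfind(F, 1, Obj),
    lists:keystore(F, 1, Obj, {F, lists:delete(E, L)}).

merge(Ancestor, OpsA, OpsB) ->
    %% Replaying B's ops after A's means B wins any conflicting
    %% set_field; a fuller merge would detect and resolve those
    %% conflicts explicitly.
    lists:foldl(fun apply_op/2, Ancestor, OpsA ++ OpsB).
```

So with ancestor `[{name, old}, {size, small}]`, ops `[{set_field, name, foo}]` from one sibling and `[{set_field, size, large}]` from the other, the merge yields an object with `name = foo` and `size = large`, matching the v1a/v1b example above.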
>
> By the way, all of this merge talk assumes that the bucket property
> 'allow_mult' has been set to 'true'.  If 'allow_mult' is 'false' (as
> it is by default), a last-write-wins merge scheme is imposed before
> the client is given the object.
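Enabling that is a per-bucket setting; with the native Erlang client it might look roughly like this (`riak:client_connect/1` and `set_bucket/2` here are assumptions about the 0.4-era client API):

```erlang
%% Hypothetical sketch: enable siblings on the comment bucket so
%% concurrent writes are kept for the application to merge.
{ok, Client} = riak:client_connect('riak@127.0.0.1'),
ok = Client:set_bucket(<<"comment">>, [{allow_mult, true}]).
```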
>
> To recap: my advice on "how to think" is to focus on the links among
> objects, and to devise a system for merging changesets.  (Imagine a
> big disclaimer here: this is only advice distilled from my one Mnesia
> -> Riak translation.  I'm absolutely positive I haven't covered all
> bases, and that there are many better solutions than what I've
> proposed. ;)
>
> -Bryan
>
> (P.S. I know we plan on discussing some strategies we've used at Basho
> in this thread as well ... now that we can all take a deep breath
> after releasing 0.4.)
> _______________________________________________
> riak-users mailing list
> riak-users at
