Alternative to Post-Commit in EDS

Bogunov bogunov at gmail.com
Thu Apr 5 14:40:31 EDT 2012


Isn't Riak EDS replication based on Merkle trees? Can't Riak provide a
hook that triggers when a leaf node becomes synchronized? Then anybody
could just parse the synchronized binary and retrieve the keys from it.
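
To make the idea concrete, here is a minimal sketch of how comparing two
Merkle trees narrows a difference down to leaf segments and the keys in
them. It is not how Riak EDS actually implements replication; the function
names, the SHA-1 choice and the 8-leaf layout are assumptions made purely
for illustration.

```python
import hashlib


def _h(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def build_tree(store, num_leaves=8):
    """Return (levels, segments); levels[0] is the root, levels[-1] the leaves.

    num_leaves must be a power of two. Keys are assigned to a leaf segment
    by hashing the key, so both sides place a given key in the same leaf.
    """
    segments = [[] for _ in range(num_leaves)]
    for key in sorted(store):
        segments[int(_h(key.encode()), 16) % num_leaves].append((key, store[key]))
    level = [_h(repr(seg).encode()) for seg in segments]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes the concatenation of its two children.
        level = [_h((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
        levels.insert(0, level)
    return levels, segments


def differing_leaves(a_levels, b_levels):
    """Walk both trees from the root, descending only where hashes differ."""
    suspects = [0]
    for depth in range(len(a_levels)):
        suspects = [i for i in suspects
                    if a_levels[depth][i] != b_levels[depth][i]]
        if depth + 1 < len(a_levels):
            suspects = [c for i in suspects for c in (2 * i, 2 * i + 1)]
    return suspects


def keys_to_sync(source, sink, num_leaves=8):
    """Keys living in any leaf segment that differs between the two sides."""
    src_levels, src_segs = build_tree(source, num_leaves)
    dst_levels, dst_segs = build_tree(sink, num_leaves)
    keys = set()
    for leaf in differing_leaves(src_levels, dst_levels):
        keys.update(k for k, _ in src_segs[leaf])
        keys.update(k for k, _ in dst_segs[leaf])
    return keys
```

Note that the result is a superset of the changed keys: every key sharing a
leaf with a changed key is reported too, so a consumer would still have to
re-read those values.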

On Thu, Apr 5, 2012 at 1:37 AM, Anthony Molinaro <
anthonym at alumni.caltech.edu> wrote:

> Okay, so here's what I'm thinking now after reading through some of
> the M/R docs.  Suppose I did this.
>
> 1. Create 2 buckets
>   - one for K/V pairs
>   - one for changed keys keyed by a timestamp or bin or something
>     (run in post-commit on source colo).
> 2. Replicate both buckets to remote colo
> 3. Use a key filter with M/R to get keys changed from some time in the past
> 4. Run M/R regularly to publish key changes (probably to a rabbit queue)
> 5. Have local consumer read key changes then grab updated Values from first
> bucket
>
> I think this will all work.  I'm not totally sure about the key filtering,
> but it seems like a second bucket with time-based keys would work best.  I
> plan to serialize all writes to each bucket, as that is a requirement for
> auditing, so just having a single integer key with the time the entry was
> written will probably work, then a key filter with a simple greater than.
> I can even overlap times to pick up any late additions caused by backups in
> replication, since I only keep track of changed keys and always read
> the most current value.  I guess you could end up with the timestamp-based
> bucket replicating faster and thus data drift; hmm, that could be an issue.
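
For reference, a rough sketch of what such a key-filtered M/R job might look
like when built as JSON for Riak's HTTP /mapred endpoint. It assumes the keys
in the change bucket are integer timestamps stored as strings; the bucket name
"changes" and the function name are made up for the example. The overlap
argument widens the window to catch entries that replicated late.

```python
import json


def changed_keys_job(bucket, since_epoch, overlap_secs=0):
    """Build an M/R job listing keys whose integer value is > the cutoff.

    The key filter converts each key to an integer and keeps those strictly
    greater than (since_epoch - overlap_secs).
    """
    lower = since_epoch - overlap_secs
    return {
        "inputs": {
            "bucket": bucket,
            "key_filters": [["string_to_int"], ["greater_than", lower]],
        },
        # reduce_identity just passes the matching bucket/key pairs through.
        "query": [
            {"reduce": {"language": "erlang",
                        "module": "riak_kv_mapreduce",
                        "function": "reduce_identity"}}
        ],
    }


if __name__ == "__main__":
    # Hypothetical usage: everything written after the last run, with a
    # five minute overlap. The JSON body would be POSTed to /mapred.
    print(json.dumps(changed_keys_job("changes", 1333600000, 300), indent=2))
```

Because the same key can show up in two overlapping runs, the consumer has to
tolerate duplicates, which matches the "always read the most current" approach
described above.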
>
> Maybe a secondary index with time might work better.  I believe I need
> some sort of secondary index, as otherwise iterating over all the entries
> in a bucket would be costly.  I don't know exact numbers, but I would guess
> I'm looking at worst case several million K/V pairs per bucket, so maybe M/R
> on that isn't so bad.  Is there any speedup with 2i and a key filter?  (Can
> you even create a key filter based on 2i?)
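
For the 2i variant, a rough sketch of the two ways a time range could be
queried, assuming each object is written with an integer index holding its
write time. The index name modified_int, the bucket name and the host are all
placeholders. My understanding is that an index range and key filters are
separate kinds of M/R input rather than something you combine, and that 2i
needs a backend that supports it (LevelDB), so check both against the docs
for your version.

```python
from urllib.parse import quote


def index_range_url(base_url, bucket, index, start, end):
    """URL for an HTTP 2i range query over an integer index (start..end)."""
    return "%s/buckets/%s/index/%s/%d/%d" % (
        base_url.rstrip("/"),
        quote(bucket, safe=""),
        quote(index, safe=""),
        start,
        end,
    )


def index_range_job(bucket, index, start, end):
    """M/R job whose input is a 2i range instead of a whole bucket."""
    return {
        "inputs": {"bucket": bucket, "index": index,
                   "start": start, "end": end},
        "query": [
            {"reduce": {"language": "erlang",
                        "module": "riak_kv_mapreduce",
                        "function": "reduce_identity"}}
        ],
    }


if __name__ == "__main__":
    # Hypothetical usage: keys modified in a one hour window.
    print(index_range_url("http://localhost:8098", "kv", "modified_int",
                          1333596400, 1333600000))
```

With this layout the time lives in an index on the K/V bucket itself, so there
is no second bucket that could replicate ahead of the data.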
>
> Anyway, still searching for a way to do this efficiently,
>
> -Anthony
>
> On Wed, Apr 04, 2012 at 09:20:04AM -0700, Anthony Molinaro wrote:
> >
> > On Wed, Apr 04, 2012 at 08:10:29AM -0600, Jon Meredith wrote:
> > > Riak does have a last modified field, but it's last modified by client so
> > > is deliberately left untouched on replication. Similarly the vclock is not
> > > incremented either (the vclocks/siblings from both sides are resolved using
> > > the two vclocks).
> >
> > That's great, as I'd want to know on the far end when the client modified
> > it.
> >
> > > There are no obvious mechanisms for doing what you want currently.  I'll
> > > think about options and somebody will get back to you.
> >
> > Is it not possible to use the last modified field in a Map/Reduce?  I've
> > not actually played with M/R in Riak yet (as I've only ever used it
> > previously as a Key/Value store).  I'll try to dig into it a bit today
> > but I assumed I could do something to map over all records in a bucket
> > checking last modified, and return the set modified since a certain
> > time (or better yet put them in a rabbit queue to be consumed by my
> > systems which will cache the data).
> >
> > Alternatively, I could maybe have a second bucket representing the changed
> > keys, where each time a key is changed in the primary bucket, I could
> > add an entry to the other bucket.  I could then replicate that bucket
> > and just list keys on the remote side (maybe also deleting so subsequent
> > list keys only get changes, but then I think the replicator will replace
> > those keys, so I'd have to have some sort of bidirectional replication
> > for those buckets, sounds messy).
> >
> > Anyway, hopefully someone will have an idea,
> >
> > -Anthony
> >
> > --
> > ------------------------------------------------------------------------
> > Anthony Molinaro                           <anthonym at alumni.caltech.edu>
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users at lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



-- 
email: bogunov at gmail.com
skype: i.bogunov
phone: +7 903 131 8499
Regards, Bogunov Ilya