Limiting the number of entries in a set

Russell Brown russell.brown at me.com
Sat Jan 31 14:47:49 EST 2015


Hi Shawn,

If I can get in IRC later I will.

On 31 Jan 2015, at 18:11, Shawn Debnath <shawn at debnath.net> wrote:

> Good morning,
> 
> Wondering if there is a better way to do truncation of a set if entries exceed a certain threshold. I am trying to create a canonical timeline for every user in our system, set of UUID strings. We want the number of entries to be limited to about 1000 per timeline to keep the size of each KV object in check inside Riak and also not store unnecessary data. From looking around code and docs, there doesn’t seem to be a Redis style LTRIM option available. So is the only way to enforce this is to fetch the set, check count, trim it locally and then update?

At the moment, yes that is correct: no LTRIM, you need to fetch and send back a {remove_all, [elements, to, remove]} type operation (that was pseudo code.)

> 
> My fear is that by using CRDTs, a set add wins over remove in the case of a conflict. If there were to be 1000 conflicting concurrent updates each trying to trim and add one entry to a set that already contains 1000 entries, I imagine the  final count of the set would be 2000. This can be greatly exasperated on a very “friendly” or popular person’s timeline and the code never being able to truncate the set back to 1000.

That is not how the Set that we ship with riak 2.0 works. You don’t fetch the object, mutate it, and store it. Instead you fetch a context and the values. You then send operations to riak. If concurrent actors fetch the set and remove the same, or some intersection, or even disjoint set of elements, whilst adding others, all those operations are executed. The “add wins” applies only to a per-element conflict: client A adds element X while client B removes element X. In this case, add wins and X is in the set. However, you are correct, if 1000 actors concurrently remove every element from a set with 1000 elements, and each add there own unique element, the set will have 1k entries again. There is no way, at the moment in Riak, to enforce an invariant, or bound, on the size of the set.

There is some recent academic work by Valter Balegas, and Nuno Pregucia, (and I am sure others), on this subject, where they created bounded CRDTs, but I can’t find the reference at the moment. It’s something we’d like in Riak, in future.

If you want a strong invariant on the size of a set, you could use strong consistency in Riak, maybe?

Russell

> 
> Any thoughts or suggestions appreciated.
> 
> Thanks,
> 
> Shawn
> freenode (sde)
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com





More information about the riak-users mailing list