2i for single-result lookup
nate at root.org
Mon Nov 7 23:31:15 EST 2011
I appreciate you sharing your design. But I can never understand why people go to such great lengths to add transactions to an eventually-consistent db.
You could possibly solve the same problem by using memcached or Redis. Example:
1. Look up the user in Riak. If that lookup succeeds, abort.
2. Store a copy of the new username via memcached add() with a generous expiry interval.
3. If it exists already (atomic in memcached), you lost a race with another user and error out. If not, also store the username in Riak.
4. The memcached key will expire at some point after Riak becomes consistent, keeping your memory use reasonable.
Is user creation such a high percentage of your transaction volume that you have to build something almost as complicated as a VCS on top of it in order to scale? For robustness, isn't it simpler to just disable user creation if your memcached single-point-of-failure is down and you weren't able to failover to the spare?
I'm not criticizing you since you may have some unique need I overlooked. But would the above work for you?
On Nov 7, 2011, at 7:16 PM, Justin Karneges wrote:
> Hmm, but if the username (or email) must be unique then I think this problem
> may be more than just indexing. There's also an id reservation issue.
> At least in my case, I did not think 2i would work for me, because I needed
> one of the indexes (email) to be unique and reservable on a first-come basis,
> yet I wanted people to be able to change email addresses. The resulting
> algorithm is a bit disgusting but it essentially works like this:
> 1) All keys are tagged with extra metadata:
> a) creation date (Riak only has modified date)
> b) random "value" id (I'm using a UUID for this). this is meant to indicate
> a particular generation of a key's value. updates to a key should retain this
> value, but overwrites should have new ids.
> 2) User keys (where the indexes point) must contain extra metadata: a set of
> index references (index creation date / value id pairs).
> The idea with #1 and #2 here is that indexes point to users and users point to
> indexes. Notably, users point to specific index key *versions* thanks to the
> extra metadata. This aids in conflict resolution (more on that later).
> 3) If you want to claim an index, then you first checks to see if a key already
> exists with that name. If it does then you can't claim ("account already
> exists"). If a key with that name does not exist then you make the index key
> with a link pointing to the user data key. Both the index key and user keys
> need to have creation date and value id metadata. Additionally, the user key
> needs to add a reference to this index into its set of references. Of course
> a conflict can occur if two clients try to claim indexes simultaneously.
> That's fine, you'll end up with two values (siblings) at the same index with
> different creation dates and value ids, and conflict resolution will deal with
> 4) When retrieving a user, you must confirm the relationship between the index
> and the object. Essentially a valid keypair is the earliest dated index
> sibling that points to a user key with the latest dated index reference that
> points back to it. Determining this validity requires retrieving both keys,
> regardless of which one you start from, and in certain conflict cases may
> require crawling through to other keys. If you cannot verify the relationship
> between index and object then you must treat this case as if neither exist.
> What does this all mean?
> You can create users, unique by some value, on a first come basis. If two
> clients try to create accounts with the same name at the same time, then an
> arbitrary one will win. If two existing users try to change to the same name,
> an arbitrary one will win, and the loser *will not lose his original
> reservation*. I mapped out some pretty awful race conditions like if three
> existing users (A, B, and C) exist and A and B suddenly try to change to the
> same name at the same moment C tries to change to B's original name, and B
> loses to A, then B is guaranteed to get his index back and C's claim becomes
> Oh, and finally, whichever index key is valid becomes the authoritative value
> of that data field. For example, in my case I'm indexing by email. This means
> if I want to look up a user's email address based on the direct user ID, I
> first look up the user object key directly, then I determine its valid index
> (needed to be done anyway just to confirm the user's validity to exist) and
> then use *that* index value as the user's email address. You would *not* use
> some email field value in the JSON blob. In fact, better to not have your
> index field in the JSON blob at all since conflict resolution won't be able to
> fix it.
> OH, and one last thing. There was concern about writing a user requiring two
> writes, opening up the potential of leaving behind orphan keys in the event of
> failure. I've started to come to the conclusion that this is acceptable, and
> you just need to sweep it up later. The above-described scheme will work fine
> even if you leave dirty keys all over the place and never resolve siblings.
> You'll never have half of a user or some dangling index or anything. The
> validity checks at read-time ensure this. But, some periodically run task
> that cleans up your DB with MapReduce operations would be smart.
> Maybe there's a better way to do this, but I thought I'd share.
> On Monday, November 07, 2011 06:19:48 PM Nate Lawson wrote:
>> Ok, then 2I will work fine for what you're doing if you're going to keep
>> adding fields like that. You can just use the binary type for the 2I keys
>> You first said you were concerned about the query performance due to the
>> covering set access and that you only had 2 unique fields. That's why I
>> suggested you create a manual index (either email or userid) to lookup the
>> other key. But with this additional info you mention, I think you should
>> just use 2I since that's one of its main purposes.
>> On Nov 7, 2011, at 6:11 PM, Greg Pascale wrote:
>>> Hi Nate,
>>> There are only 2 secondary keys for now (in addition to the primary key),
>>> but this number will grow to 5 or more pretty soon.
>>> I think when you say "insert each separately", you mean create 2
>>> duplicate objects, one keyed by username and one keyed by email. Or do
>>> you mean create one object keyed by username, and then another object
>>> containing the username and keyed by email (a manual index if you will)?
>>> Code complexity is the main reason I'd like to avoid a solution like
>>> this. Suddenly a user create operation requires n writes to be
>>> considered a success. If one fails, I need to delete the others, etc… It
>>> quickly becomes a pain.
>>> I don't know what you mean by "some relationship between the keys"
>>>> On Nov 7, 2011, at 5:45 PM, Greg Pascale wrote:
>>>>> I'm thinking about using 2i for a certain piece of my system, but I'm
>>>>> worried that the document-based partitioning may make it suboptimal.
>>>>> The issue is that the secondary fields I want to query over (email and
>>>>> username) are unique, so each will only ever map to one value. Since
>>>>> 2i queries a coverage set, but I'm only ever looking for one result,
>>>>> it's going to be hitting n-1 machines needlessly.
>>>>> So, what I'm looking to understand is how much overhead a single-result
>>>>> 2i lookup like this will incur vs. a primary-key lookup, or even vs.
>>>>> search. Search doesn't intuitively feel like the right tool here, but
>>>>> I wonder if it may actually be preferable since it uses term-based
>>>> If it's only 2 keys, why not insert each separately? You will double
>>>> your total number of keys in the db. But both search and 2I are
>>>> creating extra keys anyway in their private indices, so it has the same
>>>> or worse effect on total storage as doubling your primary keys. And
>>>> query efficiency is worse, as you point out.
>>>> 2I and search are more useful where there's some relationship between
>>>> the keys, not when they're fully independent as you point out.
More information about the riak-users