Getting all the Keys

Eric Moritz eric at themoritzfamily.com
Sat Jan 22 15:28:43 EST 2011


I have a pipe dream of doing a distibuted b-tree with a bucket who's value
is a node in the tree containing a list of keys to another bucket and left
and right links  to children nodes.

It feels right in my head, though it is probably horribly flawed in some
way. I'm certain more clever data nerds than me have thought of this and
dismissed it for some obvious reason that I'm too green to see.

Eric.
On Jan 22, 2011 2:46 PM, "Gary William Flake" <gary at flake.org> wrote:
> This is a really big pain point for me as well and -- at the risk of
> prematurely being overly critical of Riak's overall design -- I think it
> points to a major flaw of Riak in its current state.
>
> Let me explain....
>
> Riak is bad at enumerating keys. We know that. I am happy to manage a list
> of keys myself. Fine. How do I do that in Riak?
>
> Well, the obvious solution is to have a special object that you maintain
> that is a list of the keys that you need. So, each time you insert a new
> object, you effectively append a new key to the end of a list, that is
> itself a value to a special index key.
>
> But what is an append in Riak? The only way to implement a list append is
> to:
>
> 1. read in the entire value of your list object.
> 2. append to this list at the application layer.
> 3. reinsert the new value back into the list.
>
> This is a horrible solution for at least three reasons. First, inserting N
> new keys and maintaining your own list is now O(N*N) runtime complexity
> because each append has to do I/O proportional to the size of the entire
> list for each append. Second, this operation should be happening entirely
> at the data layer and not between the data and app layer. Third, it
> introduces write contentions in that two clients may try to append at
> approximately the same time, giving you a list that is now inconsistent.
>
> The conclusion for me is that you can't efficiently enumerate keys with
Riak
> even if you roll your own key index with Riak (in anything close to an
ideal
> way).
>
> To overcome this problem, Riak desperately needs to either maintain its
own
> key index efficiently, or it needs to support atomic mutations on values.
>
> For an example of the latter approach, see Redis which I think handles
this
> beautifully.
>
> In the end, you may need to think about redesigning your data model so
that
> there never is a need to enumerate keys. I am trying this and I use a
> combination of:
>
> 1. Standard KV approaches,
> 2. Riak search for being able to enumerate some records in order,
> 3. Transactions logs stored in a special bucket,
> 4. Batched M/R phases on the Transaction logs to avoid write contention,
and
> 5. Batched rebuilding of "views" in a view bucket.
>
> Given that Riak search is loudly proclaimed as being beta, this makes me
> fairly anxious.
>
> I am very close to not needing to enumerate keys the bad way now. However,
> I would have killed for an atomic mutator like Redis.
>
> BTW, I would love for someone from Basho to disabuse me of my conclusions
in
> this note.
>
> -- GWF
>
>
>
>
>
>
>
> On Sat, Jan 22, 2011 at 10:40 AM, Alexander Staubo <lists at purefiction.net
>wrote:
>
>> On Sat, Jan 22, 2011 at 19:34, Alexander Staubo <lists at purefiction.net>
>> wrote:
>> > On Sat, Jan 22, 2011 at 18:23, Thomas Burdick
>> > <tburdick at wrightwoodtech.com> wrote:
>> >> So really whats the solution to just having a list of like 50k keys
that
>> can
>> >> quickly be appended to without taking seconds to then retrieve later
on.
>> Or
>> >> is this just not a valid use case for riak at all? That would suck
cause
>> >> again, I really like the notion of an AP oriented database!
>> >
>> > I have been struggling with the same issue. You may want to look at
>> > Cassandra, which handles sequential key range traversal very well.
>> > Riak also has a problem with buckets sharing the same data storage
>> > (buckets are essentially just a way to namespace keys), so if you have
>> > two buckets and fill up one of them, then enumerating the keys of the
>> > empty bucket will take a long time even though it
>>
>> I accidentally "Send". Again: I have been struggling with the same
>> issue. You may want to look at Cassandra, which handles sequential key
>> range traversal very well. Riak also has a problem with buckets
>> sharing the same data storage (buckets are essentially just a way to
>> namespace keys), so if you have two buckets and fill up one of them,
>> then enumerating the keys of the empty bucket will take a long time
>> even though it's empty. Cassandra does not have a problem with this,
>> since Cassandra's keyspaces are separate data structures. I like Riak,
>> but it only works well with single-key/linked traversal, not this kind
>> of bucket-wide processing.
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20110122/89b83199/attachment.html>


More information about the riak-users mailing list