Problems writing objects to a half-full bucket

David Smith dizzyd at basho.com
Tue Mar 6 10:19:04 EST 2012


On Mon, Mar 5, 2012 at 9:55 PM, Marco Monteiro <marco at textovirtual.com> wrote:

> I'm using riak-js and the error I get is:
>
> { [Error: socket hang up] code: 'ECONNRESET' }

That is a strange error -- are there any corresponding errors in
the server logs? I would have expected a timeout or some such...

>
> UUIDs. They are created by Riak. All my queries use 2i. The 2i are integers
> (representing seconds) and random strings (length 16) used as identifiers
> for user sessions and similar.

So, this explains why the problem goes away when you switch to an
empty bucket. A bit of background...

If you're using the functionality in Riak that automatically generates
a UUID on PUT, you're going to get a uniformly distributed 160-bit
number (since the implementation SHA-1 hashes the input). This sort of
distribution is great for uniqueness, since there is (roughly) a 1 in
2^160 chance that any two IDs will collide. It can be
very bad from a caching perspective, however, if you have a cache that
uses pages of information for locality purposes. In a scheme such as
this (which is what LevelDB uses), the system will wind up churning
the cache constantly since the odds are quite low that the next UUID
to be accessed will be already in memory (remember, uniform
distribution of keys).
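
To make the churn concrete, here is a toy simulation. It is not LevelDB's
actual cache, just a plain LRU cache of pages, and every number in it
(page count, cache size, keys per page) is an assumption picked for
illustration. It compares accesses that land on uniformly random pages
with accesses that land on consecutive pages.

```python
import random
from collections import OrderedDict


def hit_rate(page_ids, cache_pages):
    """Replay page accesses through an LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for page in page_ids:
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used page
    return hits / len(page_ids)


# Illustrative sizes only; none of these come from LevelDB or Riak.
NUM_PAGES = 10_000
CACHE_PAGES = 100
KEYS_PER_PAGE = 100
ACCESSES = 50_000

rng = random.Random(42)

# Uniformly distributed keys: each access lands on a random page.
uniform_pages = [rng.randrange(NUM_PAGES) for _ in range(ACCESSES)]

# Keys with temporal locality: consecutive accesses share a page.
recent_pages = [i // KEYS_PER_PAGE for i in range(ACCESSES)]

uniform_rate = hit_rate(uniform_pages, CACHE_PAGES)
recent_rate = hit_rate(recent_pages, CACHE_PAGES)

print(f"uniform keys: {uniform_rate:.1%} cache hits")
print(f"recent keys:  {recent_rate:.1%} cache hits")
```

In this model the uniform case hits the cache only about as often as the
ratio of cached pages to total pages, while the consecutive case misses
only on the first access to each page.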

LevelDB also makes this pathological case a bit worse by not having
bloom filters -- when inserting a new UUID, you will potentially have
to do 7 disk seeks just to determine that the UUID is not present. The
Google team is working to address this problem, but I'm guessing it'll
be a month or so before that's done and then we have to integrate with
Riak -- so we can't count on that just yet.

Now, all is not lost. :)

If you craft your keys so that there is some temporal locality _and_
the access pattern of your keys has some sort of exponential-ish
decay, you can still get very good performance out of LevelDB. One
simple way to do this is to prepend the current date-time to the
UUID, like so:

201203060806-<uuid> (YMDhm-UUID)

You could also use seconds since the epoch, etc. This keeps recently
accessed (hot) UUIDs on or close to the same cache page, which avoids a
lot of cache churn and typically improves LevelDB performance
dramatically.
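
A minimal sketch of building such keys, assuming the client generates
the key itself instead of letting Riak assign one on PUT. The function
names and the choice of uuid4 are mine, not anything from Riak or
riak-js; any unique suffix would do.

```python
import uuid
from datetime import datetime, timezone


def make_time_prefixed_key(now=None, unique_id=None):
    """Return a key of the form YYYYMMDDhhmm-<uuid> (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    if unique_id is None:
        unique_id = uuid.uuid4().hex  # assumption: any unique suffix works
    return f"{now.strftime('%Y%m%d%H%M')}-{unique_id}"


def make_epoch_prefixed_key(now=None, unique_id=None):
    """Variant that uses whole seconds since the epoch as the prefix."""
    if now is None:
        now = datetime.now(timezone.utc)
    if unique_id is None:
        unique_id = uuid.uuid4().hex
    return f"{int(now.timestamp())}-{unique_id}"


# Keys created in the same minute share a prefix, so they sort next to
# each other and tend to land on the same or neighbouring pages.
example_time = datetime(2012, 3, 6, 8, 6, tzinfo=timezone.utc)
print(make_time_prefixed_key(example_time, "<uuid>"))
```

Using UTC for the prefix keeps the ordering consistent when several
clients in different time zones write to the same bucket.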

Does this help/make sense?

D.
-- 
Dave Smith
VP, Engineering
Basho Technologies, Inc.
dizzyd at basho.com



