Question regarding backends

Bryan Fink bryan at basho.com
Wed Feb 17 09:13:35 EST 2010


Hi, Lev.  Ben has already given lots of good answers on this topic,
but there was just a little bit of clarification I wanted to add.

On Tue, Feb 16, 2010 at 5:56 AM, Lev Walkin <vlm at lionet.info> wrote:
> 2. The arch doc (http://riak.basho.com/arch.html) says that the backend
> needs to respond to "list keys" request, among others:
>        Each node may be configured with a different module for managing
> local storage. This module only needs to define "get", "put", "delete", and
> "list keys" functions that operate on binary blobs. The backend can consider
> these binaries completely opaque data, or examine them to make decisions
> about how best to store them.
>
> My question is whether it is efficient if a database has, say, several
> billion objects in it. It becomes unfeasible to allow the "list keys"
> operation to execute. Under which circumstances is this function invoked?

The "list keys" operation mentioned in that arch doc is actually not
related to the "list keys" that the client can request.  The "list
keys" that is relevant here is the one used to build a Merkle tree for
hinted handoff.

Indeed, you're right: for a partition with a very large number of keys
stored in it, such an operation could be extremely costly.  This is
why version 0.8 of Riak did away with building that Merkle tree for
handoff.

That portion of the document is now out of date: instead of "list
keys", all backends are now required to implement "fold".  Switching
handoff to fold solves the problem you brought up by allowing an
incremental progression across the backend's data, instead of building
a giant structure all at once.
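
As a rough illustration of the difference, here is a minimal sketch in
Python.  Riak's backends are Erlang modules, so none of this is Riak's
actual API: the DictBackend class, its method names, and the handoff
helper are all hypothetical.  It only shows why folding a function over
the stored objects avoids holding every key in memory at once.

```python
from typing import Callable, Dict, List, TypeVar

Acc = TypeVar("Acc")


class DictBackend:
    """Toy in-memory backend; hypothetical, not Riak's actual backend API."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def list_keys(self) -> List[bytes]:
        # Materializes every key at once: memory grows with partition size.
        return list(self._data.keys())

    def fold(self, fun: Callable[[bytes, bytes, Acc], Acc], acc: Acc) -> Acc:
        # Visits one object at a time; only the accumulator is carried along.
        for key, value in self._data.items():
            acc = fun(key, value, acc)
        return acc


def handoff(backend: DictBackend, send: Callable[[bytes, bytes], None]) -> int:
    """Stream each object to `send` and return the number of objects sent."""

    def visit(key: bytes, value: bytes, count: int) -> int:
        send(key, value)
        return count + 1

    return backend.fold(visit, 0)
```

In this sketch the handoff helper never asks for the full key list.  It
passes each object to the receiver as the fold reaches it, so the only
state it keeps is a running count.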

There is a separate "list keys" function that backends can implement
(actually called "list bucket"), which enables the client "list keys"
request.  However, if an application never uses the client "list keys"
request, then the backend chosen for that Riak cluster is not required
to implement it (in contrast to "fold", which is required for a Riak
cluster to work, in order to do hinted handoff).
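
The required/optional split can be pictured with another small Python
sketch.  Again, this is only an assumption-laden illustration and not
how Riak itself checks backends: FoldOnlyBackend and check_backend are
made-up names, and the real backends are Erlang modules.

```python
class FoldOnlyBackend:
    """Hypothetical backend that implements fold but not list_bucket."""

    def __init__(self, data):
        self._data = dict(data)

    def fold(self, fun, acc):
        # Apply fun to each stored object, threading the accumulator through.
        for key, value in self._data.items():
            acc = fun(key, value, acc)
        return acc


def check_backend(backend):
    """Return True if client "list keys" is available; raise if fold is missing."""
    if not callable(getattr(backend, "fold", None)):
        raise TypeError("backend must implement fold (needed for handoff)")
    # list_bucket is optional: without it only the client request is unavailable.
    return callable(getattr(backend, "list_bucket", None))
```

A backend like FoldOnlyBackend passes the check and can take part in
handoff.  It simply cannot serve the client "list keys" request.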

Hope that helps,
Bryan




More information about the riak-users mailing list