Can my pagination approach scale?

Anton theatilla at
Tue Jan 22 09:46:13 EST 2013

You can get a rough idea of how well your approach will work with
basho_bench. Estimate roughly how big your pages will be, set up an
appropriate benchmark, and run it against the cluster or a staging
setup to see what performance you should expect.

I don't think there's anything fundamentally wrong with your approach.
In fact I'm working on a similar storage scheme and I'm fairly
comfortable with it. You can find examples of real-world applications
in The Yammer
presentation, linked here,
also has similar ideas; it's worth checking out.
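
To make the scheme from the quoted message below concrete, here is a
minimal Python sketch. It is only an illustration under stated
assumptions: a plain dict stands in for the "pages" bucket, siblings are
modeled as a list of values per key, and set union stands in for the
statebox merge. PAGE_SIZE, bucket, read_cursor, read_page, insert and
newest_first are hypothetical names, not part of any Riak client, and a
page is treated as full once it reaches PAGE_SIZE entries.

```python
# Sketch only: an in-memory dict stands in for the Riak "pages" bucket.
# Each key maps to a list of sibling values (as with allow_mult = true).
PAGE_SIZE = 3  # hypothetical page size

bucket = {}  # key -> list of sibling values


def read_cursor():
    # The cursor only grows, so conflicts resolve to the largest sibling.
    return max(bucket.get("cursor", [0]))


def read_page(n):
    # Pages are sets; siblings are merged by set union (statebox plays
    # this role in the quoted description).
    merged = set()
    for sibling in bucket.get(f"page_{n}", []):
        merged |= sibling
    return merged


def insert(doc_id):
    cursor = read_cursor()
    page = read_page(cursor)
    if len(page) >= PAGE_SIZE:
        # Current page is full: advance the cursor and start a new page.
        cursor += 1
        bucket["cursor"] = [cursor]
        page = read_page(cursor)
    bucket[f"page_{cursor}"] = [page | {doc_id}]


def newest_first():
    # Walk pages from the cursor backwards. Ids are assumed to increase
    # monotonically, so sorting within a page gives time order.
    for n in range(read_cursor(), -1, -1):
        yield sorted(read_page(n), reverse=True)
```

In a real cluster two writers can read the same cursor and page at the
same time, so a page can grow past PAGE_SIZE; the sketch does not model
that race.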

On 22 January 2013 14:56, Bach Le <thebullno1 at> wrote:
> Hi, I'm currently using Riak for my project. It works well for single
> documents; however, I often need to present to users a stream of (loosely)
> time-ordered documents. Riak's keys are unordered by nature, so there's no
> straightforward way of traversing data. I came up with the following
> approach:
> Make a bucket (e.g. "pages") and set allow_mult to true. Inside this bucket,
> store a number that points to the "current" page. This number is initialized
> to 0; I call it the cursor. For every "page" of data, create an object in
> the same bucket, e.g. the first page is stored under the key page_0, the
> second under page_1, etc. These page objects are sets modeled using statebox
> for conflict resolution.
> When a document is inserted, read the cursor value. Since the cursor can
> only increase, we resolve conflicts by choosing the largest value among
> the siblings. Next, read the page that it points to (if the cursor is 0, read
> the key "page_0"; if it is 1, read "page_1", etc.). If the number of objects
> inside this set exceeds the page size, increment the cursor and create a
> new page to insert the object into; otherwise, leave the cursor as it is and
> insert into this page.
> To retrieve data in reverse chronological order, read the cursor to find out
> the current page and then read the last page (which is shown to users as the
> first page).
> Currently, my document's ids are monotonically increasing using this:
> so I can sort documents within a page.
> I do realize that a page can exceed its size limit; however, I don't know how
> bad that can get with respect to the write rate. All I need is some form of
> bulk get and chunking without resorting to 2i, which can cover the whole
> cluster.
> So, is there any major problem with this approach? Thanks.