Basho Bench key/value generation strategies

Christian Dahlqvist christian at
Mon May 19 10:03:28 EDT 2014

Hi Simon,

If you have a limited set of object types and your data model allows you to
estimate the raw size of your data and express it with one of the default
value generators, then using one of the standard drivers, e.g.
basho_bench_driver_riakc_pb, is the easiest way to get a benchmark up and
running. For many scenarios where the data in Riak is reasonably uniform,
this gives a fairly accurate model and quickly lets you evaluate different
configurations and see how the cluster responds to different load and size
distributions. As Riak treats all values as binary blobs, it is usually not
necessary to create realistic objects. The only exception I am aware of is
when Riak Search is used and benchmarked, since the values are then indexed
by their content.
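
As a rough illustration of the size-based approach, a minimal config sketch
for the standard protocol buffers driver could look like the following. The
key space, duration, worker count, operation mix and node address are
placeholder assumptions rather than recommendations, and the value sizes
simply mirror the 1 - 10K range from the question.

```erlang
%% Minimal Basho Bench config sketch (all values are illustrative assumptions).
{mode, max}.                        % run each worker as fast as possible
{duration, 10}.                     % test length in minutes
{concurrent, 8}.                    % number of worker processes

{driver, basho_bench_driver_riakc_pb}.
{riakc_pb_ips, [{127,0,0,1}]}.      % assumed: a single local node

%% Keys: uniformly distributed integers over an assumed key space of 1,000,000.
{key_generator, {int_to_bin_bigendian, {uniform_int, 1000000}}}.

%% Values: random binaries between 1 KB and 10 KB, uniformly distributed.
%% {exponential_bin, MinSize, Mean} is an alternative if a skewed size
%% distribution is a better match for the real data.
{value_generator, {uniform_bin, 1024, 10240}}.

%% Assumed read-heavy mix: 4 gets for every update.
{operations, [{get, 4}, {update, 1}]}.
```

Changing only the value_generator line between runs is a quick way to see
how the cluster responds to different size distributions.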

The fact that Basho Bench writes to a single bucket is also usually not a
problem, as long as all your buckets have similar properties, because a
bucket is just a namespace in Riak.

If, however, your data model is more complex, e.g. it contains many different
object types in buckets with different backends and properties, or you
require more accurate benchmarks, I would recommend creating a custom
driver based on one of the existing ones.

Best regards,


On Mon, May 19, 2014 at 2:13 PM, Simon Hartley <
Simon.Hartley at> wrote:

>  Hi,
> I’m interested in using Basho bench to do some performance testing of a
> Riak cluster we are proposing to use for an upcoming project.
> I’m interested in best practice  when configuring the basho bench
> properties file, specifically the key and value generation strategies.
> Is it best to get an estimate of the raw sizes of your keys and values and
> use one of the standard binary generators configured to create keys /
> values of approximately those sizes, or is it better to create custom
> generators to create actually representative keys / values?
> e.g. if we are intending to store values as JSON objects between 1 – 10K
> is there any significant benefit to creating a generator to create sample
> JSON objects  vs. using, for example, an exponential_bin generator
> configured for 1 – 10K?
> Also, the default behaviour is to place all keys during the test into a
> single bucket. Obviously in practice we would be using many buckets; is
> there any significance with respect to performance testing to this
> difference?
> Many thanks for any assistance you can provide,
> Simon.