First sketch of a benchmark DSL for riak

Martin Scholl ms at diskware.net
Wed Oct 7 05:20:25 EDT 2009


Hello all,


To better estimate the performance of riak and to see how well it
scales, I've implemented a tiny DSL for running distributed load tests.
The code is available in the branch "benchmark" on Bitbucket:
	http://bitbucket.org/zeitgeist/riak/changeset/f6f7bca823dd/


The "DSL" is expressed in Erlang terms. Typically, tests are written in
files, which are then read with file:consult/1 and executed by the module
"riak_benchmark".
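
To illustrate the loading step, here is a minimal sketch of reading a
benchmark file with file:consult/1. It is not the code from the
"benchmark" branch: the module name bench_load_sketch and the helper
is_test/1 are assumptions made for this example only.

```erlang
-module(bench_load_sketch).
-export([load/1]).

%% Hypothetical helper, not part of riak_benchmark.
%% Reads all terms from Path and keeps those shaped like DSL expressions.
load(Path) ->
    case file:consult(Path) of
        {ok, Terms} ->
            {ok, [T || T <- Terms, is_test(T)]};
        {error, Reason} ->
            {error, Reason}
    end.

%% Accept only the three top-level forms used in the examples in this mail.
is_test({benchmark, Node, _Tests}) when is_atom(Node) -> true;
is_test({pmap, Tests}) when is_list(Tests) -> true;
is_test({at, Node, Tests}) when is_atom(Node), is_list(Tests) -> true;
is_test(_) -> false.
```

Because file:consult/1 expects every term to end with a period, each
test in a benchmark file has to be terminated with "." as in the
examples.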

To keep it short, some examples:

{benchmark, 'riak@10.100.4.2',
 [{wr, 1000, 100}]
}.

This means: "benchmark the riak node riak@10.100.4.2 using the benchmark
method wr". The method "{wr, N, M}" is a write-read test: N writes are
followed by M reads. There is another method, "{rw, N, M}", which does
the same in the opposite order: N reads first, then M writes.
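
As a rough illustration of these two methods, the sketch below times N
operations followed by M operations. It is not the riak_benchmark
implementation: the module wr_sketch is hypothetical, and the Put and
Get funs are stand-ins for real riak client calls (the demo uses an ETS
table instead of a riak node).

```erlang
-module(wr_sketch).
-export([run/3, demo/0]).

%% run(Method, Put, Get): Put and Get are caller-supplied funs of arity 1
%% that take an integer key. Times are reported in microseconds.
run({wr, N, M}, Put, Get) ->
    {WriteUs, ok} = timer:tc(fun() -> repeat(N, Put) end),
    {ReadUs, ok} = timer:tc(fun() -> repeat(M, Get) end),
    [{writes, N, WriteUs}, {reads, M, ReadUs}];
run({rw, N, M}, Put, Get) ->
    {ReadUs, ok} = timer:tc(fun() -> repeat(N, Get) end),
    {WriteUs, ok} = timer:tc(fun() -> repeat(M, Put) end),
    [{reads, N, ReadUs}, {writes, M, WriteUs}].

%% Call F with keys K, K-1, ..., 1.
repeat(0, _F) -> ok;
repeat(K, F) when K > 0 ->
    F(K),
    repeat(K - 1, F).

%% Demo against an in-memory ETS table rather than riak.
demo() ->
    Tab = ets:new(store, [set, public]),
    Put = fun(K) -> ets:insert(Tab, {K, <<"value">>}) end,
    Get = fun(K) -> ets:lookup(Tab, K) end,
    run({wr, 1000, 100}, Put, Get).
```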

Another example (see benchmarks/examples/pmap_dist.benchmark):

{benchmark, 'riak@10.100.4.2',
 {pmap,
  [
   {at, 'bm@10.100.4.7', [{wr, 10000}]},
   {at, 'bm@10.100.4.8', [{wr, 10000}]},
   {at, 'bm@10.100.4.9', [{wr, 10000}]}
  ]
 }
}.

This means: benchmark the riak node 'riak@10.100.4.2' and execute the
listed tests in parallel. Each subtest runs on a different node
('bm@10.100.4.7' and so on).

The DSL is recursive, that is, you could also write this (warning:
untested, it probably contains syntax errors):

{pmap,
 [{at, 'bm@10.100.4.7', [{benchmark, 'riak@10.100.4.7', {wr, 10000}}]},
  {at, 'bm@10.100.4.8', [{benchmark, 'riak@10.100.4.8', {wr, 10000}}]}
 ]
}.

which executes the tests in parallel, each connecting to its local node.
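
To make the recursion concrete, here is a minimal sketch of how such
nested expressions could be evaluated. It is an assumption about the
semantics, not the riak_benchmark code: the module dsl_eval_sketch is
hypothetical, and Leaf is a caller-supplied fun(Target, Method) that
runs one leaf test such as {wr, 10000} against Target. For {at, Node,
...} to work, the module (and the module defining Leaf) must already be
loaded on Node.

```erlang
-module(dsl_eval_sketch).
-export([eval/2, eval/3]).

%% Entry point: no target riak node is known yet.
eval(Expr, Leaf) ->
    eval(Expr, undefined, Leaf).

%% {benchmark, Target, Body}: Body is evaluated against Target.
eval({benchmark, Target, Body}, _OldTarget, Leaf) ->
    eval(Body, Target, Leaf);
%% {pmap, Tests}: evaluate all subtests in parallel, keep result order.
eval({pmap, Tests}, Target, Leaf) ->
    pmap(fun(T) -> eval(T, Target, Leaf) end, Tests);
%% {at, Node, Body}: evaluate Body on another Erlang node.
%% On failure rpc:call/4 returns {badrpc, Reason}, which is passed through.
eval({at, Node, Body}, Target, Leaf) ->
    rpc:call(Node, ?MODULE, eval, [Body, Target, Leaf]);
%% A list of tests is evaluated sequentially.
eval(Tests, Target, Leaf) when is_list(Tests) ->
    [eval(T, Target, Leaf) || T <- Tests];
%% Anything else is treated as a leaf method, e.g. {wr, N, M}.
eval(Method, Target, Leaf) when is_tuple(Method) ->
    Leaf(Target, Method).

%% Simple parallel map: one linked process per element.
pmap(F, Xs) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- Xs],
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

In this sketch the target set by an enclosing {benchmark, ...} carries
through pmap and at, which is why both nesting orders shown in this
mail evaluate the same way.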

Or do something like this to use 2 nodes and execute 4 write tests in
parallel:

{pmap,
 [{at, 'bm@10.100.4.6', [
	{pmap,[
		{benchmark, 'riak@10.100.4.7', {wr, 10000}},
		{benchmark, 'riak@10.100.4.8', {wr, 10000}}
	 ]}]},
  {at, 'bm@10.100.4.9', [
	{pmap,[
		{benchmark, 'riak@10.100.4.7', {wr, 10000}},
		{benchmark, 'riak@10.100.4.8', {wr, 10000}}
	 ]}]}
 ]
}.

(Again, this example is untested and probably contains syntax errors.)


A lot of things are missing:
- output more than just to /dev/stdout, e.g.:
    - JUnit-compatible XML for doing automatic analysis
      on the benchmark / tests' results (-> Hudson-Integration)
    - output CSV, etc. for doing stat. analysis on the results
- automatically launch slaves via module slave
- automatically distribute the code of the benchmark module (and
  the modules it depends on) instead of manual distribution
- write proper documentation
- add an escript to launch tests from the command-line
- ...
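
For the CSV item, a minimal sketch of what such an output step could
look like. The module csv_sketch and the row shape {Label, Count,
Microseconds} are assumptions made for this example; the current code
only prints to stdout.

```erlang
-module(csv_sketch).
-export([to_csv/1]).

%% Rows is a list of {Label, Count, Microseconds} tuples (hypothetical
%% result shape). Returns a flat string with one header and one line per row.
to_csv(Rows) ->
    Header = "test,operations,microseconds\n",
    Lines = [io_lib:format("~s,~B,~B~n", [atom_to_list(Label), Count, Us])
             || {Label, Count, Us} <- Rows],
    lists:flatten([Header | Lines]).
```

The result can be written with file:write_file/2 and loaded into a
spreadsheet or a statistics package.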

The idea is to extend the DSL with more and richer tests and to have a
test harness with proper distributed semantics and testing methods.

Furthermore, an automatic deployment and testing method would be great
to have, e.g. something like this:
./start-dist-test <test directory> <nodename 1> <nodename 2> [...]

which deploys and boots a fresh riak cluster on the nodes [<nodename 1>
...], and then executes the tests in <test directory>.

We do something like this here and have had great results.


It would be great to see the DSL integrated into riak.

HIH,
Martin



