Bitcask & Innostore Benchmark @ EC2

James Sadler freshtonic at gmail.com
Sun Jul 11 23:52:13 EDT 2010


Hi All,

I've been benchmarking Riak using basho_bench on an EC2 m1.large
instance, and also locally on my iMac inside VirtualBox, to assess
the performance of the Bitcask and Innostore backends.

The test configuration looks like this:

{mode, max}.
{duration, 10}.
{concurrent, 50}.
{driver, basho_bench_driver_http_raw}.
{code_paths, ["deps/stats",
              "deps/ibrowse"]}.
%% a composite key composed of 4 IDs, each of which is a 16 char hex string,
%% specific to our data model.
{key_generator, {random_dynamo_style_string, 35000}}.
{value_generator, {fixed_bin, 1000}}.
{operations, [{get, 1}, {update, 2}]}.
{http_raw_ips, ["127.0.0.1"]}.
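
For reference, basho_bench treats the numbers in the operations tuple
as relative weights, so the mix above is one get for every two
updates.  A quick sketch of that arithmetic in Python (the function
name operation_mix is made up for illustration and is not part of
basho_bench):

```python
from fractions import Fraction

def operation_mix(operations):
    """Turn basho_bench-style (name, weight) pairs into request fractions."""
    total = sum(weight for _, weight in operations)
    return {name: Fraction(weight, total) for name, weight in operations}

# Same weights as the {operations, ...} line in the config above.
mix = operation_mix([("get", 1), ("update", 2)])
for name, share in mix.items():
    print(f"{name}: {share} of requests")
```

In other words, roughly a third of the requests are gets and two
thirds are updates.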

Also, to generate the 'random_dynamo_style_string', I made the
following changes to the basho_bench source:

diff --git a/src/basho_bench_keygen.erl b/src/basho_bench_keygen.erl
index 4849bbe..639e90b 100644
--- a/src/basho_bench_keygen.erl
+++ b/src/basho_bench_keygen.erl
@@ -54,6 +54,13 @@ new({pareto_int, MaxKey}, _Id) ->
 new({pareto_int_bin, MaxKey}, _Id) ->
     Pareto = pareto(trunc(MaxKey * 0.2), ?PARETO_SHAPE),
     fun() -> <<(Pareto()):32/native>> end;
+new({random_dynamo_style_string, MaxKey}, _Id) ->
+    fun() -> lists:concat([
+                    get_random_string(16, "0123456789abcdef"), "-",
+                    get_random_string(16, "0123456789abcdef"), "-",
+                    get_random_string(16, "0123456789abcdef"), "-",
+                    get_random_string(16, "0123456789abcdef")])
+    end;
 new(Other, _Id) ->
     ?FAIL_MSG("Unsupported key generator requested: ~p\n", [Other]).

@@ -74,10 +81,17 @@ dimension({pareto_int, _}) ->
     0.0;
 dimension({pareto_int_bin, _}) ->
     0.0;
+dimension({random_dynamo_style_string, MaxKey}) ->
+    0.0;
 dimension(Other) ->
     ?FAIL_MSG("Unsupported key generator dimension requested: ~p\n", [Other]).

-
+get_random_string(Length, AllowedChars) ->
+    lists:foldl(fun(_, Acc) ->
+                        [lists:nth(random:uniform(length(AllowedChars)),
+                                   AllowedChars)]
+                            ++ Acc
+                end, [], lists:seq(1, Length)).


 %% ====================================================================
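
To make the resulting key shape easier to see, here is a minimal
Python sketch of the same idea as the patch above: four 16-character
lowercase hex strings joined by '-'.  The function names are only
illustrative, and the MaxKey argument is not modelled because the
patch accepts it but never uses it.

```python
import random

HEX_CHARS = "0123456789abcdef"

def random_hex_string(length, rng=random):
    """Return `length` characters drawn uniformly from HEX_CHARS."""
    return "".join(rng.choice(HEX_CHARS) for _ in range(length))

def random_dynamo_style_key(rng=random):
    """Join four 16-char hex IDs with '-' separators."""
    return "-".join(random_hex_string(16, rng) for _ in range(4))

key = random_dynamo_style_key()
print(key, len(key))
```

Each key comes out at 4 * 16 + 3 = 67 characters, which is the key
size referred to later in this message.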


So far I've only been running the benchmark against a 'cluster' of a
single Riak node, and I have benchmarked the bitcask and innostore
backends on the latest versions of Riak (0.11.0-1344) and innostore
(1.0.0-88) on Ubuntu Lucid.  I have also been running basho_bench on
the same host as the Riak node.

The benchmarks are showing very high get and update latencies in the
95th percentile and beyond when using Innostore as the backend.
Bitcask performance is much better.

While running the benchmarks, I had an iostat process reporting IO
every 1 second.  It clearly showed heavy writes during the benchmark,
but practically zero reads.  I expect that this was because of the
disk cache.  What I found very surprising was that the latencies for
innostore gets during the benchmark were very high, even though the
disk was not being hit for reads at all.  This was reproducible on
both EC2 and on a local VM on my iMac.

Observations:

## Innostore backend

- Latencies for innostore are high across the board, even for reads
__when iostat is reporting that no reads are hitting the disk__.

- 95th percentile read/write latencies are up to 1500/2000 ms.

- 99th percentile read/write latencies are up to 2000/4000 ms.

- The difference between update and get latency is small.

- Some updates/gets failed (timeouts) according to the log file
produced by basho_bench.

- Throughput with the innostore backend is 150-200 req/sec.

- Mounting the filesystem with noatime doesn't seem to make much of a
difference.

- Getting values from disk cache has huge latency (iostat reporting no
reads on the device).  This is somewhat bizarre.

## Bitcask backend

- There are zero errors in the log produced by basho_bench (no
timeouts like with innostore).

- Throughput is much higher: 620 req/sec, roughly three to four times
the innostore figure of 150-200 req/sec.

- The 99.9th percentile latencies are 200ms for writes and 100ms for reads

- The 95th percentile latencies are 160ms for writes and 60ms for reads

- Mean & median latencies are 115ms for writes and 20ms for reads.

Summary charts from basho_bench are attached.


NOTE:

I haven't included benchmark results from my own local VM.  FWIW, I
observed approximately the same characteristics in the EC2 and local
benchmarks.

In summary, it looks like there are significant performance issues
with Innostore in terms of throughput and latency.  Latency is the
biggest issue for our ad serving product at Lexer, so it looks like
we'll be using Bitcask in production.

Hopefully these results will be useful to others.

Also, given our large key size of 67 characters, combined with
Bitcask's padding and storage layout, how many keys should we be able
to manage per node per GB?
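
As a starting point for that question, here is a rough sketch of the
arithmetic, assuming the limit is the memory Bitcask needs to hold
every key in its in-memory key directory.  The per-key overhead is
exactly the unknown being asked about: the value 40 below is purely a
placeholder, not a measured or documented figure, and the function
name keys_per_gib is made up for illustration.

```python
KEY_LENGTH = 16 * 4 + 3  # four 16-char hex IDs joined by three '-' separators

def keys_per_gib(key_bytes, overhead_bytes_per_key):
    """Estimate how many keys fit in 1 GiB of RAM.

    overhead_bytes_per_key is an assumption supplied by the caller;
    it stands in for Bitcask's per-entry bookkeeping and padding.
    """
    bytes_per_key = key_bytes + overhead_bytes_per_key
    return (1024 ** 3) // bytes_per_key

PLACEHOLDER_OVERHEAD = 40  # hypothetical value; replace with the real figure
print(keys_per_gib(KEY_LENGTH, PLACEHOLDER_OVERHEAD))
```

Anything else stored alongside each key (for example the bucket name)
would need to be added to key_bytes before trusting the result.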

Looking forward to any comments.

Thanks.

-- 
James
-------------- next part --------------
A non-text attachment was scrubbed...
Name: summary_bitcask_ec2.png
Type: image/png
Size: 58240 bytes
Desc: not available
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20100712/f9f39066/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: summary_innostore_ec2.png
Type: image/png
Size: 63632 bytes
Desc: not available
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20100712/f9f39066/attachment-0001.png>

