riak + innostore

Lev Walkin vlm at lionet.info
Fri Feb 19 04:03:47 EST 2010


Hi,

We've found a performance problem with Riak 0.8 and Innostore,  
particularly on Amazon EC2 Small instances (10 nodes, n/w=3).

First, we noticed that innostore was slow to accept data:

> timer:tc(innostore_riak, put, [S1, {<<"siden">>, <<"key">>},  
> <<"value">>]).
> {8995645,ok}
> timer:tc(innostore_riak, put, [S1, {<<"siden">>, <<"key">>},  
> <<"value">>]).
> {4834159,ok}

Debugging showed that port_control was a culprit. Changing it to
port_command (diff attached: innostore.diff,
<http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20100219/4e032cda/attachment.diff>)
improved things considerably:

> timer:tc(innostore_riak,put,[S1, {<<"siden">>, <<"key1">>},  
> <<"value">>]).
> {13899,ok}
> timer:tc(innostore_riak,get,[S1, {<<"siden">>, <<"key1">>}]).
> {86,{ok,<<"value">>}}
> timer:tc(innostore_riak,get,[S1, {<<"siden">>, <<"key">>}]).
> {90,{ok,<<"value">>}}
> timer:tc(innostore_riak,delete,[S1, {<<"siden">>, <<"key">>}]).
> {38700,ok}
> timer:tc(innostore_riak,delete,[S1, {<<"siden">>, <<"key1">>}]).
> {7299,ok}
> timer:tc(innostore_riak,get,[S1, {<<"siden">>, <<"key">>}]).
> {114,{error,notfound}}


After applying the patch, we found an order of magnitude difference
between calling innostore directly and using it as a Riak backend.

Comparison of raw innostore vs Riak running on the innostore backend.
Three types of requests were made: put, get and delete.
Each type was invoked 10000 times.

> {ok, S1} = innostore_riak:start(0, undefined).
> F = fun(0,_,_) -> ok; (N, F1, F2) -> F1(N), F2(N-1, F1, F2) end.
> C = term_to_binary(lists:duplicate(1000, $a)).
> FP = fun(N) -> innostore_riak:put(S1,{<<"siden">>,  
> term_to_binary(N)},C) end.
> FP2 = fun(N) -> innostore_riak:get(S1,{<<"siden">>,  
> term_to_binary(N)}) end.
> FP3 = fun(N) -> innostore_riak:delete(S1,{<<"siden">>,  
> term_to_binary(N)}) end.
>
> {ok, Cl} = riak:client_connect('riak@127.0.0.1').
> FP4 = fun(N) -> Cl:put(riak_object:new(<<"siden">>,  
> term_to_binary(N), C),2) end.
> FP5 = fun(N) -> Cl:get(<<"siden">>, term_to_binary(N), 2) end.
> FP6 = fun(N) -> Cl:delete(<<"siden">>, term_to_binary(N), 2) end.
>
> ------- Direct to innostore -------
> -- PUT
> io:format("~p~n",[now()]),F(10000,FP,F),io:format("~p~n",[now()]).
> {1266,508427,932468}
> {1266,508431,242659}
> -- GET
> io:format("~p~n",[now()]),F(10000,FP2,F),io:format("~p~n",[now()]).
> {1266,508515,212371}
> {1266,508516,330781}
> -- DELETE
> io:format("~p~n",[now()]),F(10000,FP3,F),io:format("~p~n",[now()]).
> {1266,508533,218732}
> {1266,508535,38505}

As you can see, each run takes 1.1 to 3.3 seconds per 10k invocations,
i.e. on the order of 2 seconds (about 5000 requests per second).
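
The loop-and-timestamp pattern used in the shell session above can also be wrapped with timer:tc/3, which returns the elapsed microseconds directly. The following is only a minimal sketch: the module name bench_loop and the functions repeat/2 and run/2 are hypothetical names chosen for illustration and are not part of Riak or innostore.

```erlang
-module(bench_loop).
-export([repeat/2, run/2]).

%% Hypothetical helper: call Fun(N), Fun(N-1), ..., Fun(1), then return ok.
%% This mirrors the recursive fun F from the shell session.
repeat(0, _Fun) ->
    ok;
repeat(N, Fun) when N > 0 ->
    Fun(N),
    repeat(N - 1, Fun).

%% Hypothetical helper: time Count calls of Fun with timer:tc/3.
%% Returns {ElapsedMicroseconds, RequestsPerSecond}.
run(Count, Fun) ->
    {Micros, ok} = timer:tc(?MODULE, repeat, [Count, Fun]),
    %% Guard against division by zero for very fast runs.
    {Micros, Count * 1000000 / erlang:max(Micros, 1)}.
```

With the funs defined in the session, bench_loop:run(10000, FP) would time the direct puts and bench_loop:run(10000, FP4) the puts through the Riak client, assuming S1 and Cl are still bound.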

> ------- Riak -------
> -- PUT
> io:format("~p~n",[now()]),F(10000,FP4,F),io:format("~p~n",[now()]).
> {1266,523655,774894}
> {1266,523691,812606}
> -- GET
> io:format("~p~n",[now()]),F(10000,FP5,F),io:format("~p~n",[now()]).
> {1266,523818,225468}
> {1266,523829,169635}
> -- DELETE
> io:format("~p~n",[now()]),F(10000,FP6,F),io:format("~p~n",[now()]).
> {1266,523844,402019}
> {1266,523883,160529}

Here, each run takes roughly 11 to 39 seconds per 10k invocations
(about 250 to 900 requests per second on a 10-node cluster).
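
For reference, the elapsed times can be derived from the now() tuples printed above with timer:now_diff/2, which returns the difference in microseconds. This is a small sketch; the module name bench_math and its function names are made up for illustration, and the timestamps are copied from the two sessions.

```erlang
-module(bench_math).
-export([elapsed_us/2, rps/2, report/0]).

%% Microseconds between two now()-style {MegaSecs, Secs, MicroSecs} tuples.
elapsed_us(Start, Stop) ->
    timer:now_diff(Stop, Start).

%% Requests per second for Count calls that took Micros microseconds.
rps(Count, Micros) ->
    Count * 1000000 / Micros.

%% Print and return the elapsed time of each 10000-call run.
report() ->
    Runs = [{direct_put,    {1266,508427,932468}, {1266,508431,242659}},
            {direct_get,    {1266,508515,212371}, {1266,508516,330781}},
            {direct_delete, {1266,508533,218732}, {1266,508535,38505}},
            {riak_put,      {1266,523655,774894}, {1266,523691,812606}},
            {riak_get,      {1266,523818,225468}, {1266,523829,169635}},
            {riak_delete,   {1266,523844,402019}, {1266,523883,160529}}],
    [begin
         Us = elapsed_us(T1, T2),
         io:format("~p: ~p us, ~p req/s~n",
                   [Name, Us, round(rps(10000, Us))]),
         {Name, Us}
     end || {Name, T1, T2} <- Runs].
```

By this arithmetic the direct runs take about 3.3 s (put), 1.1 s (get) and 1.8 s (delete), and the runs through Riak about 36.0 s, 10.9 s and 38.8 s.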

Keys were different for each invocation. The network latency is
too small to be the cause of the problem here: a simple
rpc:call between two nodes achieves tens of thousands of requests per second.

The question is: why does Riak add 10x overhead on top of its backend?

-- 
vlm


