Consistent Riak (riak_ensemble) without Anti-entropy?

Sargun Dhillon sargun at sargun.me
Wed May 28 13:37:46 EDT 2014


I've gone ahead and opened an issue on the riak_kv GitHub project:
https://github.com/basho/riak_kv/issues/959 - as this appears to be
an actual bug. My preference would be for riak_kv_ensemble_backend to
initiate a manual AAE sync when it boots, so that I don't have to wait
for an AAE exchange to occur; on a real cluster, that might take a
while.

On Wed, May 28, 2014 at 3:24 AM, Sargun Dhillon <sargun at sargun.me> wrote:
> So, I noticed that if I don't have anti-entropy on and I enable
> strongly consistent Riak, it doesn't work. Specifically,
> riak_kv_ensembles sets up the ensembles, but riak_ensemble_peer
> never gets past the all_sync state. It appears that this is because
> riak_kv_ensemble_backend relies on anti-entropy to perform an
> exchange before it comes up. See here, from riak_kv/develop:
>
>
> sync(Replies, State=#state{ensemble=_Ensemble, id=Id}) ->
>     Peers0 = [{Idx, PeerId} || {PeerId={{kv,_PL,_N,Idx},_Node},_Reply} <- Replies],
>     Peers = orddict:from_list(Peers0),
>     {{kv, PL, N, Idx}, _} = Id,
>     IndexN = {PL,N},
>     %% Sort to remove duplicates when changing ownership / forwarded response
>     Siblings0 = lists:usort([I || {{{kv,_PL,_N,I},_Node},_Reply} <- Replies]),
>     %% Just in case, remove self from list
>     Siblings = Siblings0 -- [Idx],
>
>     case local_partition(Idx) of
>         true ->
>             T0 = erlang:now(),
>             Pid = self(),
>             spawn_link(fun() ->
>                                wait_for_sync(Idx, IndexN, Pid, T0, Siblings, Peers)
>                        end),
>             {async, State};
>         false ->
>             {ok, State}
>     end.
>
> wait_for_sync(Idx, IndexN, Pid, T0, Siblings, Peers) ->
>     Exchanges = riak_kv_entropy_info:exchanges(Idx, IndexN),
>     Recent = [OtherIdx || {OtherIdx, T1, _} <- Exchanges,
>                           T1 > T0],
>     lager:info("~p/~p: Exchanges: ~p~nT0: ~p~nRecent: ~p~nSibs: ~p",
>                 [Idx, IndexN, Exchanges, T0, Recent, Siblings]),
>     Need = length(Siblings),
>     Finished = length(Recent),
>     Local = local_partition(Idx),
>     Complete = ((Siblings -- Recent) =:= []),
>     if not Local ->
>             lager:info("Partition ownership changed. No need to sync."),
>             riak_ensemble_backend:sync_complete(Pid, []);
>        Complete ->
>             lager:info("Complete ~b/~b :: ~p -> ~p~n", [Finished, Need, Idx, Pid]),
>             SyncPeers = [orddict:fetch(PeerIdx, Peers) || PeerIdx <- Siblings],
>             riak_ensemble_backend:sync_complete(Pid, SyncPeers);
>        true ->
>             lager:info("Not yet ~b/~b :: ~p", [Finished, Need, Idx]),
>             timer:sleep(10000),
>             wait_for_sync(Idx, IndexN, Pid, T0, Siblings, Peers)
>     end.
>
>
> (I uncommented the debugging.) If riak_kv_entropy_manager is not
> enabled, then riak_kv_entropy_info:exchanges will always be empty, so
> Recent stays empty and the wait loop never completes while there are
> siblings. Can we either (1) manually trigger an AAE exchange upon
> noticing that strong consistency is enabled (I imagine you can do
> this by setting the mode to manual and then queueing up the AAE
> jobs), or (2) throw a warning to the user saying that they should
> enable AAE?
>
> -Sargun
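
To make the stall concrete, here is a minimal sketch that isolates the completion test from wait_for_sync above and pairs it with the kind of check option (2) might perform. The module name aae_sync_sketch and the function check_config/2 are hypothetical and not part of riak_kv. The {on, Opts} / {off, Opts} shape for the anti-entropy setting is an assumption about the riak_kv configuration, not something confirmed in this thread.

```erlang
-module(aae_sync_sketch).
-export([sync_complete/3, check_config/2]).

%% Same test as in wait_for_sync/6: a sibling counts as synced only if
%% an exchange with it was recorded after T0. With AAE disabled the
%% Exchanges list is always [], so this is false for any non-empty
%% Siblings list.
sync_complete(Siblings, Exchanges, T0) ->
    Recent = [OtherIdx || {OtherIdx, T1, _} <- Exchanges, T1 > T0],
    (Siblings -- Recent) =:= [].

%% Hypothetical boot-time check for option (2). ConsensusEnabled is a
%% boolean; the second argument is assumed to look like {on, Opts} or
%% {off, Opts}.
check_config(true, {off, _Opts}) ->
    {warning, "strong consistency is enabled but anti-entropy is off; "
              "ensemble sync will wait for exchanges that never occur"};
check_config(_ConsensusEnabled, _AntiEntropy) ->
    ok.
```

With no recorded exchanges, aae_sync_sketch:sync_complete([1, 2], [], {0, 0, 0}) is false no matter how long the loop sleeps, which matches the behaviour described above. Once both siblings have an exchange timestamped after T0, the same call returns true.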



