Ensembles failing to reach "Leader ready" state

Jonathan Koff jonathan at projexity.com
Mon Mar 23 11:25:07 EDT 2015


Hi all,

I recently used Riak’s Strong Consistency functionality to get auto-incrementing IDs for a feature of an application I’m working on, and although this worked great in dev (5 nodes in 1 VM) and staging (3 servers across NA) environments, I’ve run into some odd behaviour in production (originally 3 servers, now 4) that prevents it from working.

I initially noticed that strongly consistent requests were failing immediately with timeouts, and upon checking `riak-admin ensemble-status` I saw that many ensembles were at 0 / 3 from the vantage point of the box I was SSH’d into. Interestingly, SSH-ing into different boxes showed different results. Here’s a brief snippet of what I see now, after adding a fourth server in a troubleshooting attempt:

*Machine 1* (104.131.39.61)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         0 / 3         3 / 3      --
    3         3 / 3         3 / 3      riak at 104.131.130.237
    4         3 / 3         3 / 3      riak at 104.131.130.237
    5         3 / 3         3 / 3      riak at 104.131.130.237
    6         0 / 3         3 / 3      --
    7         0 / 3         3 / 3      --
    8         0 / 3         3 / 3      --
    9         3 / 3         3 / 3      riak at 104.131.130.237
    10        3 / 3         3 / 3      riak at 104.131.130.237
    11        0 / 3         3 / 3      --

*Machine 2* (104.236.79.78)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         3 / 3         3 / 3      riak at 104.236.79.78
    3         3 / 3         3 / 3      riak at 104.131.130.237
    4         3 / 3         3 / 3      riak at 104.131.130.237
    5         3 / 3         3 / 3      riak at 104.131.130.237
    6         3 / 3         3 / 3      riak at 104.236.79.78
    7         0 / 3         3 / 3      --
    8         0 / 3         3 / 3      --
    9         3 / 3         3 / 3      riak at 104.131.130.237
    10        3 / 3         3 / 3      riak at 104.131.130.237
    11        3 / 3         3 / 3      riak at 104.236.79.78

*Machine 3* (104.131.130.237)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         0 / 3         3 / 3      --
    3         3 / 3         3 / 3      riak at 104.131.130.237
    4         3 / 3         3 / 3      riak at 104.131.130.237
    5         3 / 3         3 / 3      riak at 104.131.130.237
    6         0 / 3         3 / 3      --
    7         0 / 3         3 / 3      --
    8         0 / 3         3 / 3      --
    9         3 / 3         3 / 3      riak at 104.131.130.237
    10        3 / 3         3 / 3      riak at 104.131.130.237
    11        0 / 3         3 / 3      --

*Machine 4* (162.243.5.87)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         3 / 3         3 / 3      riak at 104.236.79.78
    3         3 / 3         3 / 3      riak at 104.131.130.237
    4         3 / 3         3 / 3      riak at 104.131.130.237
    5         3 / 3         3 / 3      riak at 104.131.130.237
    6         3 / 3         3 / 3      riak at 104.236.79.78
    7         3 / 3         3 / 3      riak at 162.243.5.87
    8         3 / 3         3 / 3      riak at 162.243.5.87
    9         3 / 3         3 / 3      riak at 104.131.130.237
    10        3 / 3         3 / 3      riak at 104.131.130.237
    11        3 / 3         3 / 3      riak at 104.236.79.78


Interestingly, Machine 4 has full quora for all ensembles except for root, while Machine 3 only sees itself as a leader.
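
To compare the four views without eyeballing the tables, something like the following could pull out the ensembles that a given node reports as leaderless. This is only a rough sketch: it assumes the row layout of the `riak-admin ensemble-status` output exactly as pasted above, and the names `ROW` and `leaderless` are made up for illustration, not part of any Riak tooling.

```python
import re

# Matches one data row of the "Ensembles" table as pasted above, e.g.
# "    2         0 / 3         3 / 3      --"
# Header, separator, and "Enabled: true"-style lines do not match.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s*/\s*(\d+)\s+(\d+)\s*/\s*(\d+)\s+(.+?)\s*$")

def leaderless(status_text):
    """Return the ids of ensembles whose Leader column is '--'."""
    ids = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m and m.group(6) == "--":
            ids.append(m.group(1))
    return ids

# A few rows copied from the Machine 1 output above.
sample = """\
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         0 / 3         3 / 3      --
    3         3 / 3         3 / 3      riak at 104.131.130.237
"""
print(leaderless(sample))
```

Running this against the output saved from each machine would make it easier to see which ensembles each node disagrees about.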

Another interesting point is the output of `riak-admin ensemble-status root`:

================================= Ensemble #1 =================================
Id:           root
Leader:       --
Leader ready: false

==================================== Peers ====================================
 Peer  Status     Trusted          Epoch         Node
-------------------------------------------------------------------------------
  1    (offline)    --              --           riak at 104.131.45.32
  2      probe      no              8            riak at 104.131.130.237
  3    (offline)    --              --           riak at 104.131.141.237
  4    (offline)    --              --           riak at 104.131.199.79
  5      probe      no              8            riak at 104.236.79.78
  6      probe      no              8            riak at 162.243.5.87

This is consistent across all 4 machines, and seems to include some old IPs from machines that left the cluster quite a while back, almost definitely before I’d used Riak's Strong Consistency. Note that the reason I added the fourth machine (104.131.39.61) was to see if this output would change, perhaps resulting in a quorum for the root ensemble.
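
As a sanity check on the root ensemble numbers, here is the arithmetic as I understand it, assuming an ensemble needs a simple majority of its listed peers before it can elect a leader. The function name `has_quorum` is purely illustrative, and this ignores the "Trusted" column entirely.

```python
def has_quorum(total_peers, reachable_peers):
    """Return (quorum_reached, peers_needed) under a simple-majority rule."""
    # Assumption: a strict majority of all listed peers is required,
    # including peers on nodes that have since left the cluster.
    needed = total_peers // 2 + 1
    return reachable_peers >= needed, needed

# Root ensemble above: 6 peers listed, 3 of them (offline).
ok, needed = has_quorum(6, 3)
print(ok, needed)
```

If that assumption holds, the three reachable peers can never form a majority of six on their own, which would explain why root stays at 0 / 6 no matter which box I check from.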

For reference, here’s the status of a sample ensemble that isn’t “Leader ready”, from the perspective of Machine 2:

================================ Ensemble #62 =================================
Id:           {kv,1370157784997721485815954530671515330927436759040,3}
Leader:       --
Leader ready: false

==================================== Peers ====================================
 Peer  Status     Trusted          Epoch         Node
-------------------------------------------------------------------------------
  1    following    yes             43           riak at 104.131.130.237
  2    following    yes             43           riak at 104.236.79.78
  3     leading     yes             43           riak at 162.243.5.87


My config consists of riak.conf with:

strong_consistency = on

and advanced.config with:

[
  {riak_core,
    [
      {target_n_val, 5}
    ]},
  {riak_ensemble,
    [
      {ensemble_tick, 5000}
    ]}
].

though I’ve experimented with the latter in an attempt to get this resolved.

I didn’t see any relevant-looking log output on any of the servers.

Has anyone come across this before?

Thanks!

Jonathan Koff B.CS.
co-founder of Projexity
www.projexity.com <http://www.projexity.com/>

follow us on facebook at: www.facebook.com/projexity <http://www.facebook.com/projexity>
follow us on twitter at: twitter.com/projexity <http://twitter.com/projexity>