riak-admin diag output

Jared Morrow jared at basho.com
Fri Jul 12 11:27:45 EDT 2013


Daniel,

The reason we recommend a 5-node cluster is exactly the problem you are
seeing. With N=3 on a 3-node cluster, there is no guarantee that all three
replicas of your data will land on distinct nodes. We cover this in the
documentation <http://docs.basho.com/riak/latest/references/appendices/Cluster-Capacity-Planning/#Number-of-Nodes>,
which also points to a blog post on the topic. For a production environment
you should use a minimum of 5 nodes if your N value is 3.
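A toy simulation makes the geometry concrete. This is a deliberately
simplified round-robin claim (partition i goes to node i % num_nodes), NOT
Riak's real claim algorithm, run on the default 64-partition ring with
n_val = 3:

```python
# Toy model of replica placement on a Riak-style hash ring.
# Assumption: a perfectly even round-robin claim, which is not Riak's
# actual claim algorithm, on the default 64-partition ring.
RING_SIZE = 64
N_VAL = 3

def violating_preflists(num_nodes):
    """Count preflists (N_VAL consecutive partitions, wrapping around
    the ring) whose owners are not all distinct."""
    owner = [i % num_nodes for i in range(RING_SIZE)]
    bad = 0
    for start in range(RING_SIZE):
        owners = {owner[(start + k) % RING_SIZE] for k in range(N_VAL)}
        if len(owners) < N_VAL:
            bad += 1
    return bad

print(violating_preflists(3))  # 2: the ring's wrap-around repeats a node
print(violating_preflists(5))  # 0: every preflist hits 3 distinct nodes
```

With only 3 nodes, making every window of 3 consecutive partitions distinct
would force the ownership pattern to repeat with period 3, which is
impossible because 64 is not divisible by 3, so some preflists must double
up even with a perfectly even spread.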

So testing this out a little bit, here's what I see when I try to make a
3-node cluster:

$ ./dev1/bin/riak-admin cluster plan

<snip>
join           'dev2 at 127.0.0.1'
join           'dev3 at 127.0.0.1'
<snip>

WARNING: Not all replicas will be on distinct nodes

<snip>

So the plan warns that not all replicas will be on distinct nodes.

Running diag as you ran it shows the same issue.

[warning] The following preflists do not satisfy the n_val:
[[{0, 'dev1 at 127.0.0.1'},
  {22835963083295358096932575511191922182123945984, 'dev2 at 127.0.0.1'},
  {45671926166590716193865151022383844364247891968, 'dev2 at 127.0.0.1'}],
<snip> lots of output

Another good visualization of what is happening with your preflists is the
new ring page in riak_control. With my N=3, 3-node setup above, this is
what I see:

http://i.imgur.com/fOcA5S1.png

If you don't use riak_control (this goes for everyone), you really should
try it; there is lots of cool new stuff in 1.4.

So, if we add two more nodes, here's what happens:

$ ./dev1/bin/riak-admin cluster plan
<snip>
join           'dev4 at 127.0.0.1'
join           'dev5 at 127.0.0.1'
<snip>

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      34.4%     20.3%    'dev1 at 127.0.0.1'
valid      32.8%     20.3%    'dev2 at 127.0.0.1'
valid      32.8%     20.3%    'dev3 at 127.0.0.1'
valid       0.0%     20.3%    'dev4 at 127.0.0.1'
valid       0.0%     18.8%    'dev5 at 127.0.0.1'
-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Transfers resulting from cluster changes: 49

Note that there is no warning about distinct nodes this time.

Now we wait for transfers to complete:

$ ./dev1/bin/riak-admin transfers
'dev5 at 127.0.0.1' waiting to handoff 2 partitions
'dev4 at 127.0.0.1' waiting to handoff 3 partitions
'dev3 at 127.0.0.1' waiting to handoff 4 partitions
'dev2 at 127.0.0.1' waiting to handoff 3 partitions
'dev1 at 127.0.0.1' waiting to handoff 2 partitions
...
...
$ ./dev1/bin/riak-admin transfers
No transfers active

Okay, here's diag now:

./dev1/bin/riak-admin diag
[notice] Data directory
<snip>/Development/basho/1.4/riak/dev/dev1/bin/../data/bitcask is not
mounted with 'noatime'. Please remount its disk with the 'noatime'
flag to improve performance.

No more preflist warnings.
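As an aside, the noatime notice above is worth acting on. A sketch of the
fix, assuming the data directory sits on its own filesystem (the device and
mount point here are hypothetical; adjust them for your system):

```shell
# Remount the filesystem holding the Riak data directory with noatime,
# so reads no longer trigger access-time metadata writes.
sudo mount -o remount,noatime /dev/sdb1 /var/lib/riak

# Persist the option across reboots in /etc/fstab:
# /dev/sdb1  /var/lib/riak  ext4  defaults,noatime  0  2
```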

And finally riak-control:

http://i.imgur.com/rfLoUNK.png

So this was a long reply, but I wanted to show the issue and how to correct
it. You can get by with 4 nodes, but with a single node failure you are
back to the same 3-node problem you have now.
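To put a rough number on that last point, here is a toy round-robin model
of partition claim (an idealized sketch on the default 64-partition ring,
not Riak's real claim algorithm):

```python
# Idealized round-robin claim: partition i is owned by node i % num_nodes.
# This is a sketch for illustration, not Riak's actual claim logic.
RING_SIZE = 64  # Riak's default ring_creation_size
N_VAL = 3

def violating_preflists(num_nodes):
    """Count preflists (N_VAL consecutive partitions, wrapping around
    the ring) that land on a repeated node."""
    owner = [i % num_nodes for i in range(RING_SIZE)]
    return sum(
        len({owner[(s + k) % RING_SIZE] for k in range(N_VAL)}) < N_VAL
        for s in range(RING_SIZE)
    )

print(violating_preflists(4))  # 0 violations with all four nodes up
print(violating_preflists(3))  # 2 violations once the ring rebalances
                               # onto the three surviving nodes
```

In this model a 4-node cluster keeps every preflist on distinct nodes, but
lose one node and the rebalanced 3-node ring is right back to violating
n_val.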

-Jared


On Fri, Jul 12, 2013 at 8:29 AM, Daniel Iwan <iwan.daniel at gmail.com> wrote:

> Hi, my riak-admin diag shows the output below (3-node cluster).
>
> I'm assuming the long numbers are vnodes. The strange thing is:
>
> 5708990770823839524233143877797980545530986496 exists twice for the same
> node
>
> 19981467697883438334816003572292931909358452736 appears once on the list
>
> How do I interpret this?
>
> How can I list all vnodes and the nodes they exist on? riak-admin
> vnode-status
>
> shows only primary locations; what about copies?
>
>
> me at node3:~$ riak-admin diag
>
> Attempting to restart script through sudo -H -u riak
>
> 14:19:51.277 [critical] vm.swappiness is 1, should be no more than 0)
>
> 14:19:51.277 [critical] net.core.wmem_default is 229376, should be at
> least 8388608)
>
> 14:19:51.278 [critical] net.core.rmem_default is 229376, should be at
> least 8388608)
>
> 14:19:51.278 [critical] net.core.wmem_max is 131071, should be at least
> 8388608)
>
> 14:19:51.278 [critical] net.core.rmem_max is 131071, should be at least
> 8388608)
>
> 14:19:51.278 [critical] net.core.netdev_max_backlog is 1000, should be at
> least 10000)
>
> 14:19:51.278 [critical] net.core.somaxconn is 128, should be at least 4000)
>
> 14:19:51.278 [critical] net.ipv4.tcp_max_syn_backlog is 2048, should be at
> least 40000)
>
> 14:19:51.278 [critical] net.ipv4.tcp_fin_timeout is 60, should be no more
> than 15)
>
> 14:19:51.278 [critical] net.ipv4.tcp_tw_reuse is 0, should be 1)
>
> 14:19:51.278 [warning] The following preflists do not satisfy the n_val:
> [[{0,'riak at 10.173.240.1
> '},{2854495385411919762116571938898990272765493248,'riak at 10.173.240.2
> '},{5708990770823839524233143877797980545530986496,'riak at 10.173.240.2
> '}],[{2854495385411919762116571938898990272765493248,'riak at 10.173.240.2
> '},{5708990770823839524233143877797980545530986496,'riak at 10.173.240.2
> '},{8563486156235759286349715816696970818296479744,'riak at 10.173.240.3
> '}],[{11417981541647679048466287755595961091061972992,'riak at 10.173.240.1
> '},{14272476927059598810582859694494951363827466240,'riak at 10.173.240.2
> '},{17126972312471518572699431633393941636592959488,'riak at 10.173.240.2
> '}],[{14272476927059598810582859694494951363827466240,'riak at 10.173.240.2
> '},{17126972312471518572699431633393941636592959488,'riak at 10.173.240.2
> '},{19981467697883438334816003572292931909358452736,'riak at 10.173.240.3
> '}],[{22835963083295358096932575511191922182123945984,'riak at 10.173.240.1
> '},{25690458468707277859049147450090912454889439232,'riak at 10.173.240.2
> '},{28544953854119197621165719388989902727654932480,'riak at 10.173.240.2
> '}],[{25690458468707277859049147450090912454889439232,'riak at 10.173.240.2
> '},{28544953854119197621165719388989902727654932480,'riak at 10.173.240.2
> '},{31399449239531117383282291327888893000420425728,'riak at 10.173.240.3
> '}],[{34253944624943037145398863266787883273185918976,'riak at 10.173.240.1
> '},{37108440010354956907515435205686873545951412224,'riak at 10.173.240.2
> '},{39962935395766876669632007144585863818716905472,'riak at 10.173.240.2
> '}],[{37108440010354956907515435205686873545951412224,'riak at 10.173.240.2
> '},{39962935395766876669632007144585863818716905472,'riak at 10.173.240.2
> '},{42817430781178796431748579083484854091482398720,'riak at 10.173.240.3
> '}],[{45671926166590716193865151022383844364247891968,'riak at 10.173.240.1
> '},{48526421552002635955981722961282834637013385216,'riak at 10.173.240.2
> '},{51380916937414555718098294900181824909778878464,'riak at 10.173.240.2
> '}],[{48526421552002635955981722961282834637013385216,'riak at 10.173.240.2
> '},{51380916937414555718098294900181824909778878464,'riak at 10.173.240.2
> '},{54235412322826475480214866839080815182544371712,'riak at 10.173.240.3
> '}],[{57089907708238395242331438777979805455309864960,'riak at 10.173.240.1
> '},{59944403093650315004448010716878795728075358208,'riak at 10.173.240.2
> '},{62798898479062234766564582655777786000840851456,'riak at 10.173.240.2
> '}],[{59944403093650315004448010716878795728075358208,'riak at 10.173.240.2
> '},{62798898479062234766564582655777786000840851456,'riak at 10.173.240.2
> '},{65653393864474154528681154594676776273606344704,'riak at 10.173.240.3
> '}],[{68507889249886074290797726533575766546371837952,'riak at 10.173.240.1
> '},{71362384635297994052914298472474756819137331200,'riak at 10.173.240.2
> '},{74216880020709913815030870411373747091902824448,'riak at 10.173.240.2
> '}],[{71362384635297994052914298472474756819137331200,'riak at 10.173.240.2
> '},{74216880020709913815030870411373747091902824448,'riak at 10.173.240.2
> '},{77071375406121833577147442350272737364668317696,'riak at 10.173.240.3
> '}],[{79925870791533753339264014289171727637433810944,'riak at 10.173.240.1
> '},{82780366176945673101380586228070717910199304192,'riak at 10.173.240.2
> '},{85634861562357592863497158166969708182964797440,'riak at 10.173.240.2
> '}],[{82780366176945673101380586228070717910199304192,'riak at 10.173.240.2
> '},{85634861562357592863497158166969708182964797440,'riak at 10.173.240.2
> '},{88489356947769512625613730105868698455730290688,'riak at 10.173.240.3
> '}],[{91343852333181432387730302044767688728495783936,'riak at 10.173.240.1
> '},{94198347718593352149846873983666679001261277184,'riak at 10.173.240.2
> '},{97052843104005271911963445922565669274026770432,'riak at 10.173.240.2
> '}],[{94198347718593352149846873983666679001261277184,'riak at 10.173.240.2
> '},{97052843104005271911963445922565669274026770432,'riak at 10.173.240.2
> '},{99907338489417191674080017861464659546792263680,'riak at 10.173.240.3
> '}],[{102761833874829111436196589800363649819557756928,'riak at 10.173.240.1
> '},{105616329260241031198313161739262640092323250176,'riak at 10.173.240.2
> '},{108470824645652950960429733678161630365088743424,'riak at 10.173.240.2
> '}],[{105616329260241031198313161739262640092323250176,...},...],...]
>
>
>
> Regards
>
> Daniel
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>