How to cold (re)boot a cluster with already existing node data

Jan-Philip Loos maxdaten at gmail.com
Mon Jun 6 13:37:17 EDT 2016


On Mon, 6 Jun 2016 at 16:52 Alex Moore <amoore at basho.com> wrote:

> Hi Jan,
>
> When you update the Kubernetes nodes, do you have to do them all at once
> or can they be done in a rolling fashion (one after another)?
>

Thanks for your reply,

sadly that is not possible. Kubernetes on GKE simply tears all nodes down,
creates new nodes with the new Kubernetes version and reschedules all services
onto them. So after an upgrade, all Riak nodes come up stand-alone (when
started after deleting /var/lib/riak/ring).
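
(A quick way to confirm the stand-alone state, assuming riak-admin is available
inside each pod, is to check membership on any node; after the ring reset it
lists only that node as a valid member:)

riak-admin member-status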

Greetings

Jan


> If you can do them rolling-wise, you should be able to:
>
> For each node, one at a time:
> 1. Shut down Riak
> 2. Shutdown/restart/upgrade Kubernetes
> 3. Start Riak
> 4. Use `riak-admin force-replace` to rename the old node name to the new
> node name
> 5. Repeat on remaining nodes.
>
> This is covered in "Renaming Multi-node clusters
> <http://docs.basho.com/riak/kv/2.1.4/using/cluster-operations/changing-cluster-info/#rename-multi-node-clusters>"
> doc.
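>
> A rough sketch of step 4 with hypothetical old/new node names (on Riak 2.x
> the staged form of the command lives under `riak-admin cluster`):
>
> # on the freshly restarted node:
> riak-admin cluster force-replace riak@<old-ip> riak@<new-ip>
> riak-admin cluster plan
> riak-admin cluster commit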
>
> As for your current predicament,  have you created any new buckets/changed
> bucket props in the default namespace since you restarted? Or have you only
> done regular operations since?
>
> Thanks,
> Alex
>
>
> On Mon, Jun 6, 2016 at 5:25 AM Jan-Philip Loos <maxdaten at gmail.com> wrote:
>
>> Hi,
>>
>> we are using Riak in a Kubernetes cluster (on GKE). Sometimes it's
>> necessary to reboot the complete cluster to update the Kubernetes nodes.
>> This results in a complete shutdown of the Riak cluster, and the Riak nodes
>> are rescheduled with new IPs. So how can I handle this situation? How can
>> I form a new Riak cluster out of the old nodes with new names?
>>
>> The /var/lib/riak directory is persisted. I had to delete the
>> /var/lib/riak/ring folder, otherwise "riak start" crashed with the message
>> below (but I saved the old ring state in a tar first):
>>
>> {"Kernel pid
>>> terminated",application_controller,"{application_start_failure,riak_core,{{shutdown,{failed_to_start_child,riak_core_broadcast,{'EXIT',{function_clause,[{orddict,fetch,['
>>> riak at 10.44.2.8
>>> ',[]],[{file,\"orddict.erl\"},{line,72}]},{riak_core_broadcast,init_peers,1,[{file,\"src/riak_core_broadcast.erl\"},{line,616}]},{riak_core_broadcast,start_link,0,[{file,\"src/riak_core_broadcast.erl\"},{line,116}]},{supervisor,do_start_child,2,[{file,\"supervisor.erl\"},{line,310}]},{supervisor,start_children,3,[{file,\"supervisor.erl\"},{line,293}]},{supervisor,init_children,2,[{file,\"supervisor.erl\"},{line,259}]},{gen_server,init_it,6,[{file,\"gen_server.erl\"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,239}]}]}}}},{riak_core_app,start,[normal,[]]}}}"}
>>> Crash dump was written to: /var/log/riak/erl_crash.dump
>>> Kernel pid terminated (application_controller)
>>> ({application_start_failure,riak_core,{{shutdown,{failed_to_start_child,riak_core_broadcast,{'EXIT',{function_clause,[{orddict,fetch,['
>>> riak at 10.44.2.8',
>>
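>> (The ring reset itself was roughly the following, assuming the default
>> /var/lib/riak layout; Riak recreates the ring directory on the next start:)
>>
>> riak stop
>> tar czf /var/lib/riak/ring-backup.tar.gz -C /var/lib/riak ring   # keep the old ring state
>> rm -rf /var/lib/riak/ring
>> riak start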
>>
>> Then I formed a new cluster via join & plan & commit.
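>>
>> In concrete terms that was roughly the standard staged-clustering sequence
>> (the node name below is a placeholder):
>>
>> # on every node except the seed node:
>> riak-admin cluster join riak@<seed-node>
>> # then, on any single node:
>> riak-admin cluster plan
>> riak-admin cluster commit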
>>
>> But now I have discovered a problem with incomplete and inconsistent
>> partitions:
>>
>> $ curl -Ss "http://riak.default.svc.cluster.local:8098/buckets/users/keys?keys=true" | jq '.[] | length'
>> 3064
>>
>> $ curl -Ss "http://riak.default.svc.cluster.local:8098/buckets/users/keys?keys=true" | jq '.[] | length'
>> 2987
>>
>> $ curl -Ss "http://riak.default.svc.cluster.local:8098/buckets/users/keys?keys=true" | jq '.[] | length'
>> 705
>>
>> $ curl -Ss "http://riak.default.svc.cluster.local:8098/buckets/users/keys?keys=true" | jq '.[] | length'
>> 3064
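>>
>> (To see whether those differing counts are just handoff still in flight, the
>> standard diagnostics should help; both subcommands exist in stock Riak 2.x:)
>>
>> riak-admin transfers       # partitions currently handing off between nodes
>> riak-admin member-status   # ring ownership per node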
>>
>> Is there a way to fix this? I guess this is caused by the missing old
>> ring-state?
>>
>> Greetings
>>
>> Jan
>>
> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>

