[Basho Riak] Fail To Update Document Repeatedly With Cluster of 5 Nodes

Russell Brown russell.brown at mac.com
Tue Feb 7 05:37:59 EST 2017


On 7 Feb 2017, at 10:27, my hue <tranmyhue.grackle at gmail.com> wrote:

> Dear Russell,
> 
> Yes, I updated all registers in one go.
> I have not tried updating a single register at a time yet; let me try and see.
> But I wonder: does updating everything in one go have any effect on conflict
> resolution in the Riak cluster?
> 

Just trying to make the search space as small as possible. I don’t think _any_ of this should fail. The maps code is very well tested and well used, so it’s all kind of odd.

Without hands-on access it’s hard to debug, and email back and forth is slow, so if you try the simplest possible thing and that still fails, it helps.

IMO the simplest possible thing is to start with a new, empty key and use modify_type to update a single register.
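
Something like this (an untested sketch; the host/port and the key are just placeholders here, and [create] tells modify_type to start from a new, empty map when the key doesn’t exist):

    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
    riakc_pb_socket:modify_type(Pid,
        %% the fun gets the freshly fetched (or newly created) map
        fun(M) ->
            riakc_map:update({<<"status_id">>, register},
                             fun(R) -> riakc_register:set(<<"show">>, R) end,
                             M)
        end,
        {<<"dev_restor">>, <<"menu">>}, <<"modify-test-key">>,
        [create]).

That way the client does the fetch, threads the context through, and calls to_op for you, so there’s less of your own code in the loop.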

Many thanks

Russell

> 
> 
> On Tue, Feb 7, 2017 at 5:18 PM, Russell Brown <russell.brown at mac.com> wrote:
> So you’re updating all those registers in one go? Out of interest, what happens if you update a single register at a time?
> 
> On 7 Feb 2017, at 10:02, my hue <tranmyhue.grackle at gmail.com> wrote:
> 
> > Dear Russell,
> >
> > > Can you run riakc_map:to_op(Map). and show me the output of that, please?
> >
> > The following is the output of riakc_map:to_op(Map):
> >
> > {map, {update, [{update, {<<"updated_time_dt">>,register},{assign,<<"2017-02-06T17:22:39Z">>}}, {update,{<<"updated_by_id">>,register}, {assign,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{update,{<<"status_id">>,register},{assign,<<"show">>}},{update,{<<"start_time">>,register},{assign,<<"dont_use">>}},{update,{<<"restaurant_status_id">>,register}, {assign,<<"inactive">>}}, {update,{<<"restaurant_id">>,register}, {assign,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},{update,{<<"rest_location_p">>,register}, {assign,<<"10.844117421366443,106.63982392275398">>}}, {update,{<<"order_i">>,register},{assign,<<"0">>}}, {update,{<<"name">>,register},{assign,<<"fullmenu">>}}, {update,{<<"menu_category_revision_id">>,register}, {assign,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}}, {update,{<<"maintain_mode_b">>,register},{assign,<<"false">>}}, {update,{<<"id">>,register}, {assign,<<"menufe89488afa948875cab6b0b18d579f21">>}}, {update,{<<"end_time">>,register},{assign,<<"dont_use">>}},{update,{<<"currency">>,register},{assign,<<"cad">>}}, {update,{<<"created_time_dt">>,register}, {assign,<<"2017-01-27T03:34:04Z">>}}, {update,{<<"created_by_id">>,register}, {assign,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}}, {update,{<<"account_id">>,register}, {assign,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}}]}, <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>}
> >
> >
> >
> >
> > On Tue, Feb 7, 2017 at 4:36 PM, Russell Brown <russell.brown at mac.com> wrote:
> >
> > On 7 Feb 2017, at 09:34, my hue <tranmyhue.grackle at gmail.com> wrote:
> >
> > > Dear Russell,
> > >
> > > >What operation are you performing? What is the update you perform? Do you set a register value, add a register, remove a register?
> > >
> > > I used riakc_map:update to update values in the map. I follow these steps (see the sketch after the steps):
> > >
> > > - Get FetchData map with  fetch_type
> > > - Extract key, value, context from FetchData
> > > - Obtain UpdateData with:
> > >
> > > + Init map with context
> >
> > I don’t understand this step
> >
> > > + Use :
> > >
> > >    riakc_map:update({K, register}, fun(R) -> riakc_register:set(V,  R) end,  InitMap)
> > >
> > > to obtain UpdateData
> > >
> > > Note:
> > > K : key
> > > V:  value
> > >
> > > - Then  update UpdateData with update_type
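> > >
> > > In code, the whole cycle looks roughly like this (an untested sketch; Pid, BucketType, Bucket, Key, K and V are bound elsewhere, and I assume riakc_map:new/1 accepts the fetched context, per the "Init map with context" step):
> > >
> > >    {ok, FetchData} = riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []),
> > >    %% the opaque context is the last element of the fetched map tuple
> > >    {map, _Values, _Updates, _Removes, Context} = FetchData,
> > >    InitMap = riakc_map:new(Context),   %% assumed: new/1 takes the context
> > >    UpdateData = riakc_map:update({K, register},
> > >                                  fun(R) -> riakc_register:set(V, R) end,
> > >                                  InitMap),
> > >    riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key,
> > >                                riakc_map:to_op(UpdateData), []).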
> > >
> >
> > Can you run riakc_map:to_op(Map). and show me the output of that, please?
> >
> > > The following is a sample of the update data:
> > >
> > > {map, [] ,
> > >  [{{<<"account_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_time_dt">>,register},{register,<<>>,<<"2017-01-27T03:34:04Z">>}},{{<<"currency">>,register},{register,<<>>,<<"cad">>}},{{<<"end_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"id">>,register},{register,<<>>,<<"menufe89488afa948875cab6b0b18d579f21">>}},{{<<"maintain_mode_b">>,register},{register,<<>>,<<"false">>}},{{<<"menu_category_revision_id">>,register},{register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},{{<<"name">>,register},{register,<<>>,<<"fullmenu">>}},{{<<"order_i">>,register},{register,<<>>,<<"0">>}},{{<<"rest_location_p">>,register},{register,<<>>,<<"10.844117421366443,106.63982392275398">>}},{{<<"restaurant_id">>,register},{register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},{{<<"restaurant_status_id">>,register},{register,<<>>,<<"inactive">>}},{{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"status_id">>,register},{register,<<>>,<<"show">>}},{{<<"updated_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"updated_time_dt">>,register},{register,<<>>,<<"2017-02-06T17:22:39Z">>}}],
> > >  [] ,  <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>
> > > }
> > >
> > >
> > > On Tue, Feb 7, 2017 at 3:43 PM, Russell Brown <russell.brown at mac.com> wrote:
> > >
> > > On 7 Feb 2017, at 08:17, my hue <tranmyhue.grackle at gmail.com> wrote:
> > >
> > > > Dear John and Russell Brown,
> > > >
> > > > * How fast is your turnaround time between an update and a fetch?
> > > >
> > > > The turnaround time between an update and a fetch is about 1 second.
> > > > While my team and I were debugging, we adjusted HAProxy and tested the following scenarios:
> > > >
> > > > Scenario 1: round robin across the 5 nodes of the cluster.
> > > >
> > > > We hit the issue in scenario 1, and we were afraid that a timeout between nodes could occur
> > > > and leave us reading stale data. So we moved on to scenario 2.
> > > >
> > > > Scenario 2: disable round robin and route requests only to node 1; the cluster is still 5 nodes.
> > > > In this case we ensure that update and fetch requests always go to and come from node 1.
> > > > And the issue still occurs.
> > > >
> > > > At the time of the failure, I hoped to get an error log from the Riak nodes that would give me some information,
> > > > but the Riak logs showed nothing; everything looked OK.
> > > >
> > > > * What operation are you performing?
> > > >
> > > > I used :
> > > >
> > > > riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key, riakc_map:to_op(Map), []).
> > > > riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []).
> > >
> > > What operation are you performing? What is the update you perform? Do you set a register value, add a register, remove a register?
> > > >
> > > > * It looks like the map is a single level map of last-write-wins registers. Is there a chance that the time on the node handling the update is behind the value in the lww-register?
> > > >
> > > > => I am not sure about the internal conflict-resolution logic of the Riak nodes. And the issue never happens when I use a single node.
> > > > My bucket properties are as follows:
> > > >
> > > > {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1 at 64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
> > > >
> > > > Note :
> > > > + "datatype":"map"
> > > > + "last_write_wins": false
> > > > + "dvv_enabled": true
> > > > + "allow_mult": true
> > > >
> > > >
> > > > * Have you tried using the `modify_type` operation in riakc_pb_socket which does the fetch/update operation in sequence for you?
> > > >
> > > > => I have not used it yet, but my operations are a sequence of fetch and then update. Maybe I will try modify_type and see.
> > > >
> > > > * Anything in the error logs on any of the nodes?
> > > >
> > > > => From the node logs, there is no error report at the time of failure.
> > > >
> > > > * Is the opaque context identical from the fetch and then later after the update?
> > > >
> > > > => The context is the one obtained from the fetch, and that context is used with the update.
> > > > And during our debugging, through a sequence of fetch, update, fetch, update, ..., the context I saw on the
> > > > fetched data was always the same.
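> > > >
> > > > (As a quick double-check, since the opaque context is the last element of the map tuple in the pastes above, the contexts from two consecutive fetches, call them Fetched1 and Fetched2 (placeholders), can be compared directly:)
> > > >
> > > >    {map, _, _, _, Ctx1} = Fetched1,
> > > >    {map, _, _, _, Ctx2} = Fetched2,
> > > >    Ctx1 =:= Ctx2.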
> > > >
> > > > Best regards,
> > > > Hue Tran
> > > >
> > > >
> > > >
> > > > On Tue, Feb 7, 2017 at 2:11 AM, John Daily <jdaily at basho.com> wrote:
> > > > Originally I suspected the context which allows Riak to resolve conflicts was not present in your data, but I see it in your map structure. Thanks for supplying such a detailed description.
> > > >
> > > > How fast is your turnaround time between an update and a fetch? Even if the cluster is healthy it’s not impossible to see a timeout between nodes, which could result in a stale retrieval. Have you verified whether the stale data persists?
> > > >
> > > > A single node cluster gives an advantage that you’ll never see in a real cluster: a perfectly synchronized clock. It also reduces (but does not completely eliminate) the possibility of an internal timeout between processes.
> > > >
> > > > -John
> > > >
> > > >> On Feb 6, 2017, at 1:02 PM, my hue <tranmyhue.grackle at gmail.com> wrote:
> > > >>
> > > >> Dear Riak Team,
> > > >>
> > > >> My team and I use Riak as the database for our production system, with a cluster of 5 nodes.
> > > >> While running in production we hit a critical bug: updating a document sometimes fails.
> > > >> My colleagues and I debugged it and found that the issue reproduces with the following scenario:
> > > >>
> > > >> +  fetch document
> > > >> +  change value of document
> > > >> +  update document
> > > >>
> > > >> Repeat this about 10 times and a failure occurs: when the document is updated
> > > >> continually, the update sometimes fails.
> > > >>
> > > >> At first the 5 nodes of the cluster ran Riak version 2.1.1.
> > > >> After hitting the bug above we upgraded to Riak version 2.2.0, and the issue still occurs.
> > > >>
> > > >> Over many test runs we debugged using tcpdump on the Riak node:
> > > >>
> > > >> tcpdump -A -ttt  -i {interface} src host {host} and dst port {port}
> > > >>
> > > >> And together with the command:
> > > >>
> > > >> riak-admin status | grep "node_puts_map\|node_puts_map_total\|node_puts_total\|vnode_map_update_total\|vnode_puts_total"
> > > >>
> > > >> we confirmed that the Riak server does receive the update request.
> > > >> However, we do not know why the Riak backend fails to update the document.
> > > >> At the time of the failure, the Riak server logs show everything is OK.
> > > >>
> > > >> Then we dismantled the cluster and used a single Riak server, and saw that the above bug never happens.
> > > >>
> > > >> For that reason we think it only happens when running as a cluster. We went through the Basho Riak documentation, and our Riak
> > > >> configuration seems to follow its suggestions. We are completely blocked on this issue and hope we can get support from you,
> > > >> so that we can get stable behaviour from the Riak database for our production system.
> > > >> Thank you so much. I hope to get your reply soon.
> > > >>
> > > >>
> > > >> * The following is our riak node information :
> > > >>
> > > >> Riak version:  2.2.0
> > > >> OS :  CentOS Linux release 7.2.1511
> > > >> Cpu :  4 core
> > > >> Memory : 4G
> > > >> Riak configuration: see the attached file "riak.conf"
> > > >>
> > > >> Note :
> > > >>
> > > >> - We mostly use Riak's default configuration, except that the storage backend is multi:
> > > >>
> > > >> storage_backend = multi
> > > >> multi_backend.bitcask_mult.storage_backend = bitcask
> > > >> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
> > > >> multi_backend.default = bitcask_mult
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> - The bucket type was created with the following commands:
> > > >>
> > > >> riak-admin bucket-type create dev_restor '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
> > > >> riak-admin bucket-type activate dev_restor
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> - Bucket Type Status :
> > > >>
> > > >> >> riak-admin bucket-type status dev_restor
> > > >>
> > > >> dev_restor is active
> > > >> young_vclock: 20
> > > >> w: quorum
> > > >> small_vclock: 50
> > > >> rw: quorum
> > > >> r: quorum
> > > >> pw: 0
> > > >> precommit: []
> > > >> pr: 0
> > > >> postcommit: []
> > > >> old_vclock: 86400
> > > >> notfound_ok: true
> > > >> n_val: 3
> > > >> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
> > > >> last_write_wins: false
> > > >> dw: quorum
> > > >> dvv_enabled: true
> > > >> chash_keyfun: {riak_core_util,chash_std_keyfun}
> > > >> big_vclock: 50
> > > >> basic_quorum: false
> > > >> backend: <<"bitcask_mult">>
> > > >> allow_mult: true
> > > >> datatype: map
> > > >> active: true
> > > >> claimant: 'riak-node1 at 64.137.190.244'
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> - Bucket Property :
> > > >>
> > > >> {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1 at 64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
> > > >>
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> - Member status :
> > > >>
> > > >> >> riak-admin member-status
> > > >>
> > > >> ================================= Membership ==================================
> > > >> Status     Ring    Pending    Node
> > > >> -------------------------------------------------------------------------------
> > > >> valid      18.8%      --      'riak-node1 at 64.137.190.244'
> > > >> valid      18.8%      --      'riak-node2 at 64.137.247.82'
> > > >> valid      18.8%      --      'riak-node3 at 64.137.162.64'
> > > >> valid      25.0%      --      'riak-node4 at 64.137.161.229'
> > > >> valid      18.8%      --      'riak-node5 at 64.137.217.73'
> > > >> -------------------------------------------------------------------------------
> > > >> Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
> > > >>
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> - Ring
> > > >>
> > > >> >> riak-admin status | grep ring
> > > >>
> > > >> ring_creation_size : 64
> > > >> ring_members : ['riak-node1 at 64.137.190.244','riak-node2 at 64.137.247.82', 'riak-node3 at 64.137.162.64','riak-node4 at 64.137.161.229', 'riak-node5 at 64.137.217.73']
> > > >> ring_num_partitions : 64
> > > >> ring_ownership : <<"[{'riak-node2 at 64.137.247.82',12},\n {'riak-node5 at 64.137.217.73',12},\n {'riak-node1 at 64.137.190.244',12},\n {'riak-node3 at 64.137.162.64',12},\n {'riak-node4 at 64.137.161.229',16}]">>
> > > >> rings_reconciled : 0
> > > >> rings_reconciled_total : 31
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> * The riak client :
> > > >>
> > > >> + riak-erlang-client:  https://github.com/basho/riak-erlang-client
> > > >> + release :   2.4.2
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> * Riak client API used:
> > > >>
> > > >> + Insert/Update:
> > > >>
> > > >> riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key, riakc_map:to_op(Map), []).
> > > >>
> > > >> + Fetch :
> > > >>
> > > >> riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []).
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> * Steps to perform an update:
> > > >>
> > > >> - Fetch document
> > > >> - Update document
> > > >>
> > > >> -----------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> * Data returned from fetch_type:
> > > >>
> > > >> {map,  [{{<<"account_id">>,register}, <<"accounta25a424b8484181e8ba1bec25bf7c491">>},
> > > >> {{<<"created_by_id">>,register}, <<"accounta25a424b8484181e8ba1bec25bf7c491">>}, {{<<"created_time_dt">>,register},<<"2017-01-27T03:34:04Z">>}, {{<<"currency">>,register},<<"cad">>}, {{<<"end_time">>,register},<<"dont_use">>}, {{<<"id">>,register},<<"menufe89488afa948875cab6b0b18d579f21">>}, {{<<"maintain_mode_b">>,register},<<"false">>}, {{<<"menu_category_revision_id">>,register}, <<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}, {{<<"name">>,register},<<"fullmenu">>}, {{<<"order_i">>,register},<<"0">>}, {{<<"rest_location_p">>,register}, <<"10.844117421366443,106.63982392275398">>}, {{<<"restaurant_id">>,register}, <<"rest848e042b3a0488640981c8a6dc4a8281">>}, {{<<"restaurant_status_id">>,register},<<"inactive">>}, {{<<"start_time">>,register},<<"dont_use">>}, {{<<"status_id">>,register},<<"hide">>}, {{<<"updated_by_id">>,register}, <<"accounta25a424b8484181e8ba1bec25bf7c491">>}, {{<<"updated_time_dt">>,register},<<"2017-02-06T17:22:39Z">>}],
> > > >>  [],
> > > >>  [], <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,40,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,54,106>>}
> > > >>
> > > >>
> > > >> *  Update with update_type
> > > >>
> > > >> Below is the map data before applying riakc_map:to_op(Map):
> > > >>
> > > >> {map, [] ,
> > > >>  [{{<<"account_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_time_dt">>,register},{register,<<>>,<<"2017-01-27T03:34:04Z">>}},{{<<"currency">>,register},{register,<<>>,<<"cad">>}},{{<<"end_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"id">>,register},{register,<<>>,<<"menufe89488afa948875cab6b0b18d579f21">>}},{{<<"maintain_mode_b">>,register},{register,<<>>,<<"false">>}},{{<<"menu_category_revision_id">>,register},{register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},{{<<"name">>,register},{register,<<>>,<<"fullmenu">>}},{{<<"order_i">>,register},{register,<<>>,<<"0">>}},{{<<"rest_location_p">>,register},{register,<<>>,<<"10.844117421366443,106.63982392275398">>}},{{<<"restaurant_id">>,register},{register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},{{<<"restaurant_status_id">>,register},{register,<<>>,<<"inactive">>}},{{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"status_id">>,register},{register,<<>>,<<"show">>}},{{<<"updated_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"updated_time_dt">>,register},{register,<<>>,<<"2017-02-06T17:22:39Z">>}}],
> > > >>  [] ,  <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>
> > > >> }
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> -
> > > >>
> > > >> Best regards,
> > > >> Hue Tran
> > > >> <riak.conf>_______________________________________________
> > > >> riak-users mailing list
> > > >> riak-users at lists.basho.com
> > > >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > >
> > > >
> > >
> > >
> >
> >
> 
> 




