[Basho Riak] Fail To Update Document Repeatedly With Cluster of 5 Nodes

DeadZen deadzen at deadzen.com
Thu Feb 9 15:21:03 EST 2017


Why are they public?

On Thu, Feb 9, 2017 at 3:11 PM, Alexander Sicular <siculars at gmail.com> wrote:
> Speaking of timings:
>
> ring_members : ['riak-node1 at 64.137.190.244','riak-node2 at 64.137.247.82',
> 'riak-node3 at 64.137.162.64','riak-node4 at 64.137.161.229',
> 'riak-node5 at 64.137.217.73']
>
> Are these nodes in the same local area network?
>
> On Thu, Feb 9, 2017 at 12:49 PM, my hue <tranmyhue.grackle at gmail.com> wrote:
>> Dear Russell,
>>
>> I tried the simplest possible case: a new document, with modify_type used to
>> update a single register.
>> The update still fails some of the time.
>>
>> My steps were as follows:
>>
>> Step 1:  Initialize a new document Map.
>> Step 2:  Create the new map with:  riakc_pb_socket:update_type(Pid,
>> {BucketType, Bucket}, Key,  riakc_map:to_op(Map), []).
>> Step 3:  Fetch to check the result:
>> riakc_pb_socket:fetch_type(Pid,{BucketType,Bucket}, Key).
>> Step 4:  Create the funs passed to modify_type, each of which updates only one
>> field of the map:
>>
>> Fun1 = fun(OldMap) -> riakc_map:update({<<"status_id">>, register}, fun(R)
>> -> riakc_register:set(<<"show">>,  R) end, OldMap) end.
>>
>> Fun2 = fun(OldMap) -> riakc_map:update({<<"status_id">>, register}, fun(R)
>> -> riakc_register:set(<<"hide">>,  R) end, OldMap) end.
>>
>> Step 5: Update :
>>
>> riakc_pb_socket:modify_type(Pid, Fun1, {BucketType, Bucket}, Key, []).
>>
>> Fetch to check :
>>
>> riakc_pb_socket:fetch_type(Pid,{BucketType,Bucket}, Key).
>>
>> Step 6:  Update:
>>
>> riakc_pb_socket:modify_type(Pid, Fun2, {BucketType, Bucket}, Key, []).
>>
>> Fetch to check :
>>
>> riakc_pb_socket:fetch_type(Pid,{BucketType,Bucket}, Key).
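The update/fetch rounds in steps 5 and 6 could be automated roughly as follows. This is only a sketch: the module name toggle_check and its helper functions are made up for illustration, it assumes the riak-erlang-client (riakc) is on the code path and Pid is a connected riakc_pb_socket, and it assumes riakc_map:fetch/2 returns the plain register value for a fetched map.

```erlang
-module(toggle_check).
-export([run/4, expected/1, set_status/1]).

%% Build the fun handed to modify_type: it sets only the status_id register.
set_status(Value) ->
    fun(Map) ->
        riakc_map:update({<<"status_id">>, register},
                         fun(R) -> riakc_register:set(Value, R) end,
                         Map)
    end.

%% Value the register should hold after the I-th update (1-based):
%% odd rounds write <<"show">>, even rounds write <<"hide">>.
expected(I) when I rem 2 =:= 1 -> <<"show">>;
expected(_) -> <<"hide">>.

%% Run N update/fetch rounds against one key. Returns one tuple per round
%% in which the fetched value differs from what was just written, so an
%% empty list means every update was visible on the following fetch.
run(Pid, BucketAndType, Key, N) ->
    lists:filtermap(
      fun(I) ->
          Want = expected(I),
          Res = riakc_pb_socket:modify_type(Pid, set_status(Want),
                                            BucketAndType, Key, []),
          case riakc_pb_socket:fetch_type(Pid, BucketAndType, Key) of
              {ok, Map} ->
                  case riakc_map:fetch({<<"status_id">>, register}, Map) of
                      Want -> false;
                      Got  -> {true, {I, Want, Got, Res}}
                  end;
              Other ->
                  {true, {I, Want, Other, Res}}
          end
      end,
      lists:seq(1, N)).
```

Calling toggle_check:run(Pid, {BucketType, Bucket}, Key, 20) for a "failing" key and for a "passing" key would give a compact list of the rounds that went wrong, which may be easier to share than manual shell output.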
>>
>>
>> For debugging and testing, I repeated step 5 and step 6 on one document about
>> 20 times.
>> Across many documents I saw odd behaviour: some documents hit failed updates
>> and some never fail.
>> At first I thought the cause was the network, or a timeout between nodes, and
>> that the failures were random.  So I deleted the documents with the command:
>>
>>  riakc_pb_socket:delete(Pid, {BucketType,Bucket}, Key, []).
>>
>> Then I retested each document from the first test. Surprisingly, the
>> documents that failed in the first test failed again in the second test, and
>> the documents that passed in the first test passed again.
>> I deleted everything again and retested, and got the same result.
>>
>> After that I made another test case: I took one failing document and kept all
>> of its fields, changing only the key to get different documents for
>> debugging.  To my surprise I still got some failures and some successes,
>> although the documents have the same fields and values except for the key.
>> After deleting and retesting, the result was the same: documents that succeed
>> always succeed, and documents that fail always fail.  I still do not
>> understand the root cause, and I hope to get support and help from
>> the developers of Riak.  With this issue, my system mostly fails when it runs
>> against the cluster.
>>
>> The following are some of the map documents I used in the test.  I have also
>> attached the log extracted from each node at one of the failure times.
>> I do not fully understand the Riak log, but I hope it helps
>> the developers of Riak find something.
>>
>>
>>
>> * New document which fails with my steps:
>>
>> {map,[],
>>      [{{<<"account_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"created_by_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"created_time_dt">>,register},
>> {register,<<>>,<<"2017-02-7T23:49:04Z">>}},
>>       {{<<"currency">>,register}, {register,<<>>,<<"usd">>}},
>>
>> {{<<"id">>,register},{register,<<>>,<<"menu1234567812345678123456789">>}},
>>       {{<<"maintain_mode_b">>,register}, {register,<<>>,<<"false">>}},
>>       {{<<"menu_category_revision_id">>,register},
>> {register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},
>>       {{<<"name">>,register},{register,<<>>,<<"menutest">>}},
>>       {{<<"order_id">>,register},{register,<<>>,<<"0">>}},
>>       {{<<"rest_location_p">>,register},
>> {register,<<>>,<<"10.844117421366443,106.63982392275398">>}},
>>       {{<<"restaurant_id">>,register},
>> {register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},
>>       {{<<"restaurant_status_id">>,register}, {register,<<>>,<<"active">>}},
>>       {{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},
>>       {{<<"status_id">>,register},{register,<<>>,<<"show">>}},
>>       {{<<"updated_by_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"updated_time_dt">>,register},
>> {register,<<>>,<<"2017-02-7T23:49:04Z">>}}],
>>      [],undefined}.
>>
>> Key = <<"menu1234567812345678123456789">>
>>
>> * New document which always succeeds with my steps:
>>
>> {map,[],
>>      [{{<<"account_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"created_by_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"created_time_dt">>,register},
>> {register,<<>>,<<"2017-02-7T23:49:04Z">>}},
>>       {{<<"currency">>,register},{register,<<>>,<<"usd">>}},
>>
>> {{<<"id">>,register},{register,<<>>,<<"menub497380c19be4fd3a3b51c85d4e9f246">>}},
>>       {{<<"maintain_mode_b">>,register}, {register,<<>>,<<"false">>}},
>>       {{<<"menu_category_revision_id">>,register},
>> {register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},
>>       {{<<"name">>,register},{register,<<>>,<<"menutest">>}},
>>       {{<<"order_id">>,register},{register,<<>>,<<"0">>}},
>>       {{<<"rest_location_p">>,register},
>> {register,<<>>,<<"10.844117421366443,106.63982392275398">>}},
>>       {{<<"restaurant_id">>,register},
>> {register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},
>>       {{<<"restaurant_status_id">>,register}, {register,<<>>,<<"active">>}},
>>       {{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},
>>       {{<<"status_id">>,register},{register,<<>>,<<"show">>}},
>>       {{<<"updated_by_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"updated_time_dt">>,register},
>> {register,<<>>,<<"2017-02-7T23:49:04Z">>}}],
>>      [], undefined}.
>>
>>  Key = <<"menub497380c19be4fd3a3b51c85d4e9f246">>
>>
>> * New document which fails with my steps:
>>
>> {map,[],
>>      [{{<<"account_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"created_by_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"created_time_dt">>,register},
>> {register,<<>>,<<"2017-02-7T23:49:04Z">>}},
>>       {{<<"currency">>,register},{register,<<>>,<<"usd">>}},
>>
>> {{<<"id">>,register},{register,<<>>,<<"menufe89488afa948875cab6b0b18d579f22">>}},
>>       {{<<"maintain_mode_b">>,register},  {register,<<>>,<<"false">>}},
>>       {{<<"menu_category_revision_id">>,register},
>> {register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},
>>       {{<<"name">>,register},{register,<<>>,<<"menutest">>}},
>>       {{<<"order_id">>,register},{register,<<>>,<<"0">>}},
>>       {{<<"rest_location_p">>,register},
>> {register,<<>>,<<"10.844117421366443,106.63982392275398">>}},
>>       {{<<"restaurant_id">>,register},
>> {register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},
>>       {{<<"restaurant_status_id">>,register}, {register,<<>>,<<"active">>}},
>>       {{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},
>>       {{<<"status_id">>,register},{register,<<>>,<<"show">>}},
>>       {{<<"updated_by_id">>,register},
>> {register,<<>>,<<"accountqweraccountqweraccountqwer">>}},
>>       {{<<"updated_time_dt">>,register},
>> {register,<<>>,<<"2017-02-7T23:49:04Z">>}}],
>>      [],undefined}.
>>
>> Key = <<"menufe89488afa948875cab6b0b18d579f22">>.
>>
>> Note: All documents are the same except for the key, and were tested with the
>> same bucket type and bucket.  The bucket type and bucket have the properties
>> I reported in my first email. As a reminder, below is a description of the
>> bucket type, bucket and cluster:
>>
>> * Bucket Type :
>>
>> - Bucket type created with the following command:
>>
>> riak-admin bucket-type create bucket_type_name
>> '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
>>
>> riak-admin bucket-type activate bucket_type_name
>>
>>
>> * Bucket Property:
>>
>> {"props":{"name":"bucket_name","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1 at 64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"bucket_name","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
>>
>> Note :
>> + "datatype":"map"
>> + "last_write_wins": false
>> + "dvv_enabled": true
>> + "allow_mult": true
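With n_val 3 and r, w and dw all set to quorum in the bucket properties above, each read and write waits for a majority of the three replicas. A small sketch of that arithmetic follows; the module and function names are illustrative and are not part of Riak.

```erlang
-module(quorum_sketch).
-export([quorum/1, resolve/2]).

%% A quorum is a majority of the N replicas: floor(N / 2) + 1.
quorum(N) when is_integer(N), N > 0 ->
    N div 2 + 1.

%% Turn a symbolic or integer r/w/dw/pr/pw setting into a replica count.
resolve(quorum, N) -> quorum(N);
resolve(all, N) -> N;
resolve(one, _N) -> 1;
resolve(V, _N) when is_integer(V) -> V.
```

For the properties quoted here, quorum_sketch:resolve(quorum, 3) gives 2, while pr and pw are 0, so no primary replicas are required to answer.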
>>
>>
>> * Cluster Info :
>>
>> - Member status :
>>
>> >> riak-admin member-status
>>
>> ================================= Membership ==================================
>> Status     Ring    Pending    Node
>> -------------------------------------------------------------------------------
>> valid      18.8%      --      'riak-node1 at 64.137.190.244'
>> valid      18.8%      --      'riak-node2 at 64.137.247.82'
>> valid      18.8%      --      'riak-node3 at 64.137.162.64'
>> valid      25.0%      --      'riak-node4 at 64.137.161.229'
>> valid      18.8%      --      'riak-node5 at 64.137.217.73'
>> -------------------------------------------------------------------------------
>> Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>>
>> -----------------------------------------------------------------------------------------------------------------------------
>>
>> - Ring
>>
>> >> riak-admin status | grep ring
>>
>> ring_creation_size : 64
>> ring_members : ['riak-node1 at 64.137.190.244','riak-node2 at 64.137.247.82',
>> 'riak-node3 at 64.137.162.64','riak-node4 at 64.137.161.229',
>> 'riak-node5 at 64.137.217.73']
>> ring_num_partitions : 64
>> ring_ownership : <<"[{'riak-node2 at 64.137.247.82',12},\n
>> {'riak-node5 at 64.137.217.73',12},\n {'riak-node1 at 64.137.190.244',12},\n
>> {'riak-node3 at 64.137.162.64',12},\n {'riak-node4 at 64.137.161.229',16}]">>
>> rings_reconciled : 0
>> rings_reconciled_total : 31
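The percentages in the member-status table follow directly from the ring_ownership counts: 64 partitions, of which four nodes own 12 each and riak-node4 owns 16. A minimal sketch of that calculation (module name is illustrative only):

```erlang
-module(ring_share).
-export([percent/2, shares/1]).

%% Share of the ring, in percent, for a node owning Owned of Total partitions.
percent(Owned, Total) when Total > 0 ->
    Owned * 100 / Total.

%% Ownership is a list of {Node, PartitionCount} pairs, as in ring_ownership.
shares(Ownership) ->
    Total = lists:sum([Count || {_Node, Count} <- Ownership]),
    [{Node, percent(Count, Total)} || {Node, Count} <- Ownership].
```

12 of 64 partitions is 18.75%, which member-status shows rounded as 18.8%, and 16 of 64 is 25.0%.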
>>
>>
>> On Tue, Feb 7, 2017 at 5:37 PM, Russell Brown <russell.brown at mac.com> wrote:
>>>
>>>
>>> On 7 Feb 2017, at 10:27, my hue <tranmyhue.grackle at gmail.com> wrote:
>>>
>>> > Dear Russell,
>>> >
>>> > Yes, I updated all registers in one go.
>>> > I have not yet tried updating a single register at a time.
>>> > Let me try that.  But I wonder: does updating everything in one go
>>> > affect how the Riak cluster
>>> > resolves conflicts?
>>> >
>>>
>>> Just trying to make the search space as small as possible. I don’t think
>>> _any_ of this should fail. The maps code is very well tested and well used,
>>> so it’s all kind of odd.
>>>
>>> Without hands on it’s hard to debug, and email back and forth is slow, so
>>> if you try the simplest possible thing and that still fails, it helps.
>>>
>>> IMO the simplest possible thing is to start with a new, empty key and use
>>> modify_type to update a single register.
>>>
>>> Many thanks
>>>
>>> Russell
>>>
>>> >
>>> >
>>> > On Tue, Feb 7, 2017 at 5:18 PM, Russell Brown <russell.brown at mac.com>
>>> > wrote:
>>> > So you’re updating all those registers in one go? Out of interest,
>>> > what happens if you update a single register at a time?
>>> >
>>> > On 7 Feb 2017, at 10:02, my hue <tranmyhue.grackle at gmail.com> wrote:
>>> >
>>> > > Dear Russell,
>>> > >
>>> > > > Can you run riakc_map:to_op(Map). and show me the output of that,
>>> > > > please?
>>> > >
>>> > > The following is output of riakc_map:to_op(Map) :
>>> > >
>>> > > {map, {update, [{update,
>>> > > {<<"updated_time_dt">>,register},{assign,<<"2017-02-06T17:22:39Z">>}},
>>> > > {update,{<<"updated_by_id">>,register},
>>> > > {assign,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{update,{<<"status_id">>,register},{assign,<<"show">>}},{update,{<<"start_time">>,register},{assign,<<"dont_use">>}},{update,{<<"restaurant_status_id">>,register},
>>> > > {assign,<<"inactive">>}}, {update,{<<"restaurant_id">>,register},
>>> > > {assign,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},{update,{<<"rest_location_p">>,register},
>>> > > {assign,<<"10.844117421366443,106.63982392275398">>}},
>>> > > {update,{<<"order_i">>,register},{assign,<<"0">>}},
>>> > > {update,{<<"name">>,register},{assign,<<"fullmenu">>}},
>>> > > {update,{<<"menu_category_revision_id">>,register},
>>> > > {assign,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},
>>> > > {update,{<<"maintain_mode_b">>,register},{assign,<<"false">>}},
>>> > > {update,{<<"id">>,register},
>>> > > {assign,<<"menufe89488afa948875cab6b0b18d579f21">>}},
>>> > > {update,{<<"end_time">>,register},{assign,<<"dont_use">>}},{update,{<<"currency">>,register},{assign,<<"cad">>}},
>>> > > {update,{<<"created_time_dt">>,register},
>>> > > {assign,<<"2017-01-27T03:34:04Z">>}},
>>> > > {update,{<<"created_by_id">>,register},
>>> > > {assign,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},
>>> > > {update,{<<"account_id">>,register},
>>> > > {assign,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}}]},
>>> > > <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>}
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > On Tue, Feb 7, 2017 at 4:36 PM, Russell Brown <russell.brown at mac.com>
>>> > > wrote:
>>> > >
>>> > > On 7 Feb 2017, at 09:34, my hue <tranmyhue.grackle at gmail.com> wrote:
>>> > >
>>> > > > Dear Russell,
>>> > > >
>>> > > > >What operation are you performing? What is the update you perform?
>>> > > > > Do you set a register value, add a register, remove a register?
>>> > > >
>>> > > > I used riakc_map:update to update value with map. I do the following
>>> > > > steps :
>>> > > >
>>> > > > - Get FetchData map with  fetch_type
>>> > > > - Extract key, value, context from FetchData
>>> > > > - Obtain UpdateData with:
>>> > > >
>>> > > > + Init map with context
>>> > >
>>> > > I don’t understand this step
>>> > >
>>> > > > + Use :
>>> > > >
>>> > > >    riakc_map:update({K, register}, fun(R) -> riakc_register:set(V,
>>> > > > R) end,  InitMap)
>>> > > >
>>> > > > to obtain UpdateData
>>> > > >
>>> > > > Note:
>>> > > > K : key
>>> > > > V:  value
>>> > > >
>>> > > > - Then  update UpdateData with update_type
>>> > > >
>>> > >
>>> > > Can you run riakc_map:to_op(Map). and show me the output of that,
>>> > > please?
>>> > >
>>> > > > The following is sample about Update data :
>>> > > >
>>> > > > {map, [] ,
>>> > > >
>>> > > > [{{<<"account_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_time_dt">>,register},{register,<<>>,<<"2017-01-27T03:34:04Z">>}},{{<<"currency">>,register},{register,<<>>,<<"cad">>}},{{<<"end_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"id">>,register},{register,<<>>,<<"menufe89488afa948875cab6b0b18d579f21">>}},{{<<"maintain_mode_b">>,register},{register,<<>>,<<"false">>}},{{<<"menu_category_revision_id">>,register},{register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},{{<<"name">>,register},{register,<<>>,<<"fullmenu">>}},{{<<"order_i">>,register},{register,<<>>,<<"0">>}},{{<<"rest_location_p">>,register},{register,<<>>,<<"10.844117421366443,106.63982392275398">>}},{{<<"restaurant_id">>,register},{register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},{{<<"restaurant_status_id">>,register},{register,<<>>,<<"inactive">>}},{{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"status_id">>,register},{register,<<>>,<<"show">>}},{{<<"updated_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"updated_time_dt">>,register},{register,<<>>,<<"2017-02-06T17:22:39Z">>}}],
>>> > > >  [] ,
>>> > > > <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>
>>> > > > }
>>> > > >
>>> > > >
>>> > > > On Tue, Feb 7, 2017 at 3:43 PM, Russell Brown
>>> > > > <russell.brown at mac.com> wrote:
>>> > > >
>>> > > > On 7 Feb 2017, at 08:17, my hue <tranmyhue.grackle at gmail.com> wrote:
>>> > > >
>>> > > > > Dear John and Russell Brown,
>>> > > > >
>>> > > > > * How fast is your turnaround time between an update and a fetch?
>>> > > > >
>>> > > > > The turnaround time between an update and a fetch is about 1 second.
>>> > > > > While debugging, my team and I adjusted haproxy for the following
>>> > > > > scenarios:
>>> > > > >
>>> > > > > Scenario 1: round robin across the 5 nodes of the cluster.
>>> > > > >
>>> > > > > We hit the issue in scenario 1 and were afraid that a timeout could
>>> > > > > occur between nodes,
>>> > > > > leaving us with stale data. So we then ran scenario 2.
>>> > > > >
>>> > > > > Scenario 2: Disable round robin and route requests only to node 1.
>>> > > > > The cluster still has 5 nodes.
>>> > > > > In this case we ensure that update and fetch requests always go
>>> > > > > to and come from node 1.
>>> > > > > And the issue still occurs.
>>> > > > >
>>> > > > > At the time of failure, I hoped to get an error log from the Riak
>>> > > > > nodes to give me some information.
>>> > > > > But the Riak log shows nothing, as if everything is OK.
>>> > > > >
>>> > > > > * What operation are you performing?
>>> > > > >
>>> > > > > I used :
>>> > > > >
>>> > > > > riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key,
>>> > > > > riakc_map:to_op(Map), []).
>>> > > > > riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []).
>>> > > >
>>> > > > What operation are you performing? What is the update you perform?
>>> > > > Do you set a register value, add a register, remove a register?
>>> > > > >
>>> > > > > * It looks like the map is a single level map of last-write-wins
>>> > > > > registers. Is there a chance that the time on the node handling the update
>>> > > > > is behind the value in the lww-register?
>>> > > > >
>>> > > > > => I am not sure how a Riak node resolves conflicts internally.
>>> > > > > And the issue never happens if I use a single node.
>>> > > > > My bucket properties are as follows:
>>> > > > >
>>> > > > >
>>> > > > > {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1 at 64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
>>> > > > >
>>> > > > > Note :
>>> > > > > + "datatype":"map"
>>> > > > > + "last_write_wins": false
>>> > > > > + "dvv_enabled": true
>>> > > > > + "allow_mult": true
>>> > > > >
>>> > > > >
>>> > > > > * Have you tried using the `modify_type` operation in
>>> > > > > riakc_pb_socket which does the fetch/update operation in sequence for you?
>>> > > > >
>>> > > > > => I have not used it yet, but my actions are a sequence of fetch and
>>> > > > > then update.  I may try modify_type to see.
>>> > > > >
>>> > > > > * Anything in the error logs on any of the nodes?
>>> > > > >
>>> > > > > => In the node logs, no error was reported at the time of failure.
>>> > > > >
>>> > > > > * Is the opaque context identical from the fetch and then later
>>> > > > > after the update?
>>> > > > >
>>> > > > > => The context comes from the fetch, and that same context is used
>>> > > > > with the update.
>>> > > > > During debugging with the sequence fetch, update, fetch,
>>> > > > > update, ... the context I saw in the fetched data was always
>>> > > > > the same.
>>> > > > >
>>> > > > > Best regards,
>>> > > > > Hue Tran
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > On Tue, Feb 7, 2017 at 2:11 AM, John Daily <jdaily at basho.com>
>>> > > > > wrote:
>>> > > > > Originally I suspected the context which allows Riak to resolve
>>> > > > > conflicts was not present in your data, but I see it in your map structure.
>>> > > > > Thanks for supplying such a detailed description.
>>> > > > >
>>> > > > > How fast is your turnaround time between an update and a fetch?
>>> > > > > Even if the cluster is healthy it’s not impossible to see a timeout between
>>> > > > > nodes, which could result in a stale retrieval. Have you verified whether
>>> > > > > the stale data persists?
>>> > > > >
>>> > > > > A single node cluster gives an advantage that you’ll never see in
>>> > > > > a real cluster: a perfectly synchronized clock. It also reduces (but does
>>> > > > > not completely eliminate) the possibility of an internal timeout between
>>> > > > > processes.
>>> > > > >
>>> > > > > -John
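John's point about clocks, together with the question in this thread about the handling node's time being behind the value already in the lww-register, can be illustrated with a toy last-write-wins register. This is a deliberately simplified model and not Riak's actual register implementation: the module name, the {Value, Timestamp} representation and the tie-breaking rule are all assumptions made for the example.

```erlang
-module(lww_sketch).
-export([new/0, assign/3, merge/2, value/1]).

%% Toy register: {Value, Timestamp}. The timestamp stands for the clock
%% of whichever node handled the write.
new() -> {<<>>, 0}.

%% An assignment only takes effect if its timestamp is newer than the
%% timestamp already stored.
assign(Value, Ts, {_OldValue, OldTs}) when Ts > OldTs -> {Value, Ts};
assign(_Value, _Ts, Reg) -> Reg.

%% Merging two replicas keeps the entry with the higher timestamp.
%% Equal timestamps fall back to term order so the result is deterministic.
merge({_, T1} = A, {_, T2}) when T1 > T2 -> A;
merge({_, T1}, {_, T2} = B) when T2 > T1 -> B;
merge(A, B) -> max(A, B).

value({Value, _Ts}) -> Value.
```

In this model, if <<"show">> is written with timestamp 1000 and a later <<"hide">> is handled by a node whose clock reads 990, the register still reads <<"show">>: the second write is accepted without error but never becomes visible. Whether that is what happens in this cluster is not established here; it is one hypothesis consistent with the questions asked in the thread.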
>>> > > > >
>>> > > > >> On Feb 6, 2017, at 1:02 PM, my hue <tranmyhue.grackle at gmail.com>
>>> > > > >> wrote:
>>> > > > >>
>>> > > > >> Dear Riak Team,
>>> > > > >>
>>> > > > >> My team and I use Riak as the database for our production system,
>>> > > > >> with a cluster of 5 nodes.
>>> > > > >> In production we have hit a critical bug: updating a document
>>> > > > >> sometimes fails.
>>> > > > >> My colleagues and I debugged it and reproduced the issue with
>>> > > > >> the following scenario:
>>> > > > >>
>>> > > > >> +  fetch document
>>> > > > >> +  change value of document
>>> > > > >> +  update document
>>> > > > >>
>>> > > > >> Repeat this about 10 times and a failure will occur. When a document
>>> > > > >> is updated continually,
>>> > > > >> an update sometimes fails.
>>> > > > >>
>>> > > > >> At first, the 5 nodes of the cluster ran Riak version 2.1.1.
>>> > > > >> After hitting the above bug, we upgraded to Riak version 2.2.0 and
>>> > > > >> the issue still occurs.
>>> > > > >>
>>> > > > >> Over many test runs, we debugged using tcpdump on the Riak node:
>>> > > > >>
>>> > > > >> tcpdump -A -ttt  -i {interface} src host {host} and dst port
>>> > > > >> {port}
>>> > > > >>
>>> > > > >> And together with the command:
>>> > > > >>
>>> > > > >> riak-admin status | grep "node_puts_map\|node_puts_map_total\|node_puts_total\|vnode_map_update_total\|vnode_puts_total"
>>> > > > >>
>>> > > > >> we found that the Riak server does receive the update request.
>>> > > > >> However, we do not know why the Riak backend fails to update the
>>> > > > >> document.
>>> > > > >> At the time of failure, the Riak server log shows everything is OK.
>>> > > > >>
>>> > > > >> Then we removed the cluster and used a single Riak server, and saw
>>> > > > >> that the above bug never happens.
>>> > > > >>
>>> > > > >> For that reason, we think it only happens when running as a cluster.
>>> > > > >> We researched the Basho Riak documentation, and our Riak configuration
>>> > > > >> seems to match what the documentation suggests.  We are totally
>>> > > > >> blocked on this issue and hope to get support from you
>>> > > > >> so that we can get stable operation from the Riak database in
>>> > > > >> production.
>>> > > > >> Thank you so much.  We hope to get your reply soon.
>>> > > > >>
>>> > > > >>
>>> > > > >> * The following is our riak node information :
>>> > > > >>
>>> > > > >> Riak version:  2.2.0
>>> > > > >> OS :  CentOS Linux release 7.2.1511
>>> > > > >> Cpu :  4 core
>>> > > > >> Memory : 4G
>>> > > > >> Riak configure : the attached file "riak.conf"
>>> > > > >>
>>> > > > >> Note :
>>> > > > >>
>>> > > > >> - We mostly use the default Riak configuration, except that
>>> > > > >> the storage backend is multi:
>>> > > > >>
>>> > > > >> storage_backend = multi
>>> > > > >> multi_backend.bitcask_mult.storage_backend = bitcask
>>> > > > >> multi_backend.bitcask_mult.bitcask.data_root =
>>> > > > >> /var/lib/riak/bitcask_mult
>>> > > > >> multi_backend.default = bitcask_mult
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> - Bucket type created with the following command:
>>> > > > >>
>>> > > > >> riak-admin bucket-type create dev_restor
>>> > > > >> '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
>>> > > > >> riak-admin bucket-type activate dev_restor
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> - Bucket Type Status :
>>> > > > >>
>>> > > > >> >> riak-admin bucket-type status dev_restor
>>> > > > >>
>>> > > > >> dev_restor is active
>>> > > > >> young_vclock: 20
>>> > > > >> w: quorum
>>> > > > >> small_vclock: 50
>>> > > > >> rw: quorum
>>> > > > >> r: quorum
>>> > > > >> pw: 0
>>> > > > >> precommit: []
>>> > > > >> pr: 0
>>> > > > >> postcommit: []
>>> > > > >> old_vclock: 86400
>>> > > > >> notfound_ok: true
>>> > > > >> n_val: 3
>>> > > > >> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
>>> > > > >> last_write_wins: false
>>> > > > >> dw: quorum
>>> > > > >> dvv_enabled: true
>>> > > > >> chash_keyfun: {riak_core_util,chash_std_keyfun}
>>> > > > >> big_vclock: 50
>>> > > > >> basic_quorum: false
>>> > > > >> backend: <<"bitcask_mult">>
>>> > > > >> allow_mult: true
>>> > > > >> datatype: map
>>> > > > >> active: true
>>> > > > >> claimant: 'riak-node1 at 64.137.190.244'
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> - Bucket Property :
>>> > > > >>
>>> > > > >>
>>> > > > >> {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1 at 64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> - Member status :
>>> > > > >>
>>> > > > >> >> riak-admin member-status
>>> > > > >>
>>> > > > >> ================================= Membership
>>> > > > >> ==================================
>>> > > > >> Status     Ring    Pending    Node
>>> > > > >>
>>> > > > >> -------------------------------------------------------------------------------
>>> > > > >> valid      18.8%      --      'riak-node1 at 64.137.190.244'
>>> > > > >> valid      18.8%      --      'riak-node2 at 64.137.247.82'
>>> > > > >> valid      18.8%      --      'riak-node3 at 64.137.162.64'
>>> > > > >> valid      25.0%      --      'riak-node4 at 64.137.161.229'
>>> > > > >> valid      18.8%      --      'riak-node5 at 64.137.217.73'
>>> > > > >>
>>> > > > >> -------------------------------------------------------------------------------
>>> > > > >> Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> - Ring
>>> > > > >>
>>> > > > >> >> riak-admin status | grep ring
>>> > > > >>
>>> > > > >> ring_creation_size : 64
>>> > > > >> ring_members :
>>> > > > >> ['riak-node1 at 64.137.190.244','riak-node2 at 64.137.247.82',
>>> > > > >> 'riak-node3 at 64.137.162.64','riak-node4 at 64.137.161.229',
>>> > > > >> 'riak-node5 at 64.137.217.73']
>>> > > > >> ring_num_partitions : 64
>>> > > > >> ring_ownership : <<"[{'riak-node2 at 64.137.247.82',12},\n
>>> > > > >> {'riak-node5 at 64.137.217.73',12},\n {'riak-node1 at 64.137.190.244',12},\n
>>> > > > >> {'riak-node3 at 64.137.162.64',12},\n {'riak-node4 at 64.137.161.229',16}]">>
>>> > > > >> rings_reconciled : 0
>>> > > > >> rings_reconciled_total : 31
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> * The riak client :
>>> > > > >>
>>> > > > >> + riak-erlang-client:
>>> > > > >> https://github.com/basho/riak-erlang-client
>>> > > > >> + release :   2.4.2
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> * Riak client API used:
>>> > > > >>
>>> > > > >> + Insert/Update:
>>> > > > >>
>>> > > > >> riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key,
>>> > > > >> riakc_map:to_op(Map), []).
>>> > > > >>
>>> > > > >> + Fetch :
>>> > > > >>
>>> > > > >> riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []).
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> * Step to perform an  update :
>>> > > > >>
>>> > > > >> - Fetch document
>>> > > > >> - Update document
>>> > > > >>
>>> > > > >>
>>> > > > >> -----------------------------------------------------------------------------------------------------------------------------
>>> > > > >>
>>> > > > >> *  Data got from fetch_type:
>>> > > > >>
>>> > > > >> {map,  [{{<<"account_id">>,register},
>>> > > > >> <<"accounta25a424b8484181e8ba1bec25bf7c491">>},
>>> > > > >> {{<<"created_by_id">>,register},
>>> > > > >> <<"accounta25a424b8484181e8ba1bec25bf7c491">>},
>>> > > > >> {{<<"created_time_dt">>,register},<<"2017-01-27T03:34:04Z">>},
>>> > > > >> {{<<"currency">>,register},<<"cad">>},
>>> > > > >> {{<<"end_time">>,register},<<"dont_use">>},
>>> > > > >> {{<<"id">>,register},<<"menufe89488afa948875cab6b0b18d579f21">>},
>>> > > > >> {{<<"maintain_mode_b">>,register},<<"false">>},
>>> > > > >> {{<<"menu_category_revision_id">>,register},
>>> > > > >> <<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>},
>>> > > > >> {{<<"name">>,register},<<"fullmenu">>}, {{<<"order_i">>,register},<<"0">>},
>>> > > > >> {{<<"rest_location_p">>,register},
>>> > > > >> <<"10.844117421366443,106.63982392275398">>},
>>> > > > >> {{<<"restaurant_id">>,register},
>>> > > > >> <<"rest848e042b3a0488640981c8a6dc4a8281">>},
>>> > > > >> {{<<"restaurant_status_id">>,register},<<"inactive">>},
>>> > > > >> {{<<"start_time">>,register},<<"dont_use">>},
>>> > > > >> {{<<"status_id">>,register},<<"hide">>}, {{<<"updated_by_id">>,register},
>>> > > > >> <<"accounta25a424b8484181e8ba1bec25bf7c491">>},
>>> > > > >> {{<<"updated_time_dt">>,register},<<"2017-02-06T17:22:39Z">>}],
>>> > > > >>  [],
>>> > > > >>  [],
>>> > > > >> <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,40,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,54,106>>}
>>> > > > >>
>>> > > > >>
>>> > > > >> *  Update with update_type
>>> > > > >>
>>> > > > >> Below is Map data before using riakc_map:to_op(Map) :
>>> > > > >>
>>> > > > >> {map, [] ,
>>> > > > >>
>>> > > > >> [{{<<"account_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"created_time_dt">>,register},{register,<<>>,<<"2017-01-27T03:34:04Z">>}},{{<<"currency">>,register},{register,<<>>,<<"cad">>}},{{<<"end_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"id">>,register},{register,<<>>,<<"menufe89488afa948875cab6b0b18d579f21">>}},{{<<"maintain_mode_b">>,register},{register,<<>>,<<"false">>}},{{<<"menu_category_revision_id">>,register},{register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},{{<<"name">>,register},{register,<<>>,<<"fullmenu">>}},{{<<"order_i">>,register},{register,<<>>,<<"0">>}},{{<<"rest_location_p">>,register},{register,<<>>,<<"10.844117421366443,106.63982392275398">>}},{{<<"restaurant_id">>,register},{register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},{{<<"restaurant_status_id">>,register},{register,<<>>,<<"inactive">>}},{{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},{{<<"status_id">>,register},{register,<<>>,<<"show">>}},{{<<"updated_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},{{<<"updated_time_dt">>,register},{register,<<>>,<<"2017-02-06T17:22:39Z">>}}],
>>> > > > >>  [] ,
>>> > > > >> <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>
>>> > > > >> }
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> -
>>> > > > >>
>>> > > > >> Best regards,
>>> > > > >> Hue Tran
>>> > > > >> <riak.conf>
>>> > > > >> _______________________________________________
>>> > > > >> riak-users mailing list
>>> > > > >> riak-users at lists.basho.com
>>> > > > >>
>>> > > > >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
