Siblings on first write to a key

Daniel Abrahamsson hamsson at gmail.com
Tue Apr 18 08:28:21 EDT 2017


Hi Magnus,

This cluster has been running in production for a few months. Key
generation is based on flake (https://github.com/boundary/flake); we
have never experienced a collision in the 3+ years we have been using
it heavily in production. However, I will look into that possibility
as well.

I just noticed that one of the Riak nodes logged this at the time:

2017-04-13 17:42:40.567 [error]
<0.3624.28>@riak_api_pb_server:handle_info:331 Unrecognized message
{30320806,{ok,{r_object,<<"session">>,<<".12011742tWzDvu8mk5WAdfYihfV_T3DcnJ5VDyXC0c">>,[{r_content,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<"X-Riak-VTag">>,53,114,86,115,108,71,120,112,73,55,108,118,114,100,105,114,107,104,50,66,105,119]],[[<<"index">>]],[],[[<<"X-Riak-Last-Modified">>|{1492,105357,453143}]],[],[]}}},<<...
(actual value removed).

I also have another example (from the same cluster) where there is a
*single* writer to a key, but after a few writes/updates, it also got
a sibling error. Also at that time, the write+read took significantly
longer than normal. I'll check if we had any "unrecognized messages"
in the Riak logs at that time as well.
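
For reference, here is a minimal Python sketch of how a write that
carries no vclock can end up as a sibling when the key already holds a
value, for example if an earlier attempt had in fact been stored before
a slow or lost response triggered a retry. This is a toy model of
vector-clock comparison, not Riak's actual implementation; the names
descends, put, node_a and node_b are made up for illustration, and it
assumes siblings are allowed on the bucket.

```python
# Toy model of vector-clock sibling handling. NOT Riak's real code:
# all names here are hypothetical and chosen for illustration only.

def descends(a, b):
    """Return True if vclock `a` has seen every event recorded in `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def put(stored, value, context, actor):
    """Apply a write to `stored`, a list of (vclock, value) pairs.

    `context` is the vclock the client sent, or None if it sent none.
    `actor` is the (hypothetical) coordinating node for this write.
    Returns the new list of values held for the key.
    """
    ctx = dict(context or {})
    # Stored values the client's context has not seen are kept as siblings.
    survivors = [(vc, v) for vc, v in stored if not descends(ctx, vc)]
    # Advance this actor's counter past anything already recorded for it.
    seen = max([ctx.get(actor, 0)] + [vc.get(actor, 0) for vc, _ in stored])
    new_vc = dict(ctx)
    new_vc[actor] = seen + 1
    return survivors + [(new_vc, value)]


# First write to an empty key: a single value, no siblings.
first = put([], "v1", None, "node_a")

# The same write retried without a vclock, coordinated elsewhere:
# the empty context has not seen the stored value, so both are kept.
retried = put(first, "v1", None, "node_b")

# A write that sends back the vclock it read replaces the old value.
updated = put(first, "v2", first[0][0], "node_a")

print(len(first), len(retried), len(updated))
```

In this model the only difference between the retried write and the
normal update is whether the client sends back the vclock it last
read. Whether that is what happened here is an open question.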

To answer your second question, we are talking to the Riak cluster
over protocol buffers, using the official Erlang client.

//Daniel

On Tue, Apr 18, 2017 at 1:51 PM, Magnus Kessler <mkessler at basho.com> wrote:
> On 18 April 2017 at 08:20, Daniel Abrahamsson <hamsson at gmail.com> wrote:
>>
>> I've run into a case where I got a sibling error/response on the first
>> ever write to a key. I would like to understand how this could happen.
>> Normally when you get siblings, it is because you have written a value
>> with an out-of-date vclock. But since this is the first write, there
>> is no vclock. Could someone shed some light on this for me?
>>
>> It is worth mentioning that it took 3 seconds for Riak to deliver
>> the response, so it is possible there was some kind of network issue
>> at the time.
>>
>> Here are some details about my setup:
>> Number of nodes: 8.
>> n_val: 5
>> write options: pw: 3 (quorum), return_body
>>
>> Regards,
>> Daniel Abrahamsson
>>
>
>
> Hi Daniel,
>
> Please let me know if all nodes in this cluster were set up completely
> fresh, with empty backend directories, or if any of them had been used
> before for a Riak installation. If the latter is the case, it may be that
> the key in question had already been used once before. Cluster nodes pick up
> data from pre-existing backends.
>
> How do you access the key for read and write operations?
>
> Kind Regards,
>
> Magnus
>
>
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
>
> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431



