Riak Adoption - What can we do better?

Aphyr aphyr at aphyr.com
Sat Apr 21 12:28:57 EDT 2012


On 04/21/2012 09:07 AM, Les Mikesell wrote:
> On Fri, Apr 20, 2012 at 5:00 PM, Kyle Kingsbury<aphyr at aphyr.com>  wrote:
>>>
>> OK, so how about Statebox? We use timestamps to ameliorate the GC problem so
>> long as a given time window. Our hosts are running NTP so it's all cool, ya?
>> Wrong. One of your hosts is not running NTP. Clock desync issues are fucking
>> *ubiquitous*, sadly, and you have to be willing to accept, say, losing all
>> conflicting writes from a client under some clock skew circumstances.
>
> How hard is it for a cluster-aware application to tell that the clocks
> are out of sync?   You probably can't do better than NTP at fixing it,
> but why even continue to run in that state?   If all it takes is a
> good clock for reliability, let's build good clocks.

You are a network packet. It is very dark. You are likely to be eaten by 
a partition.

Joking aside, many applications do try to detect broken clocks, but 
correcting the error automatically depends on application semantics. On 
top of that, skew can be impossible to detect during a partition. 
There are also cases where you *want* to stay available even though 
time sync is impossible; consider, for example, mobile clients which may 
make requests long after the user interaction. There, *logical* 
consistency is paramount... but I digress, haha.

It is *OK* to accept the clock issue sometimes, especially with the 
understanding that it provides probabilistic constraints on conflict 
resolution. Being right 80% of the time can be better than 50% of the 
time. But you *have* to be willing to understand and accept the risks, 
and most people have no idea those risks exist.
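
To make the risk concrete, here's a minimal sketch of timestamp-based 
(last-write-wins) resolution under clock skew. This is not Riak's or 
Statebox's actual code; Write, lww_merge, and simulate are made-up names 
for illustration, and the times are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Write:
    value: str
    timestamp: float  # wall-clock seconds as reported by the writer's host


def lww_merge(a: Write, b: Write) -> Write:
    """Keep whichever write carries the larger timestamp (ties keep a)."""
    return a if a.timestamp >= b.timestamp else b


def simulate(skew_seconds: float) -> str:
    """Host A's clock is correct; host B's clock runs behind by skew_seconds."""
    true_time_a = 100.0
    true_time_b = 105.0  # B really writes 5 seconds after A
    first = Write("from-A", true_time_a)
    second = Write("from-B", true_time_b - skew_seconds)
    return lww_merge(first, second).value


if __name__ == "__main__":
    print(simulate(0.0))   # clocks agree: the later write (B) wins
    print(simulate(30.0))  # B is 30s behind: B's later write is silently dropped
```

With no skew the later write wins, as you'd expect. With B's clock 30 
seconds behind, every conflicting write from B loses to A no matter 
when it actually happened, and nothing in the merge can tell.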

That's what I think Riak adoption is missing: a huge portion of devs 
just don't understand the implications of Riak's HA approach: logical 
clocks, causality bounds, and synchronization boundaries. I'm hoping to 
change some of that with Meangirls, but I'm not convinced it's enough. 
Reid Draper told me a little while ago that he wanted devs to extend 
CRDTs and logical clocks all the way into their APIs and mobile clients, 
and I think he might be right.
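
For anyone who hasn't seen a logical clock up close, here's a rough 
sketch of vector clock comparison. It's a toy, not Riak's vclock 
implementation; increment, descends, compare, and the actor names are 
assumptions made for the example.

```python
from typing import Dict

VClock = Dict[str, int]  # actor id -> number of updates seen from that actor


def increment(clock: VClock, actor: str) -> VClock:
    """Return a copy of clock with actor's counter bumped by one."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new


def descends(a: VClock, b: VClock) -> bool:
    """True if a has seen every update that b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: VClock, b: VClock) -> str:
    """Classify the causal relationship between two clocks."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "descends"
    if b_ge:
        return "precedes"
    return "concurrent"  # neither saw the other: a real conflict


if __name__ == "__main__":
    base = increment({}, "phone")
    from_phone = increment(base, "phone")
    from_laptop = increment(base, "laptop")
    print(compare(from_phone, base))         # descends
    print(compare(from_phone, from_laptop))  # concurrent
```

No wall clocks involved: the phone and laptop updates come out as 
concurrent because neither has seen the other, and it's then up to the 
application (or a CRDT) to merge them.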

--Kyle



