Riak node recovery after crash.

Sean Cribbs sean at basho.com
Fri Apr 1 10:27:25 EDT 2011


Nico,

That's fair; I was probably too rosy-eyed about it.  However, the difference in startup times between Bitcask and Inno is still orders of magnitude for the same keyspace size. Now that I reflect on it, I remember a customer who had 30MM keys in a 5-node cluster; each node took about 1.5 seconds for the node watcher to report that riak_kv was available. Before they switched off of Innostore, it took upwards of a minute to load and repair tables. YMMV.

Sean Cribbs <sean at basho.com>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Apr 1, 2011, at 10:11 AM, Nico Meyer wrote:

> Hi Sean,
> 
> I have to object here. We have a cluster of 8-core/64 GB nodes with an
> SSD drive for the bitcask dir. Each node holds on the order of 100 million
> keys. The complete bitcask directory is only about 60 GB, so it fits
> almost completely in memory.
> The time from starting the node until it starts handling requests, which
> means all hint files have been read, is on the order of 10 minutes.
> During this time the beam process is completely CPU bound; the disk is
> hardly breaking a sweat.
> Only one core is used, since there is only a single Erlang process
> starting all the vnodes sequentially.  Sometimes a second core is also
> utilized, but that is due to merges on the already started partitions.
> 
> Cheers,
> Nico
> 
> On Friday, Apr 1, 2011, at 08:45 -0400, Sean Cribbs wrote:
>> Santhosh,
>> 
>> 
>> Bitcask has a crash-proof design and so, unlike Inno, it will not read
>> the entire keyspace and try to correct it at startup time. It will
>> simply load the existing hint files and then scan the files it doesn't
>> have hints for to discover the extant keys.  This takes milliseconds
>> or less per partition; you will hardly notice it.
>> 
>> Sean Cribbs <sean at basho.com>
>> Developer Advocate
>> Basho Technologies, Inc.
>> http://basho.com/
>> 
>> On Apr 1, 2011, at 2:49 AM, santhosh venkat wrote:
>> 
>>> Hi,
>>>      I am experimenting with the recovery time of a Riak node
>>> using Bitcask storage after a crash.
>>> 
>>>      I found some information about this on the following page,
>>> which is mostly about Innodb though:
>>> 
>>>       http://wiki.basho.com/Recovering-a-failed-node.html
>>> 
>>>     Upon reading the Bitcask paper, I found that it uses hint files to
>>> construct the in-memory mapping, so ideally it should not take more
>>> than a few minutes to reconstruct the data after a crash. Please shed
>>> some light on this.
>>> 
>>>   I got this thread dump when I tried the steps outlined in the
>>> above link.
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:50 ===
>>> [{alarm_handler,{set,{{disk_almost_full,"/var/lib/mysql"},[]}}}]
>>> =INFO REPORT==== 1-Apr-2011::12:06:50 ===
>>> [{alarm_handler,{set,{{disk_almost_full,"/var/lib/riak"},[]}}}]
>>> ** Found 0 name clashes in code paths
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.141.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.142.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.143.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.144.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.145.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.146.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.147.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_map) host starting (<0.148.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_reduce) host starting (<0.150.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_reduce) host starting (<0.151.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_reduce) host starting (<0.152.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_reduce) host starting (<0.153.0>)
>>> 
>>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
>>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
>>> riak_kv_js_reduce) host starting (<0.154.0>)
>>> 
>>> Please help.
>>> 
>>> --
>>> Santhosh
>>> 
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
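
The startup step discussed in this thread (reading hint files to rebuild the in-memory key mapping) can be illustrated with a minimal Python sketch. It is not Bitcask's actual code or on-disk format: the record layout (`HINT_HEADER`), the helper names (`encode_hint`, `iter_hints`, `load_keydir`) and the use of in-memory byte strings in place of files are all assumptions made for illustration. It covers only the hint-file path, not the scan of data files that lack hints.

```python
import struct
from typing import Dict, Iterator, Tuple

# Hypothetical hint record layout (NOT the real Bitcask format):
# timestamp (u32), key size (u16), value size (u32), value offset (u64), key bytes.
HINT_HEADER = struct.Struct(">IHIQ")

# Keydir entry: (file_id, value_size, value_offset, timestamp)
KeydirEntry = Tuple[int, int, int, int]


def encode_hint(key: bytes, value_size: int, value_offset: int, timestamp: int) -> bytes:
    """Serialize one hint record in the hypothetical layout above."""
    return HINT_HEADER.pack(timestamp, len(key), value_size, value_offset) + key


def iter_hints(blob: bytes) -> Iterator[Tuple[bytes, int, int, int]]:
    """Yield (key, value_size, value_offset, timestamp) for each record in a hint blob."""
    pos = 0
    while pos < len(blob):
        timestamp, key_size, value_size, value_offset = HINT_HEADER.unpack_from(blob, pos)
        pos += HINT_HEADER.size
        key = blob[pos:pos + key_size]
        pos += key_size
        yield key, value_size, value_offset, timestamp


def load_keydir(hint_files: Dict[int, bytes]) -> Dict[bytes, KeydirEntry]:
    """Rebuild the key mapping from hint blobs, keyed by file id.

    Files are processed from lowest to highest id (assumed oldest to newest),
    so a later record for the same key replaces the earlier one.
    """
    keydir: Dict[bytes, KeydirEntry] = {}
    for file_id in sorted(hint_files):
        for key, value_size, value_offset, timestamp in iter_hints(hint_files[file_id]):
            keydir[key] = (file_id, value_size, value_offset, timestamp)
    return keydir


if __name__ == "__main__":
    hints = {
        1: encode_hint(b"a", 10, 0, 100) + encode_hint(b"b", 20, 10, 101),
        2: encode_hint(b"a", 5, 0, 200),
    }
    for key, entry in sorted(load_keydir(hints).items()):
        print(key, entry)
```

In this sketch no values are read at all, but every key still costs one record decode and one dictionary update, so the total work grows with the number of keys rather than with the size of the stored data.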