Riak node recovery after crash.

Nico Meyer nico.meyer at adition.com
Fri Apr 1 11:25:18 EDT 2011


Hi Sean,

this still seems to be nearly two orders of magnitude faster than what I
observe. For us, 100Mio keys (per node! obtained with
riak_kv_bitcask_backend:key_counts()) take something like 500-600
seconds. In your example it's 1.5 seconds for (assuming N=3) 30Mio*3/5 =
18Mio keys. That's 12Mio keys/s vs. 0.2Mio keys/s.
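
The arithmetic above can be checked with a small sketch. It is only a
back-of-the-envelope calculation: N=3 and an even spread of replicas
across nodes are assumptions, and the function and variable names are
purely illustrative.

```python
# Rough per-node throughput comparison for hint-file loading at startup.
MIO = 1_000_000


def keys_per_node(total_keys, n_val, nodes):
    """Keys held by each node, assuming replicas are spread evenly."""
    return total_keys * n_val / nodes


# Sean's example: 30 Mio keys, 5 nodes, about 1.5 s until riak_kv is available.
# N=3 is an assumption; the replication factor was not stated.
sean_keys = keys_per_node(30 * MIO, n_val=3, nodes=5)
sean_rate = sean_keys / 1.5

# Our cluster: about 100 Mio keys per node, loaded in 500-600 s.
our_rate_fast = 100 * MIO / 500
our_rate_slow = 100 * MIO / 600

print(f"Sean: {sean_keys / MIO:.0f} Mio keys/node, {sean_rate / MIO:.0f} Mio keys/s")
print(f"Ours: {our_rate_slow / MIO:.2f}-{our_rate_fast / MIO:.2f} Mio keys/s")
print(f"Ratio: {sean_rate / our_rate_fast:.0f}x to {sean_rate / our_rate_slow:.0f}x")
```

With these inputs the ratio comes out well above one order of magnitude,
which is the gap I am describing.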

Cheers,
Nico

On Friday, 01.04.2011, at 10:27 -0400, Sean Cribbs wrote:
> Nico,
> 
> That's fair; I was probably too rosy-eyed about it. However, the difference in startup times between Bitcask and Inno is still orders of magnitude for the same keyspace size. Now that I reflect on it, I remember a customer who had 30MM keys in a 5-node cluster; each node would take about 1.5 seconds for the node watcher to report riak_kv was available. Before they switched off of Innostore, it would take upwards of a minute to load and repair tables. YMMV.
> 
> Sean Cribbs <sean at basho.com>
> Developer Advocate
> Basho Technologies, Inc.
> http://basho.com/
> 
> On Apr 1, 2011, at 10:11 AM, Nico Meyer wrote:
> 
> > Hi Sean,
> > 
> > I have to object here. We have a cluster of 8-core/64GB nodes with an
> > SSD drive for the bitcask dir. Each node holds on the order of 100Mio
> > keys. The complete bitcask directory is only about 60GB, so it fits
> > almost completely in memory.
> > The time from starting the node until it starts handling requests, which
> > means all hint files have been read, is on the order of 10 minutes.
> > During this time the beam process is completely CPU bound; the disk is
> > hardly breaking a sweat.
> > Only 1 core is used at all, since there is only a single Erlang process
> > starting all the vnodes sequentially. Sometimes a second core is also
> > utilized, but that is due to merges on the already started partitions.
> > 
> > Cheers,
> > Nico
> > 
> > On Friday, 01.04.2011, at 08:45 -0400, Sean Cribbs wrote:
> >> Santhosh,
> >> 
> >> 
> >> Bitcask has a crash-proof design and so, unlike Inno, it will not read
> >> the entire keyspace and try to correct it at startup time. It will
> >> simply load the existing hint files and then scan the files it doesn't
> >> have hints for to discover the extant keys.  This takes milliseconds
> >> or less per partition; you will hardly notice it.
> >> 
> >> Sean Cribbs <sean at basho.com>
> >> Developer Advocate
> >> Basho Technologies, Inc.
> >> http://basho.com/
> >> 
> >> On Apr 1, 2011, at 2:49 AM, santhosh venkat wrote:
> >> 
> >>> Hi,
> >>>      I am experimenting with the recovery time of a Riak
> >>> node using Bitcask storage after a crash.
> >>> 
> >>>      I was able to find some information about that on this page
> >>> (which is about InnoDB, though):
> >>> 
> >>>       http://wiki.basho.com/Recovering-a-failed-node.html
> >>> 
> >>>     Upon reading the Bitcask paper, I found that it uses hint files to
> >>> construct the in-memory mapping, so ideally it should not take more
> >>> than a few minutes to reconstruct the data after a crash. Please shed
> >>> some light on this.
> >>> 
> >>>   I got this thread dump when I tried the steps outlined in the
> >>> above link.
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:50 ===
> >>> [{alarm_handler,{set,{{disk_almost_full,"/var/lib/mysql"},[]}}}]
> >>> =INFO REPORT==== 1-Apr-2011::12:06:50 ===
> >>> [{alarm_handler,{set,{{disk_almost_full,"/var/lib/riak"},[]}}}]
> >>> ** Found 0 name clashes in code paths
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.141.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.142.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.143.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.144.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.145.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.146.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.147.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.148.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.150.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.151.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.152.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.153.0>)
> >>> 
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.154.0>)
> >>> 
> >>> Please help.
> >>> 
> >>> --
> >>> Santhosh
> >>> 
> >>> _______________________________________________
> >>> riak-users mailing list
> >>> riak-users at lists.basho.com
> >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >> 
> >> 
> > 
> > 
> 





