Truncated bit-cask files

Magnus Kessler mkessler at basho.com
Tue Feb 14 10:42:18 EST 2017


On 14 February 2017 at 14:46, Arun Rajagopalan <arun.v.rajagopalan at gmail.com> wrote:

> Hi Magnus
>
> Riak crashes on startup when I have a truncated bitcask file.
>
> I think it also crashes when the AAE files are bad. Example below:
>
> 2017-02-13 21:18:30 =CRASH REPORT====
>   crasher:
>     initial call: riak_kv_index_hashtree:init/1
>     pid: <0.6037.0>
>     registered_name: []
>     exception exit: {{{badmatch,{error,{db_open,"Corruption: truncated record at end of file"}}},
>       [{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,675}]},
>        {hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},
>        {riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},
>        {lists,foldl,3,[{file,"lists.erl"},{line,1248}]},
>        {riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},
>        {riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},
>        {gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},
>        {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]},
>       [{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},
>        {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
>     ancestors: [<0.715.0>,riak_core_vnode_sup,riak_core_sup,<0.160.0>]
>     messages: []
>     links: []
>     dictionary: []
>     trap_exit: false
>     status: running
>     heap_size: 1598
>     stack_size: 27
>     reductions: 889
>   neighbours:
>
> Regards
> Arun
>
>
Hi Arun,

The crash log you provided shows a corrupted file in the AAE
(anti_entropy) backend. The entries in console.log should say which
partition is affected. Please post the output from the affected node at
around 2017-02-13T21:18:30. As this is AAE data, which Riak rebuilds on
its own, it is safe to remove the directory named after the affected
partition from the anti_entropy directory before restarting the node.
More than one partition may be affected, but each further one only shows
up at the next attempted restart. If that happens, identify the next
partition in the same way and remove its directory too, repeating until
the node starts up successfully again.
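
The removal step could be sketched in Python roughly as follows. This is
only an illustration, not a Basho tool. The function name
quarantine_partition, the quarantine location and the assumption that
each partition directory is named with the partition's decimal index are
mine. The sketch also moves the directory aside rather than deleting it,
which is a more cautious variant of the removal described above. Run
something like this only while the node is stopped.

```python
import shutil
from pathlib import Path


def quarantine_partition(aae_dir: Path, partition: str, quarantine_dir: Path) -> Path:
    """Move one partition's AAE directory aside and return its new location.

    aae_dir is the node's AAE data directory and partition is the index
    reported in console.log. Both are supplied by the operator; nothing
    here is discovered automatically.
    """
    # Assumption: partition directories are named with a decimal index.
    if not partition.isdigit():
        raise ValueError(f"not a partition index: {partition!r}")

    source = aae_dir / partition
    if not source.is_dir():
        raise FileNotFoundError(f"no AAE directory for partition {partition} in {aae_dir}")

    quarantine_dir.mkdir(parents=True, exist_ok=True)
    target = quarantine_dir / partition
    if target.exists():
        # Refuse to overwrite an earlier quarantined copy.
        raise FileExistsError(f"{target} already exists")

    # Moving instead of deleting keeps the corrupted files for inspection.
    shutil.move(str(source), str(target))
    return target
```

You would call it once per affected partition, passing your node's AAE
data directory, the partition index from console.log and any scratch
directory on the same host, then restart the node and repeat if another
partition fails.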

Is there a reason why the nodes aren't shut down in the regular way?

Kind Regards,

Magnus



-- 
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431