Urgent help with a down node.

John Caprice jcaprice at basho.com
Mon Jul 8 11:22:58 EDT 2013


Hey Bryan,

This indicates a problem with a Bitcask data file.  According to the second
error report, that data file was truncated.  You more than likely did not
experience any data loss, as the truncation would affect only a single
replica of each object, and only objects stored in that data file.  To be
safe, you can repair the partition by attaching to Riak and running:

riak_kv_vnode:repair(22835963083295358096932575511191922182123945984).

after which you can detach from Riak with ctrl-d and monitor the status of
the repair with riak-admin transfers.  This command will repair any replicas
lost to the data file truncation.

Thanks,

John Caprice


On Mon, Jul 8, 2013 at 8:11 AM, Bryan Hughes <bryan at go-factory.net> wrote:

>  Andrew,
>
> Thanks for the tip on how to use Google.  :)   But that was not my
> original question.  I wanted to understand in more detail from the Basho
> folks what
>
> 2013-07-07 12:51:42 =ERROR REPORT====
> Hintfile
> './data/bitcask/22835963083295358096932575511191922182123945984/3.bitcask.hint'
> contains pointer 16555635 566 that is greater than total data size 16556032
>
> and
>
>
> 2013-07-07 12:54:43 =ERROR REPORT====
> Bad datafile entry, discarding(383/566 bytes)
>
> meant to my system.  For example, did I lose data and, if so, how do I know
> what data was lost?  More importantly, if this is data loss, how did it
> happen?  I ran fsck on all the disks and checked the health of the system -
> which is all good.
>
> The latter information was included for completeness.
>
> Bryan
>
>
> On 7/8/13 12:49 AM, Andrew Berman wrote:
>
> Bryan,
>
>  What version of Erlang?  You should check this out:
> https://github.com/basho/riak_kv/issues/411
>
>  BTW - Google is your friend, which is how I found the above issue :)
>
>  --Andrew
>
>
> On Sun, Jul 7, 2013 at 3:01 PM, Bryan Hughes <bryan at go-factory.net> wrote:
>
>>  Hi Mark,
>>
>> DOH - sorry for the lack of detail.  Didn't have enough coffee this
>> morning.
>>
>> OS:     CentOS release 6.3 (Final)
>> Riak:   Riak 1.2.1
>>
>> Hadn't had a chance to upgrade to 1.3 yet.
>>
>> Got the node back up - but not entirely sure why which is a little
>> concerning.  Been verifying the data, and everything looks intact.  When I
>> try to run riak-admin status, I get the following (note I am not entirely
>> sure this was the case when we first set the node up):
>>
>> $ riak-admin status
>> Status failed, see log for details
>>
>> The log shows:
>>
>> 2013-07-07 14:55:03.858 [error] <0.12982.0>@riak_kv_console:status:173
>> Status failed error:function_clause
>> 2013-07-07 14:55:03.858 [error] emulator Error in process <0.12983.0> on
>> node 'riak@127.0.0.1' with exit value:
>> {badarg,[{erlang,system_info,[global_heaps_size],[]},{riak_kv_stat,system_stats,0,[{file,"src/riak_kv_stat.erl"},{line,421}]},{riak_kv_stat,produce_stats,0,[{file,"src/riak_kv_stat.erl"},{line,320}]},{timer,tc,3,[{file,"timer...
>>
>>
>> This is on a dev cluster with an out-of-the-box configuration using
>> Bitcask.
>>
>> Thanks!
>>
>> Bryan
>>
>>
>> On 7/7/13 2:51 PM, Mark Phillips wrote:
>>
>> Hi Bryan,
>>
>>  I remember seeing something similar on the list a while ago. I'll dig
>> through the archives (Riak.markmail.org) if I have a few minutes later
>> tonight.
>>
>>  In the meantime, what version of Riak is this? And what OS?
>>
>>  Mark
>>
>> On Sunday, July 7, 2013, Bryan Hughes wrote:
>>
>>>  Anyone familiar with this error message?
>>>
>>> 2013-07-07 12:51:42 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/22835963083295358096932575511191922182123945984/3.bitcask.hint'
>>> contains pointer 16555635 566 that is greater than total data size 16556032
>>> 2013-07-07 12:51:45 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/114179815416476790484662877555959610910619729920/3.bitcask.hint'
>>> contains pointer 17817310 567 that is greater than total data size 17817600
>>> 2013-07-07 12:51:46 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/159851741583067506678528028578343455274867621888/3.bitcask.hint'
>>> contains pointer 7573448 567 that is greater than total data size 7573504
>>> 2013-07-07 12:51:46 =ERROR REPORT====
>>> Bad datafile entry 1:
>>> {ok,<<131,104,2,109,0,0,0,9,65,80,73,67,79,85,78,84,83,109,0,0,0,33,55,56,54,57,52,49,56,49,94,103,111,115,101,114,118,105,99,101,95,99>>}
>>> 2013-07-07 12:51:56 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/730750818665451459101842416358141509827966271488/3.bitcask.hint'
>>> contains pointer 13229833 581 that is greater than total data size 13230080
>>> 2013-07-07 12:52:05 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/1187470080331358621040493926581979953470445191168/3.bitcask.hint'
>>> contains pointer 23465420 578 that is greater than total data size 23465984
>>> 2013-07-07 12:52:06 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/1210306043414653979137426502093171875652569137152/3.bitcask.hint'
>>> contains pointer 27733824 578 that is greater than total data size 27734016
>>> 2013-07-07 12:52:07 =ERROR REPORT====
>>> Hintfile
>>> './data/bitcask/1233142006497949337234359077604363797834693083136/3.bitcask.hint'
>>> contains pointer 15014008 578 that is greater than total data size 15014586
>>> 2013-07-07 12:54:43 =ERROR REPORT====
>>> Bad datafile entry, discarding(383/566 bytes)
>>> 2013-07-07 12:54:45 =ERROR REPORT====
>>> Bad datafile entry, discarding(276/567 bytes)
>>> 2013-07-07 12:54:46 =ERROR REPORT====
>>> Bad datafile entry, discarding(42/567 bytes)
>>> 2013-07-07 12:54:57 =ERROR REPORT====
>>> Bad datafile entry, discarding(233/581 bytes)
>>> 2013-07-07 12:55:06 =ERROR REPORT====
>>> Bad datafile entry, discarding(550/578 bytes)
>>> 2013-07-07 12:55:07 =ERROR REPORT====
>>> Bad datafile entry, discarding(178/578 bytes)
>>> 2013-07-07 12:56:00 =ERROR REPORT====
>>> Error in process <0.1536.0> on node 'riak@127.0.0.1' with exit value:
>>> {badarg,[{erlang,system_info,[global_heaps_size],[]},{riak_kv_stat,system_stats,0,[{file,"src/riak_kv_stat.erl"},{line,421}]},{riak_kv_stat,produce_stats,0,[{file,"src/riak_kv_stat.erl"},{line,320}]},{timer,tc,3,[{file,"timer...
>>>
>>> --
>>>
>>> Bryan Hughes
>>> *Go Factory*
>>> http://www.go-factory.net
>>>
>>> *"Internet Class, Enterprise Grade"*
>>>
>>>
>>>
>>  --
>>
>> Bryan Hughes
>> CTO and Founder / *Go Factory*
>> (415) 515-7916
>>
>> http://www.go-factory.net
>>
>> *"Internet Class, Enterprise Grade"*
>>
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>