Lots of bitcask files for a vnode, unable to merge

Nico Meyer nico.meyer at adition.com
Fri Jul 1 06:30:15 EDT 2011


Hi Aphyr,

I have had this problem in the past. Most likely 
1309482913.bitcask.data is corrupted somehow. I mostly see this problem 
after a machine crash or disk problems, and in those cases it is always 
the end of the file that is corrupted or truncated (the last record that 
Riak was trying to write when the crash/problem happened).
In any case, you should provide the whole error message. The most 
interesting part is at the end, which you cut out.

Bitcask doesn't handle corrupted files very well in all cases, which is 
unfortunate. I patched our version of Riak/bitcask to gracefully handle 
all of the errors I have encountered so far by simply ignoring the rest 
of the file after the first invalid record. But I haven't yet gotten 
around to contributing the changes back to Basho (OK - it's been over 
4 months, shame on me). 
My changes are based on bitcask-1.1.5, which is part of Riak-0.14.1, so 
they should be usable. If anyone is interested, my fixes are available here:

https://github.com/nicom/bitcask/tree/adition-1.1.5
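
A minimal sketch of the "ignore the rest of the file after the first invalid record" idea, in Python rather than Erlang. This is not the code from the branch above. It assumes the record layout described in the Bitcask design paper (big-endian crc32, timestamp, key size, value size, then key and value, with the CRC covering everything after the CRC field); the function names encode_record and scan_valid_prefix are hypothetical and exist only for illustration.

```python
import struct
import zlib

# ASSUMED record layout (big-endian), per the Bitcask design paper:
# crc32 (4) | timestamp (4) | key size (2) | value size (4) | key | value
HEADER = struct.Struct(">IIHI")


def encode_record(tstamp, key, value):
    """Build one record in the assumed layout (only used for the demo)."""
    body = struct.pack(">IHI", tstamp, len(key), len(value)) + key + value
    return struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF) + body


def scan_valid_prefix(data):
    """Return (records, offset): all records before the first invalid one,
    and the byte offset at which scanning stopped."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        crc, tstamp, ksz, vsz = HEADER.unpack_from(data, offset)
        end = offset + HEADER.size + ksz + vsz
        if end > len(data):
            break  # truncated record: ignore the rest of the file
        body = data[offset + 4:end]  # everything after the CRC field
        if zlib.crc32(body) & 0xFFFFFFFF != crc:
            break  # checksum mismatch: ignore the rest of the file
        key_start = offset + HEADER.size
        key = data[key_start:key_start + ksz]
        value = data[key_start + ksz:end]
        records.append((tstamp, key, value))
        offset = end
    return records, offset


if __name__ == "__main__":
    good = (encode_record(1309482913, b"k1", b"v1")
            + encode_record(1309482914, b"k2", b"v2"))
    # Simulate a crash in the middle of writing the last record.
    torn = good + encode_record(1309482915, b"k3", b"v3")[:-1]
    records, offset = scan_valid_prefix(torn)
    print(len(records), offset == len(good))
```

The returned offset marks where the valid prefix ends, so a caller could read only that much of the file and skip the damaged tail instead of failing the whole merge.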


Cheers,
Nico

On 01.07.2011 03:27, Aphyr wrote:
> One of the vnodes on one of my hosts has a *lot* of bitcask data/hint 
> files, and makes a new one every 3 minutes. In the logs, I get
>
> =ERROR REPORT==== 30-Jun-2011::20:24:14 ===
> Failed to merge 
> ["/var/lib/riak/bitcask/794976964837219653749465284983368790965189869568", 
> [],
> ...HUGE LIST OF DATA FILES...
>
> in bitcask_fileops:fold_loop, bitcask:merge_single_entry, merge_files, 
> merge1, bitcask_merge_worker:do_merge.
>
> Here's the directory:
>
> ...
> -rw-------   1 riak riak         0 2011-06-30 19:55 1309481706.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 19:55 1309481706.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 19:58 1309481886.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 19:58 1309481886.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:01 1309482066.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:01 1309482066.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:04 1309482246.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:04 1309482246.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:07 1309482426.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:07 1309482426.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:10 1309482606.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:10 1309482606.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:13 1309482786.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:13 1309482786.bitcask.hint
> -rw-------   1 riak riak     32948 2011-06-30 20:21 1309482913.bitcask.data
> -rw-r--r--   1 riak riak      1043 2011-06-30 20:21 1309482913.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:18 1309483092.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:18 1309483092.bitcask.hint
> -rw-------   1 riak riak         0 2011-06-30 20:21 1309483272.bitcask.data
> -rw-r--r--   1 riak riak         0 2011-06-30 20:21 1309483272.bitcask.hint
>
> Any ideas as to how it could have gotten into this state, and how to 
> fix it?
>
> --Kyle
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




