Riak Bitcask merging

Nathan Wilken wilken at asu.edu
Tue Aug 23 04:47:45 EDT 2011

How about this?

for i in data/expiring_bitcask/* ; do $ERTS_PATH/to_erl $PIPE_DIR <<EOF
bitcask:merge("$i/",[{dead_bytes_merge_trigger, 0},{dead_bytes_threshold, 0},{small_file_threshold, 16#80000000},{expiry_secs, 86400}]).
EOF
done
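Before pointing a loop like this at a live node, it may help to print the expressions it would send. The following is a minimal dry-run sketch, not a tested procedure: the function names `merge_expr` and `print_merge_exprs` and the `DATA_ROOT` variable are made up for illustration, and the merge options are simply copied from the command in this message.

```shell
#!/bin/sh
# Dry run: print the bitcask:merge/2 expression for each partition
# directory instead of sending it to the node through to_erl.

merge_expr() {
  # $1 is one partition directory; the options are copied verbatim
  # from the manual merge command in this message.
  printf 'bitcask:merge("%s/",[{dead_bytes_merge_trigger, 0},{dead_bytes_threshold, 0},{small_file_threshold, 16#80000000},{expiry_secs, 86400}]).\n' "$1"
}

print_merge_exprs() {
  # $1 is the backend data root (hypothetical parameter).
  for dir in "$1"/*; do
    [ -d "$dir" ] || continue   # skip plain files and an unexpanded glob
    merge_expr "$dir"
  done
}

# DATA_ROOT is an assumed override; the default matches data_root in app.config.
print_merge_exprs "${DATA_ROOT:-data/expiring_bitcask}"
```

Once the printed expressions look right, the same text can be fed to the node the way the loop in this message does.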

I have a backend (part of a multi_backend) used exclusively for storing data with a 9-hour TTL (and so my config expires the data after 24 hours).  K/Vs are seldom deleted, so merging during my nightly merge window only slightly compacts the data files into nice, huge, zero-dead-byte files.  Those files will never again trigger a merge themselves, but they are included in merges triggered by new files, since they fall under the small_file_threshold (which is huge for this reason).

From app.config:
                   {<<"expiring_bitcask">>, riak_kv_bitcask_backend, [
                          {expiry_secs, 86400},
                          {max_file_size, 134217728},
                          {dead_bytes_merge_trigger, 10485760},
                          {dead_bytes_threshold, 10485760},
                          {small_file_threshold, 16#80000000},
                          {merge_window, {1, 5}},
                          {data_root, "data/expiring_bitcask"}
                   ]}

For manual merges I use dead-byte trigger and threshold values of 0 in order to force a merge; in the config I use 10 MB values to avoid continual merging during my overnight window.
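For reference, the raw numbers in the config above are easier to compare once converted to larger units. This is only an arithmetic sketch; the helper names `to_mib` and `to_hours` are made up for illustration, and `0x80000000` is the shell spelling of Erlang's `16#80000000`.

```shell
#!/bin/sh
# Convert the byte and second values from the app.config excerpt
# into MiB and hours.

to_mib()   { echo $(( $1 / 1024 / 1024 )); }   # bytes   -> MiB
to_hours() { echo $(( $1 / 3600 )); }          # seconds -> hours

echo "max_file_size:            $(to_mib 134217728) MiB"
echo "dead_bytes_merge_trigger: $(to_mib 10485760) MiB"
echo "dead_bytes_threshold:     $(to_mib 10485760) MiB"
echo "small_file_threshold:     $(to_mib $((0x80000000))) MiB"
echo "expiry_secs:              $(to_hours 86400) hours"
```

The small_file_threshold works out to 2048 MiB against a max_file_size of 128 MiB, which fits the remark above that the merged files fall under the threshold.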

Does this make sense?  I'd have thought my app.config would keep the bitcasks small, but it seems they just grow bigger and never get cleared of much expired data.  The manual config I use above immediately shrinks the data files dramatically.

Any suggestions for a cleaner approach?


From: riak-users-bounces at lists.basho.com [riak-users-bounces at lists.basho.com] On Behalf Of Anthony Molinaro [anthonym at alumni.caltech.edu]
Sent: Monday, August 22, 2011 10:18 AM
To: Dan Reverri
Cc: raghwani sohil; riak-users at lists.basho.com
Subject: Re: Riak Bitcask merging

While I didn't ask this time, I'll explain why I think manual
merging as an option would be great.

As far as I know specifying a merge window doesn't guarantee the
merging happens, only that it might if other thresholds are met.
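The point can be illustrated with a deliberately simplified model. This is not Bitcask's actual code: it assumes the {1, 5} window and the 10485760-byte dead-bytes trigger from the app.config quoted earlier in this thread, treats the window hours as inclusive, and ignores every other trigger. The function names `in_window` and `would_merge` are made up for illustration.

```shell
#!/bin/sh
# Simplified model, NOT Bitcask's real logic: a merge is considered only
# when the hour is inside the window AND the dead-bytes trigger is met.

in_window() {
  # args: hour start end (assumes start <= end, both ends inclusive)
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

would_merge() {
  # args: hour dead_bytes
  # Window {1, 5} and trigger 10485760 are taken from the quoted config.
  in_window "$1" 1 5 && [ "$2" -ge 10485760 ]
}

would_merge 3 20971520  && echo "03:00, 20 MiB dead: merge considered"
would_merge 3 1048576   || echo "03:00, 1 MiB dead: no merge"
would_merge 12 20971520 || echo "12:00, 20 MiB dead: no merge"
```

Under this model a node whose files stay below the trigger never merges, however wide the window is.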

With our Cassandra cluster we've ended up scheduling twice weekly
full compactions (the close equivalent to merging I believe), via
cron.  The days and times are specifically chosen based on traffic
patterns, and can be changed without restarting the servers.

We don't have this convenience with Riak.  We can set it up to only
merge during a window, but can't guarantee everything was merged
or even that any merges will occur.  If I wanted to change when I
do the merging, I have to restart the servers to pick up the new
config (at least I think I would).  I have no way to stagger the
merging (have different nodes merge at different times), unless
I have slightly different config on each node.

If there were a riak-admin command called merge/compact/cleanup/expire
or something which triggered a manual merge I know we would use it.

And while I'm pining for additional command line tools, any idea if
the transfers command will ever work without disrupting the actual
transfers?  It's sort of annoying that it's listed as one of the steps
for a rolling upgrade/restart, but if you actually use it, it can cause
the upgrade or startup to take longer.  Also, it tends to time out on
a highly trafficked cluster.

Anyway, sorry about hijacking someone else's question, but I
figured more information from users is usually welcome.


On Mon, Aug 22, 2011 at 09:14:06AM -0700, Dan Reverri wrote:
> There is no way to manually trigger a Bitcask merge. What's the use case for
> needing to manually trigger the merge? Are you concerned about the size of
> the data files? Are you trying to avoid merging at a particular time?
> Not sure if this will help but you can restrict Bitcask merging to a
> specified window of time:
> http://wiki.basho.com/Bitcask-Configuration.html#Merge-Window
> Thanks,
> Dan
> Daniel Reverri
> Developer Advocate
> Basho Technologies, Inc.
> dan at basho.com
> On Sun, Aug 21, 2011 at 11:36 PM, raghwani sohil <sohil4you at gmail.com> wrote:
> >
> > Hi All,
> >
> > Is there any way to run the Bitcask merging process manually?
> >
> > Thanks,
> > Sohil Raghwani
> >
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users at lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >


Anthony Molinaro                           <anthonym at alumni.caltech.edu>
