Merge times.

Misha Gorodnitzky Misha.Gorodnitzky at
Wed Jul 6 19:17:58 EDT 2011

Hello fellow Riakers,

What are the factors that affect how long it takes Riak to do a merge?

In our scenario, a merge takes about an hour per node, regardless of whether Riak decides to merge on its own, we force a merge (which we do nightly via cron), or the Riak node has been restarted (in which case I believe it also merges the active file). Our feeling is that the sheer number of keys and the size of the associated data are what take so much time[1], rather than the size of the journal being merged.

Sean kindly suggested to us in IRC that reducing the max_file_size setting could help because it will keep our data files smaller, meaning that when we restart there will hopefully be less data in the active file that just got rolled over. But if it's already taking us nearly an hour to merge when we don't restart, it sounds to me like a smaller active file won't help. Is that right?
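
To make the question concrete, here is a toy model of a log-structured merge in Python. It is only a sketch under stated assumptions, not Bitcask's actual code: it assumes a merge scans every entry in every closed data file and keeps the newest value per key, and it treats all files as closed (as after a restart). Real Riak chooses merge candidates using its own triggers and thresholds, which this ignores. The names merge and split_into_files are made up for illustration.

```python
def split_into_files(entries, max_entries_per_file):
    """Chop an append-only write log into fixed-size 'data files'.

    max_entries_per_file is a hypothetical stand-in for max_file_size.
    """
    return [
        entries[i:i + max_entries_per_file]
        for i in range(0, len(entries), max_entries_per_file)
    ]


def merge(closed_files):
    """Toy merge: read every entry in every closed file, oldest file first,
    and keep only the newest value for each key.

    Returns the live key/value map and the number of entries scanned.
    """
    live = {}
    scanned = 0
    for entries in closed_files:
        for key, value in entries:
            scanned += 1
            live[key] = value  # a later write replaces an earlier one
    return live, scanned


# 1000 writes spread over 100 distinct keys, so 900 entries are stale.
writes = [(f"key{i % 100}", i) for i in range(1000)]

# Same data, stored in a few large files or in many small files.
live_big, scanned_big = merge(split_into_files(writes, 500))
live_small, scanned_small = merge(split_into_files(writes, 50))

print(scanned_big, scanned_small)      # entries read by each merge
print(len(live_big), len(live_small))  # keys that survive each merge
```

In this model the number of entries scanned depends on the total amount written, not on how it is divided into files, which is consistent with the suspicion that a smaller max_file_size alone would not shorten a full merge.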


[1] We know that we have a lot of old cruft hanging around that we can delete; we're in the process of changing our Riak config so that we have a backend available with expiry_secs set. Typically our bitcask data dir takes up 62GB-72GB per node (~2.5GB per partition), although after a restart and the subsequent merge we're seeing this drop to ~32GB per node (~1.1GB per partition).
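
As a rough check on the figures in the footnote, the arithmetic below works out what share of the on-disk data the post-restart merge reclaims. It is a sketch only: the function name fraction_reclaimed is invented here, and the inputs are the approximate sizes quoted above.

```python
def fraction_reclaimed(before_gb, after_gb):
    """Share of on-disk data that disappears after a merge."""
    return (before_gb - after_gb) / before_gb


# Per-node sizes quoted above: 62GB-72GB before, ~32GB after.
node_low = fraction_reclaimed(62, 32)
node_high = fraction_reclaimed(72, 32)

# Per-partition sizes quoted above: ~2.5GB before, ~1.1GB after.
partition = fraction_reclaimed(2.5, 1.1)

print(f"per node: {node_low:.0%} to {node_high:.0%}")
print(f"per partition: {partition:.0%}")
```

On these numbers roughly half of what sits on disk before the merge is dead data, so the merge has to read about twice as much as it ends up keeping.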


Misha Gorodnitzky | Senior Application Developer | Mobile Interactive Group

M: +447760208493   T: +442079215590
A: The Tower Building, 7th Floor, 11 York Road, London, SE1 7NX
Twitter: @migcan

More information about the riak-users mailing list