Pruning (merging) after storage reaches a certain size?

Steve Webb swebb at gnip.com
Thu Jul 28 12:18:26 EDT 2011


Just to be clear ...

If I make max_file_size small and set expiry_secs to something small,
but never explicitly delete anything, will the non-active files still be
considered for merging (and have any expired data pruned) just because
they are inactive, even though they don't trigger any other
merge-detection criteria?

If not, how would I configure Riak as a system that I could continuously
insert data into and always keep just the last day's worth of data or so?

- Steve

--
Steve Webb - Senior System Administrator for gnip.com
http://twitter.com/GnipWebb

On Mon, 13 Jun 2011, Justin Sheehy wrote:

> Hi, Steve.
>
> The key to your situation was in my earlier email:
>
>    One note that is relevant for your specific use: the expiry_secs
>    parameter will cause a given item to disappear from the client
>    API immediately after expiry, and to be cleaned if it is in a file
>    already being merged, but will not currently contribute toward
>    merge triggers or thresholds on its own if not otherwise "dead".
>
> That is, bitcask wasn't originally designed around the expiry-centric
> way of removing old data, and data that has simply expired (but not
> actively been deleted) will not be counted as garbage toward
> thresholds or triggers at this time.  It will be cleaned up in a
> merge, but will not contribute toward causing the merge in the first
> place.  In a use case where you only add items and never actually
> delete anything, a merge will never be dynamically triggered.
>
> It is plausible that we could add some expiry-statistics measurement
> and triggering to bitcask, but today that's the state of things.  You
> could manually trigger merges, but that currently requires a bit of
> Erlang.
>
> I hope that this helps.
>
> -Justin
>
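
To make the behaviour Justin describes concrete, here is a toy model in
Python. It is only a sketch, not bitcask's code: the names Entry,
fragmentation_pct, should_trigger_merge and merge are made up for
illustration, the 60% trigger is an assumed value, and real bitcask
tracks its thresholds per file with more detail than this. The point it
shows is that expired-but-undeleted entries do not count as garbage when
deciding whether to merge, yet are dropped once a merge does run.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One stored value in a (hypothetical) inactive data file."""
    size: int            # bytes on disk
    written_at: float    # write time, in seconds
    dead: bool = False   # True if deleted or overwritten by a newer value


def is_expired(entry: Entry, now: float, expiry_secs: float) -> bool:
    """An entry is expired once it is older than expiry_secs."""
    return now - entry.written_at > expiry_secs


def fragmentation_pct(entries: list[Entry]) -> float:
    """Percent of entries counted as garbage.

    Assumption taken from the thread: only dead entries count.
    Expired entries that were never deleted are NOT counted.
    """
    if not entries:
        return 0.0
    dead = sum(1 for e in entries if e.dead)
    return 100.0 * dead / len(entries)


def should_trigger_merge(entries: list[Entry], frag_trigger_pct: float) -> bool:
    """Decide whether a merge starts on its own (illustrative threshold)."""
    return fragmentation_pct(entries) >= frag_trigger_pct


def merge(entries: list[Entry], now: float, expiry_secs: float) -> list[Entry]:
    """Once a merge runs, both dead and expired entries are dropped."""
    return [e for e in entries
            if not e.dead and not is_expired(e, now, expiry_secs)]


# Insert-only workload: ten entries written at t=0, expiry of one day,
# inspected two days later. Nothing was ever deleted.
day = 86400
entries = [Entry(size=100, written_at=0.0) for _ in range(10)]
print(should_trigger_merge(entries, frag_trigger_pct=60))    # False
print(len(merge(entries, now=2 * day, expiry_secs=day)))     # 0
```

In this model the insert-only files never reach the trigger by
themselves, even though every entry in them has expired. A merge started
some other way (for example manually) would reclaim all of the space.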



