[ANN] Riak Release 0.11.0

Jason J. W. Williams jasonjwwilliams at gmail.com
Fri Jun 11 14:36:53 EDT 2010


Hi David,

That helps. I assumed it was non-blocking, but compaction is still a
pretty disk-intensive operation while it's running. Even though it's
occurring on only a single partition, the compaction I/O load will
affect the other partitions on the same disk.
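
For what it's worth, here is how I read the trigger/threshold scheme
Justin describes below, as a rough Python sketch. This is not Bitcask's
actual code: FileStats, trips_trigger, over_threshold and files_to_merge
are names I made up for illustration, the per-file fragmentation
percentage is taken as a given input rather than computed, and the
constants are just the values from the quoted config.

```python
from dataclasses import dataclass
from typing import List

# Values copied from the bitcask config excerpt quoted below.
FRAG_MERGE_TRIGGER = 60                # >= 60% fragmentation
DEAD_BYTES_MERGE_TRIGGER = 536870912   # dead bytes > 512 MB
FRAG_THRESHOLD = 40                    # >= 40% fragmentation
DEAD_BYTES_THRESHOLD = 134217728       # dead bytes > 128 MB
SMALL_FILE_THRESHOLD = 10485760        # file is < 10 MB


@dataclass
class FileStats:
    """Hypothetical per-file summary; not a real Bitcask structure."""
    name: str
    frag_pct: int      # fragmentation percentage, assumed to be given
    dead_bytes: int    # bytes occupied by stale entries
    total_bytes: int   # size of the data file


def trips_trigger(f: FileStats) -> bool:
    # A file exceeding ANY trigger value makes the partition need a merge.
    return (f.frag_pct >= FRAG_MERGE_TRIGGER
            or f.dead_bytes > DEAD_BYTES_MERGE_TRIGGER)


def over_threshold(f: FileStats) -> bool:
    # A file exceeding ANY threshold value is included in the merge.
    return (f.frag_pct >= FRAG_THRESHOLD
            or f.dead_bytes > DEAD_BYTES_THRESHOLD
            or f.total_bytes < SMALL_FILE_THRESHOLD)


def files_to_merge(files: List[FileStats]) -> List[str]:
    # No trigger exceeded anywhere in the partition: nothing to merge.
    if not any(trips_trigger(f) for f in files):
        return []
    # Otherwise every file over any threshold goes through compaction.
    return [f.name for f in files if over_threshold(f)]


if __name__ == "__main__":
    mb = 1024 * 1024
    partition = [
        FileStats("1.data", frag_pct=65, dead_bytes=1000, total_bytes=50 * mb),
        FileStats("2.data", frag_pct=45, dead_bytes=10, total_bytes=50 * mb),
        FileStats("3.data", frag_pct=5, dead_bytes=0, total_bytes=1 * mb),
        FileStats("4.data", frag_pct=10, dead_bytes=0, total_bytes=50 * mb),
    ]
    print(files_to_merge(partition))  # ['1.data', '2.data', '3.data']
```

In the example, 1.data trips the fragmentation trigger, which pulls in
2.data (over the 40% threshold) and 3.data (a small file) as well. Drop
1.data from the list and nothing gets merged, since no trigger fires.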

-J

On Fri, Jun 11, 2010 at 12:27 PM, David Smith <dizzyd at basho.com> wrote:
> Well, compaction does not mean that a partition is unavailable -- that is to
> say, compaction happens in a non-blocking manner. So, worst case, you'll
> have disk-related latency hits for a given partition, but requests should
> still be getting served. Also, a single node only compacts a single
> partition at a time.
> FWIW, in my own testing, with a 50/50 read/write mix, compaction (based on
> fragmentation %) typically doesn't happen that often, particularly when you
> have a cluster of machines.
> Hope that helps.
> D.
>
> On Fri, Jun 11, 2010 at 12:20 PM, Jason J. W. Williams
> <jasonjwwilliams at gmail.com> wrote:
>>
>> Is it smart enough to coordinate with the other partitions to ensure
>> not more than 25% (just a plug number) of the partitions are
>> compacting at the same time? It would seem to me there's the
>> possibility for a performance drop if you had the perfect storm of too
>> many shards compacting at the same time.
>>
>> -J
>>
>> On Fri, Jun 11, 2010 at 4:54 AM, Justin Sheehy <justin at basho.com> wrote:
>> > Hi, Germain.
>> >
>> > On Fri, Jun 11, 2010 at 11:07 AM, Germain Maurice
>> > <germain.maurice at linkfluence.net> wrote:
>> >
>> >> Because of its append-only nature, stale data is created, so how does
>> >> Bitcask remove stale data?
>> >
>> > An excellent question, and one that we haven't yet written enough about.
>> >
>> >> With CouchDB, the compaction process on our data never succeeds: too
>> >> much data.
>> >> I really don't like having to launch this kind of process manually.
>> >
>> > Bitcask's merging (compaction) process is automated and very tunable.
>> > These are the most relevant parameters in the bitcask section of your
>> > app.config:
>> >
>> > (see the whole thing at
>> > http://hg.basho.com/bitcask/src/tip/ebin/bitcask.app)
>> >
>> > %% Merge trigger variables. Files exceeding ANY of these
>> > %% values will cause bitcask:needs_merge/1 to return true.
>> > %%
>> > {frag_merge_trigger, 60},              % >= 60% fragmentation
>> > {dead_bytes_merge_trigger, 536870912}, % Dead bytes > 512 MB
>> >
>> > %% Merge thresholds. Files exceeding ANY of these values
>> > %% will be included in the list of files marked for merging
>> > %% by bitcask:needs_merge/1.
>> > %%
>> > {frag_threshold, 40},                  % >= 40% fragmentation
>> > {dead_bytes_threshold, 134217728},     % Dead bytes > 128 MB
>> > {small_file_threshold, 10485760},      % File is < 10 MB
>> >
>> > Every few minutes, the Riak storage backend for a given partition will
>> > send a message to bitcask, requesting that it queue up a possible
>> > merge job. (As a result of that queue, only one partition will be in
>> > the merge process at once.) The bitcask application will examine that
>> > partition when that request reaches the front of the queue. If any of
>> > the trigger values have been exceeded, then all of the files in that
>> > partition which exceed any threshold values will be run through
>> > compaction.
>> >
>> > This allows you a great deal of flexibility in your demands, and also
>> > provides reasonable amortization of the cost since each partition is
>> > processed independently.
>> >
>> > -Justin
>> >
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users at lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >
>>
>
>