Is storing billions of small files a good Riak-CS/KV usecase?
dzagidulin at basho.com
Wed Oct 7 10:23:41 EDT 2015
On second thought, ignore the Search recommendation. Search + Expiry
doesn't work very well (when objects expire from Riak, their search index
entries persist, except now those are orphaned).
On Wed, Oct 7, 2015 at 4:11 PM, Dmitri Zagidulin <dzagidulin at basho.com>
> Hi David,
> 1) Storing billions of small files is definitely a good use case for Riak
> KV. (Since they're small, there's no reason to use CS (now re-branded as
> Riak S2), which is meant for storing large objects.)
> 2) As far as deleting an entire bucket, that part is tougher.
> (Incidentally, if you were thinking of using Riak CS because it has a
> 'delete bucket' command -- that won't work. The delete bucket command
> requires all objects to be deleted first, meaning you can only perform it
> on an empty bucket. Which doesn't help you :) ).
> Your best bet is to use the Bitcask backend (instead of leveldb), and use
> its Automatic Expiration setting (see the Bitcask backend documentation,
> under 'Automatic Expiration').
> So, you can say:
> bitcask.expiry = 30d
> And all of the objects (in all buckets using that backend) will expire 30
> days after their last-modified timestamp. Which also effectively deletes
> the buckets they were in.
> Now, this setting is per-backend. So, if you need other buckets without
> expirations, you'd want to set up a Multi backend. So, the default backend
> could be a plain non-expiry Bitcask (or leveldb), and then a second backend
> would have the expiry setting. You can learn more in the Multi backend
> documentation.
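> As a rough sketch, a multi-backend riak.conf might look like the fragment
> below. The backend names, data_root paths, and the 'expiring' bucket type
> are placeholders of mine, not anything prescribed; check the config keys
> against your Riak version before using them:

```ini
## route storage through the Multi backend
storage_backend = multi
multi_backend.default = bitcask_default

## plain Bitcask, no expiry (the default backend)
multi_backend.bitcask_default.storage_backend = bitcask
multi_backend.bitcask_default.bitcask.data_root = /var/lib/riak/bitcask_default

## expiring Bitcask for the daily buckets
multi_backend.bitcask_expiry.storage_backend = bitcask
multi_backend.bitcask_expiry.bitcask.data_root = /var/lib/riak/bitcask_expiry
multi_backend.bitcask_expiry.bitcask.expiry = 30d
```

> You'd then point the expiring buckets at that backend via a bucket type,
> something along the lines of:
> riak-admin bucket-type create expiring '{"props":{"backend":"bitcask_expiry"}}'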
> What if you want to delete buckets but also use LevelDB?
> That depends on why you want LevelDB. If you're using it for Secondary
> Index capability -- you can use Search instead, that works with Bitcask.
> Or, on the flip side, you can use 2i (secondary index) queries to delete
> the bucket. You'd use a 2i query to get all the keys in an expiring bucket,
> and then issue Deletes to each key. (Don't forget to delete with a W value
> equal to your N value; otherwise you may have to run the query-plus-delete
> pass a few times, to account for stray missing replicas.)
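> A minimal sketch of that query-plus-delete pass, assuming the Python riak
> client (its bucket.get_index and bucket.delete calls); purge_bucket and
> the key range bounds are illustrative, not part of any Riak API:

```python
def purge_bucket(client, bucket_name, n_val=3):
    """Delete every key in a bucket: list keys via a $key 2i range
    query, then issue a delete per key with W equal to N."""
    bucket = client.bucket(bucket_name)
    # $key is the built-in 2i index over object keys (2i itself needs
    # the leveldb or memory backend); the bounds here are a rough
    # "everything" range for ASCII keys, adjust for your key scheme
    keys = bucket.get_index('$key', '0', '~~~~~~~~~~')
    deleted = 0
    for key in keys:
        bucket.delete(key, w=n_val)  # W = N, per the advice above
        deleted += 1
    return deleted
```

> As noted above, you may want to run it more than once, until the 2i query
> comes back empty, to catch stray replicas.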
> Does that help explain the situation?
> On Wed, Oct 7, 2015 at 3:43 PM, David Heidt <david.heidt at msales.com>
>> Hi List,
>> would you say that storing billions of very small (json) files is a good
>> usecase for riak kv or cs?
>> here's what I would do:
>> * create daily buckets (e.g. 2015-10-07)
>> * up to 130 million inserts per day
>> * about 150,000 read-only accesses/day
>> * no updates on existing keys/files
>> * delete buckets (including keys/files) older than x days
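>> For what it's worth, here's how I'd compute the daily bucket names and the
>> set past the retention window (daily_bucket and buckets_older_than are
>> made-up helpers of mine, not Riak API):

```python
from datetime import date, timedelta

def daily_bucket(d):
    # one bucket per day, named by ISO date, e.g. "2015-10-07"
    return d.isoformat()

def buckets_older_than(today, keep_days, history=365):
    # bucket names past the retention window: the deletion candidates,
    # scanning back 'history' days so old buckets aren't missed
    return [daily_bucket(today - timedelta(days=i))
            for i in range(keep_days + 1, history + 1)]
```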
>> I already have a working riak-kv/leveldb cluster (inserts and lookups are
>> going smoothly), but when it comes to mass deletion of keys I have found
>> no way to do it.
>> riak-users mailing list
>> riak-users at lists.basho.com