Measuring Riak disk usage

Ben McCann ben at
Wed Apr 10 00:39:03 EDT 2013


I'm currently storing data in MongoDB and would like to evaluate Riak as an
alternative. Riak is appealing to me because its LevelDB backend compresses
data with Snappy, so I would expect it to take less disk space to store my
data set than MongoDB, which does not use compression. However, when I
benchmarked this by inserting a few hundred thousand JSON records into each
datastore, Riak in fact took far more disk space. I'm wondering if there's
something I might be missing here as a newcomer to Riak. For example, I
checked the disk space used by running "du -ch /var/lib/riak/leveldb". Is
this perhaps not a good way to check disk space usage because Riak/LevelDB
preallocates files? (I know MongoDB does this and has a built-in
db.collection.stats() command to provide true disk usage information.) Are
there any other reasons why Riak might be taking more space, or anything I
could have screwed up?
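
One way to check whether preallocation is skewing the number is to compare
the apparent size of the files (their length in bytes) with the space that is
actually allocated on disk. du reports allocated blocks by default. The
Python sketch below computes both totals for a directory tree. It is only a
rough illustration: the function name disk_usage is made up for this example,
it assumes a POSIX system where st_blocks is counted in 512-byte units, and it
counts regular files only, so its totals can differ slightly from what du
prints (du also counts the directories themselves).

```python
import os
import stat


def disk_usage(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    apparent_bytes  sums the file lengths (st_size).
    allocated_bytes sums the blocks actually allocated (st_blocks * 512),
                    assuming the POSIX convention of 512-byte units.
    """
    apparent = 0
    allocated = 0
    seen = set()  # (device, inode) pairs, so hard links are counted once
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished (e.g. during compaction) or unreadable
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            apparent += st.st_size
            allocated += st.st_blocks * 512
    return apparent, allocated


if __name__ == "__main__":
    # Path taken from the du command above; adjust for your installation.
    data_dir = "/var/lib/riak/leveldb"
    apparent, allocated = disk_usage(data_dir)
    print("apparent size : %d bytes" % apparent)
    print("allocated size: %d bytes" % allocated)
```

If the allocated total is much larger than the apparent total, space is being
reserved beyond the data written so far. If the two are close, preallocation
is probably not the explanation and the extra space comes from somewhere else.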


