Measuring Riak disk usage

Matthew Von-Maszewski matthewv at basho.com
Wed Apr 10 14:44:11 EDT 2013


Paul,

There is a "tool" that will dump statistics for a given .sst file, including compression ratio.  It is really designed for in-house usage, so it can be a pain to build (just being honest, and setting expectations).

1. You have to download and install libsnappy.  Just having it embedded within Riak is not enough.  It needs to be installed on your system like other libraries.

2. You have to go into the leveldb source directory (you may need to download that too) and execute "make tools".

You are looking for the "sst_stat" program.  You would execute something like this using your local paths:

./sst_stat /var/lib/riak/leveldb/*/sst_*/*.sst

The output can be loaded into a spreadsheet for analysis.

This is the only thing I know to offer for your question.  Maybe others have better alternatives.
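
If building sst_stat is too much trouble, a rough complement is to track the raw size of the .sst files per vnode directory over time. The following is only a minimal sketch, assuming the directory layout shown in the command above (/var/lib/riak/leveldb/<vnode>/sst_*/*.sst); the function name is hypothetical and not part of Riak or leveldb. It reports on-disk (already compressed) bytes only and says nothing about compression ratio.

```python
import os


def sst_usage_by_vnode(leveldb_root):
    """Return {vnode_dir_name: total bytes of .sst files beneath it}.

    Assumption: each immediate subdirectory of leveldb_root is one vnode,
    and its table files end in ".sst" (as in the sst_stat command above).
    """
    totals = {}
    for vnode in sorted(os.listdir(leveldb_root)):
        vnode_path = os.path.join(leveldb_root, vnode)
        if not os.path.isdir(vnode_path):
            continue  # skip stray files at the top level
        totals[vnode] = 0
        # Walk every level under the vnode so sst_0, sst_1, ... are all counted.
        for dirpath, _dirnames, filenames in os.walk(vnode_path):
            for name in filenames:
                if name.endswith(".sst"):
                    totals[vnode] += os.path.getsize(os.path.join(dirpath, name))
    return totals
```

Calling sst_usage_by_vnode("/var/lib/riak/leveldb") at regular intervals and appending the results to a CSV file gives a growth curve that can be loaded into the same spreadsheet.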

Matthew



On Apr 10, 2013, at 2:25 PM, Paul Wagner <paul at luminoso.com> wrote:

> So as far as profiling disk usage of certain newly created objects for future planning, is there a specific query that might return recently created objects and their size on disk?
> 
> Looking at the contents of .../riak/leveldb I can see each of the nodes and the sorted string tables. I can follow the size [du -h] of the leveldb folder overall as data is added, but the feature/command we're looking for is a way to profile usage now in order to plan for future disk usage.  Something like http://fqdn.hostname.com:8098/riak/usage perhaps...
> 
> Is this something that is limited to Riak Enterprise?
> 
> P
> 
> On Wed, Apr 10, 2013 at 2:17 PM, Ben McCann <ben at benmccann.com> wrote:
> Cool. Thanks for the explanation.
> 
> 
> On Wed, Apr 10, 2013 at 10:55 AM, Reid Draper <reiddraper at gmail.com> wrote:
> 
> On Apr 10, 2013, at 1:45 PM, Jeremiah Peschka <jeremiah.peschka at gmail.com> wrote:
> 
>> If you've installed from the apt/yum repository you've installed a single Riak node on your machine. Riak, though, is configured by default to write data to three servers. If some of those servers aren't available, Riak is going to write to a different server via hinted handoff[1]. Since you are only running one node, that single node receives all copies of your data in the hopes that some day the other Riak servers in the cluster will come back for their data.
> 
> This isn't quite true. A single-node cluster simply stores three copies on three of its local vnodes. No hinted handoff is involved. If you add nodes to the cluster, as Evan said, data will be distributed amongst them.
> 
> Ben,
> 
> tl;dr, Riak will store three copies of your data, by default, even if you're using a single server.
> 
> Reid
> 
> 
> 
> 
> -- 
> about.me/benmccann
> 
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
