Parameter Planning (eleveldb)

Simon Effenberg seffenberg at team.mobile.de
Mon Feb 4 02:18:31 EST 2013


Thanks again Matthew,

I think for now I can start with this. I'll have 256/6 partitions per
node, and if one node dies the remaining ones have to handle more, so
256/5 partitions per node multiplied by 92 open files is about 4711 as
the ulimit setting, right?
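
Just to sanity-check that arithmetic, a quick sketch in Python (the
numbers are the ones from this thread, and it only counts leveldb
files, not sockets or other descriptors):

  import math

  ring_size = 256
  nodes = 6
  max_open_files = 92                        # per vnode, from the calculation below

  # if one node dies, its partitions are spread across the remaining five
  vnodes_per_node = ring_size / (nodes - 1)  # 51.2
  print(math.ceil(vnodes_per_node * max_open_files))  # 4711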

One question that comes to mind: the Tips & Tricks section talks about
the noop elevator/disk scheduler, whereas this post:
http://riak-users.197444.n3.nabble.com/Riak-performance-problems-when-LevelDB-database-grows-beyond-16GB-tp4025608p4025622.html
recommends deadline for spinning disks (which is what we will have). So
which one is right, or which one is outdated?
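
For reference, the active scheduler can be checked (and switched) per
block device through sysfs; a small sketch, where sda is just a
placeholder device name and switching requires root:

  DEV = "sda"
  path = f"/sys/block/{DEV}/queue/scheduler"

  # the scheduler shown in brackets is the active one, e.g. "noop [deadline] cfq"
  with open(path) as f:
      print(f.read().strip())

  # to switch, write the desired name back (as root):
  # with open(path, "w") as f:
  #     f.write("deadline")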

Thanks again for your help!!

Cheers,
Simon

On Sun, 3 Feb 2013 17:55:38 -0500
Matthew Von-Maszewski <matthewv at basho.com> wrote:

> I will assume you use the default write buffer settings … cuz that is a whole different discussion and there are two settings not one (_min and _max).
> 
> The default min is 32M and the default max is 64M … so your value is 48M for the average_write_buffer_size.
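> 
> (That is, average_write_buffer_size = (32M + 64M) / 2 = 48M, the midpoint of the default min and max.)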
> 
> The Super Bowl is about to start here, so I am not performing a detailed check of your math.  However, I can say the 92 open files figure looks correct compared to similar systems.
> 
> What questions remain?
> 
> Matthew
> 
> 
> On Feb 3, 2013, at 5:44 PM, Simon Effenberg <seffenberg at team.mobile.de> wrote:
> 
> > Hi Matthew,
> > 
> > thanks a lot!
> > 
> > So now I have:
> > 
> > 6 nodes each having 32GB RAM:
> > 
> > vnode_working_memory = 16GB / (256 / 6) (50% of RAM divided by the
> > vnodes per node, i.e. ring size divided by node count) ~= 390 MB
> > 
> > open_file_memory =
> >   (max_open_files-10) * (
> >    184 + (104MB/2048) *
> >    (8 + ((16+14336)/2048 +1) *
> >    0.6)
> >   )
> > 
> > Now I'm missing max_open_files... how do I calculate it?
> > I'm also missing average_write_buffer_size (see my question in Step 4).
> > 
> > If I use the default values for average_write_buffer_size, then
> > max_open_files can be calculated like this:
> > 
> > memory/vnode = average_write_buffer_size + cache_size +
> > open_file_memory + 20 MB
> > <=> (memory/vnode) - 20 MB - cache_size - average_write_buffer_size =
> >    open_file_memory
> > 
> > so with the default values:
> > 
> > open_file_memory = 390MB - 20MB - 8MB - 45MB = 317MB
> > 
> > and now max_open_files would be
> > 
> > open_file_memory = (max_open_files-10) * (184 + (104MB/2048) * (8 + ((16
> >                   +14336)/2048 +1) * 0.6))
> > <=> (max_open_files-10) = open_file_memory / (184 + (104MB/2048) * (8 +
> >                          ((16+14336)/2048 +1) * 0.6))
> > <=> max_open_files = open_file_memory / (184 + (104MB/2048) * (8 +
> >                     ((16+14336)/2048 +1) * 0.6)) + 10
> > <=> max_open_files = 317MB / (184+53248*(8+67.8)) + 10
> > <=> max_open_files = 317MB / 4036382.4 + 10
> > <=> max_open_files ~= 92
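> > 
> > The same arithmetic as a small Python sketch (purely illustrative; the
> > variable names are mine, and the per-file cost is just the 4036382.4
> > bytes computed above):
> > 
> >   MB = 1024 * 1024
> > 
> >   vnode_working_memory = 390 * MB   # step 2: 50% of RAM / vnodes per node
> >   write_buffer = 45 * MB            # assumed default average write buffer
> >   cache_size = 8 * MB               # default cache_size
> >   overhead = 20 * MB                # fixed per-vnode overhead from step 5
> > 
> >   open_file_memory = vnode_working_memory - overhead - cache_size - write_buffer
> >   per_file_cost = 4036382.4         # bytes per open file, from the formula above
> > 
> >   max_open_files = open_file_memory / per_file_cost + 10
> >   print(open_file_memory // MB, round(max_open_files))   # 317 92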
> > 
> > That would be the maximum number of open files a server can handle
> > (per vnode), am I right? But is this enough? How do I account for a
> > temporary loss of 50% of the servers (3 of 6), and how is the number
> > of keys/values taken into account? I'm somewhat lost :(
> > 
> > Cheers
> > Simon
> > 
> > On Sun, 3 Feb 2013 16:12:25 -0500
> > Matthew Von-Maszewski <matthewv at basho.com> wrote:
> > 
> >> First:  Step 2 is talking about how many vnodes exist on a physical server.  If your ring size is 256, but you have 8 servers … then your vnode count for step 2 is 32.
> >> 
> >> Second:  the 2048 is a constant forced by Google's leveldb implementation.  It is the portion of a file covered by a single bloom filter.  This calculation constant disappears with the upcoming 1.3 release.
> >> 
> >> Third:  yes there is a "block_size" parameter that is 4096.  Increase that only if you want to REDUCE the performance of the leveldb instance.  4096 is a very happy value.  We have customers and tests with 130K data values, all using a 4096 block size.  The block_size only governs the minimum amount written at once (the aggregate size of small values that must be written together as one unit).
> >> 
> >> Use 104Mbyte for your average sst file size.  It is "good enough"
> >> 
> >> 
> >> I am not following the question stream for Step 4 and beyond.  Please state again.
> >> 
> >> Matthew
> >> 
> >> 
> >> 
> >> 
> >> On Feb 3, 2013, at 3:44 PM, Simon Effenberg <seffenberg at team.mobile.de> wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I'm not sure I understand all of this well enough to calculate the
> >>> memory usage per file and the other values.
> >>> 
> >>> The web page walks through the steps, but I'm completely unsure whether I understand all of the parameters.
> >>> 
> >>> "Step 1: Calculate Available Working Memory"
> >>> 
> >>> taking the example:
> >>> 
> >>> leveldb_working_memory = 32G * (1 - .50) = 16G
> >>> 
> >>> "Step 2: Calculate Working Memory per vnode"
> >>> 
> >>> vnode_working_memory = leveldb_working_memory / vnode_count
> >>> 
> >>> vnode_count = 256
> >>> 
> >>> => vnode_working_memory = 16G / 256 = 64MB/vnode
> >>> 
> >>> also easy
> >>> 
> >>> "Step 3: Estimate Memory Used by Open Files"
> >>> 
> >>> open_file_memory =
> >>>  (max_open_files-10) * (
> >>>    184 + (average_sst_filesize/2048) *
> >>>    (8 + ((average_key_size+average_value_size)/2048 +1) *
> >>>    0.6)
> >>>  )
> >>> 
> >>> So how do I know the average_sst_filesize (and what exactly is this value)?
> >>> Is 2048 correct for both /2048 occurrences, or is it 4096 in Riak 1.2? And
> >>> how do I know the max_open_files?
> >>> 
> >>> 
> >>> average_key_size could be 16 bytes (I have to ask someone, but let's take it for now)
> >>> average_value_size will be 14 kbytes
> >>> 
> >>> so for now
> >>> 
> >>> open_file_memory =
> >>>  (max_open_files-10) * (
> >>>    184 + (average_sst_filesize/2048) *
> >>>    (8 + ((16+14336)/2048 +1) *
> >>>    0.6)
> >>>  )
> >>> 
> >>> (Side question: should I increase the block_size because of the large average value size?
> >>> And should I leave cache_size at the default value, as was recommended?)
> >>> 
> >>> "Step 4: Calculate Average Write Buffer"
> >>> 
> >>> Should I increase these values or not? If only two are held in memory and I have, as in
> >>> this scenario, 32GB of RAM, shouldn't I increase them to something else?
> >>> 
> >>> "Step 5: Calculate vnode Memory Used"
> >>> 
> >>> memory/vnode = average_write_buffer_size + cache_size + open_file_memory + 20 MB
> >>> 
> >>> So for now I'm missing almost all three values :(.
> >>> 
> >>> To give you an idea:
> >>> 
> >>> - 3 buckets
> >>> - overall ~343347732 keys (but only 2/3 of them average 14 kbytes)
> >>> 
> >>> 
> >>> Thanks for the help!
> >>> Simon
> >>> 
> >> 
> > 
> > 
> 


-- 
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Fon:     + 49-(0)30-8109 - 7173
Fax:     + 49-(0)30-8109 - 7131

Mail:     seffenberg at team.mobile.de
Web:    www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany


Geschäftsführer: Malte Krüger
HRB Nr.: 18517 P, Amtsgericht Potsdam
Sitz der Gesellschaft: Kleinmachnow 



