innostore performance tuning

David Koblas david at koblas.com
Tue Aug 30 12:34:35 EDT 2011


Yes - but the thought of sorting 800M records, each about 8k in size,
is a little daunting...  Something like a 6TB sort...  Plus it doesn't
address the ongoing insert problem: 20 keys/sec isn't workable.
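
For scale, the figures above can be checked with a rough back-of-the-envelope sketch. It assumes "8k" means 8,000 bytes per record and that a single import process runs at a constant rate; the function names are only illustrative and nothing here comes from Riak or innostore itself.

```python
# Rough estimates for the import discussed in this thread.
# Assumptions: "8k" = 8,000 bytes per record, one import process,
# constant throughput. Names below are illustrative only.

RECORDS = 800_000_000        # "800M records"
RECORD_BYTES = 8 * 1000      # "about 8k in size" (assumed 8,000 bytes)
SECONDS_PER_DAY = 86_400


def dataset_terabytes(records, record_bytes):
    """Total payload size in decimal terabytes (10**12 bytes)."""
    return records * record_bytes / 1e12


def import_days(records, keys_per_sec):
    """Days needed to insert `records` keys at a steady rate."""
    return records / keys_per_sec / SECONDS_PER_DAY


print(f"dataset size:     {dataset_terabytes(RECORDS, RECORD_BYTES):.1f} TB")
print(f"at  20 keys/sec:  {import_days(RECORDS, 20):.0f} days")
print(f"at 250 keys/sec:  {import_days(RECORDS, 250):.0f} days")
```

Under those assumptions the payload comes to about 6.4 TB, and the import would take roughly 463 days at 20 keys/sec versus roughly 37 days at 250 keys/sec.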

--david

On 8/30/11 9:27 AM, Kresten Krab Thorup wrote:
> If you can insert the objects in ascending key order, then innostore will be much faster than with random inserts.
>
>
> On Aug 30, 2011, at 6:14 PM, David Koblas wrote:
>
> I'm currently working on importing a very large dataset (800M records) into Riak and running into some serious performance problems.  Hopefully this is just a configuration issue and nothing deeper...
>
> Hardware -
>    * 8 proc box
>    * 32 GB RAM
>    * 5TB disk - RAID10
>
> I have a cluster of 4 of these boxes, all running Riak.  Riak configuration options that differ from stock:
>
>    * Listening on all IP addresses "0.0.0.0"
>    * {storage_backend, riak_kv_innostore_backend},
>    * innostore section - {buffer_pool_size, 17179869184}, %% 16GB
>    * innostore section - {flush_method, "O_DIRECT"}
>
> What I see is that my import script runs at about 200-300 keys/sec for keys it has seen recently (e.g. re-runs), then drops to about 20 keys/sec for new keys:
> STATS: 1000 keys handled in 3 seconds 250.75 keys/sec
> STATS: 1000 keys handled in 3 seconds 258.20 keys/sec
> STATS: 1000 keys handled in 4 seconds 240.11 keys/sec
> STATS: 1000 keys handled in 5 seconds 177.63 keys/sec
> STATS: 1000 keys handled in 4 seconds 246.26 keys/sec
> STATS: 1000 keys handled in 5 seconds 184.79 keys/sec
> STATS: 1000 keys handled in 5 seconds 195.95 keys/sec
> STATS: 1000 keys handled in 47 seconds 21.02 keys/sec
> STATS: 1000 keys handled in 44 seconds 22.63 keys/sec
> STATS: 1000 keys handled in 42 seconds 23.64 keys/sec
> STATS: 1000 keys handled in 43 seconds 22.88 keys/sec
> STATS: 1000 keys handled in 45 seconds 22.12 keys/sec
> STATS: 1000 keys handled in 43 seconds 22.83 keys/sec
> STATS: 1000 keys handled in 43 seconds 23.11 keys/sec
> Of course, with 800M records to import, a rate of 20 keys/sec is not usable, and an ongoing insert rate at that level is going to be a problem as well.
>
> Questions -
>    Are there additional things to change for imports and datasets on this scale?
>    Is there a way to get additional debugging to see where the performance issues are?
>
> Thanks,



