Riak performance problems when LevelDB database grows beyond 16GB

Matthew Von-Maszewski matthewv at basho.com
Mon Oct 22 12:20:23 EDT 2012


Jan,

I apologize for the delayed response.

1.  Did you realize that the "log_jan.txt" file from #1 below documents a hard disk failure?  You mentioned a failed drive once; I am not sure whether this is the same drive.


2.  The "sse4_2" tells me that your Intel cpu supports hardware CRC32c calculation.  This feature is not useful to you at this moment (unless you want to pull the mv-hardware-crc branch from basho/leveldb).  It will bring some performance improvements to you in the next release IF we do not decide that your problems are hard disk performance limited.


3.  This just confirmed for me that app.config is not accidentally exhausting your physical memory.  The app.config file you posted suggested this was not the case, but I wanted to verify.
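For reference, this is how to read that check (the sample numbers below are illustrative only):

    grep -i swap /proc/meminfo
    # SwapTotal:      8388604 kB
    # SwapFree:       8388604 kB

If SwapFree stays close to SwapTotal, the node is not swapping, i.e. app.config is not pushing you past physical memory.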

You also mention a basho_bench failure.  Is this the same test run that produced the log_jan.txt file?  The hard drives logged their first failure at:

2012/10/18-02:08:44.136238 7f8297fff700 Compaction error: Corruption: corrupted compressed block contents

And things go really bad at:

2012/10/18-06:10:37.657072 7f829effd700 Moving corrupted block to lost/BLOCKS.bad (size 1647)
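If you want to pull out every corruption event across the node in one pass, the same LOG files can be filtered like this:

    sort /home/riak/leveldb/*/LOG* | grep -i -E 'corruption|corrupted'

That keeps the events in timestamp order, the same way the log_jan.txt file was built.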


4.  I was looking to see whether your data compresses well.  The answer is that it does: you are achieving a 2x to 2.6x compression ratio.  Since you are concerned about throughput, I was verifying that the time leveldb spends on block compression is worthwhile for you (it is).
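If you ever want to sanity-check that ratio yourself, a back-of-the-envelope comparison is enough.  The key count and value size below are placeholders (take the real ones from your basho_bench config), and n_val=3 is the Riak default replication factor:

    # approximate raw bytes written to the cluster = keys * value_size * n_val
    # (placeholder numbers)
    echo $((100000000 * 1000 * 3))
    # bytes on disk on this node; multiply by the node count for the cluster total
    du -sb /home/riak/leveldb

Dividing the raw bytes by the cluster-wide on-disk total gives the effective compression ratio.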


My next question is whether the drive / disk array problems are your only problem at this point.  The data in log_jan.txt looks fine until the failures start.  I am willing to keep working on this, but I need to better understand your next level of problems.

Matthew


On Oct 19, 2012, at 7:49 AM, <Jan.Evangelista at seznam.cz> wrote:

> Hi Matthew,
> 
> Big thanks for responding. I see that you are the main committer to Basho's 
> leveldb code. :-)
> 
>> 1.  Execute the following on one of the servers:
>> sort /home/riak/leveldb/*/LOG* >log_jan.txt
> 
> See the attached file log_jan.txt.gz. It is from the stalled Riak node.
> 
>> 2.  Execute the following on one of the servers:
>>  grep -i flags /proc/cpuinfo
> 
> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat 
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm 
> constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc 
> aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr 
> pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx lahf_lm arat epb 
> xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
> 
>> 3.  On a running server that is processing data, execute:
>> grep -i swap /proc/meminfo
> 
> I will restart the test and report back when it stalls again. In the meantime,
> I am sending you yesterday's Zabbix graph showing memory usage on the node 
> (attached file ZabbixMemory.png). The time when the node stopped responding is
> logged as:
> 
> 2012-10-18 08:28:47.537 [error]  ** Node 'riak@172.16.0.6' not responding **
> 
> I am also attaching the corresponding basho_bench output of the test. The test
> was started on Oct 17 16:38 with an empty database, and it was run on a plain
> ext4 partition (no RAID).
> 
>> 4.  Pick a server, then one directory in /home/riak/leveldb.  Select 3 of 
>> the largest *.sst files.  Tar/gzip those and email back.
> 
> I will send them in the next mail. I can also put the entire database 
> somewhere for you to download, if you need it.
> 
> Thanks, Jan
> 
> Attachments: log_jan.txt.gz, ZabbixMemory.png, 4Riak_2K_2.1RC2_noraid.png




