slow 2 node cluster

Aphyr aphyr at aphyr.com
Sun Nov 20 15:56:09 EST 2011


On 11/20/2011 12:14 PM, Catalin Constantin wrote:
> I am 100% sure the transfer rate is 10MBytes / second. This is not the
> problem.

In ten years of network administration I have never encountered an 
ethernet device with a wire rate of 10 MBps. I have, however, 
encountered frequent confusion over units. Perhaps you understand my 
suspicion here. :)
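
For reference, here is a rough sketch of the unit arithmetic in Python. 
It assumes 8 bits per byte and ignores framing and protocol overhead, so 
real throughput will be somewhat lower; the function name is only 
illustrative.

```python
def mbit_to_mbyte_per_s(megabits_per_second):
    """Convert megabits/s (Mbps) to megabytes/s (MBps), at 8 bits per byte."""
    return megabits_per_second / 8.0


# Common Ethernet wire rates, in megabits per second.
for name, mbps in [("10BASE-T", 10), ("Fast Ethernet", 100), ("Gigabit", 1000)]:
    print(f"{name}: {mbps} Mbps = {mbit_to_mbyte_per_s(mbps):.2f} MBps")
```

None of the standard rates works out to 10 MBps, which is why a measured 
figure of exactly 10 is worth double-checking.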

> IOWAIT is also pretty low. iostat shows: iowait 4.69%

As I said in my earlier email, iowait values as low as 2.6% can indicate 
contention, depending on your CPU arch and utilization. I would 
investigate your disks more closely. Are they spinning or solid-state? 
Median seek time? 95/99 seek times? Disk cache? Does the riak process 
spend disproportionate time in IO_WAIT relative to USER? Filesystem 
atime/relatime/noatime? FS block size properly aligned for your disk? 
Insufficient filesystem or leveldb cache? hdparm options? Is riak on an 
independent disk or competing with other processes, e.g. syslog, file 
servers, etc? Does strace show an unusual amount of time spent in 
certain system calls?
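
If you want a quick number for the disk itself, something like the 
following sketch might help. It times small synchronous writes (write 
plus fsync) and reports the median, 95th and 99th percentile latency. 
This is only a crude proxy: it measures write+fsync latency, not seek 
time, and it says nothing about how riak or leveldb actually issue their 
writes. The function names and the defaults (200 writes of 4096 bytes) 
are my own assumptions.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies(directory, writes=200, size=4096):
    """Time `writes` synchronous writes of `size` bytes in `directory`.

    Returns a list of per-write latencies in seconds.
    """
    payload = os.urandom(size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to the device before timing stops
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return median, p95 and p99 of a non-empty list of latencies."""
    ordered = sorted(latencies)

    def pct(p):
        # Nearest-rank style percentile, clamped to the last element.
        return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

    return {
        "median": statistics.median(ordered),
        "p95": pct(0.95),
        "p99": pct(0.99),
    }
```

Run it against the directory that holds your riak data, for example 
summarize(fsync_latencies("/var/lib/riak")), where the path is only a 
guess at your layout, and compare with the same test on an idle disk.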

It might just be leveldb, too. I only have experience with bitcask in 
production.

> I have retried the test with one node, a new bucket newly created where
> i have set: n to 1.
> bucket.set_n_val(1)
>
> Results are the same. Less than 300 inserts / second.

This is good; it rules out replication.

> Any idea why riak is so slow on inserting data ?

Disk, disk disk disk, disk disk? Disk!

You could also look at the client. Are you writing to a local node or a 
remote one? Is your client's threading model getting in the way? Can you 
actually produce data fast enough to insert it? Is your client fighting 
for the same resources as the riak process? Presuming you've ruled these 
out, it's almost certainly network or disk.
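
One way to check whether the client can produce data fast enough is to 
time payload generation alone, with no riak calls at all. A minimal 
sketch, assuming JSON records of roughly 512 bytes; make_record is a 
hypothetical stand-in for whatever your client really builds:

```python
import json
import time


def make_record(i):
    """Hypothetical payload builder; substitute your client's real one."""
    return json.dumps({"id": i, "body": "x" * 512})


def production_rate(count=10000):
    """Return records generated per second, with no network or disk I/O."""
    start = time.perf_counter()
    for i in range(count):
        make_record(i)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return count / max(elapsed, 1e-9)
```

If production_rate() comes back far above 300 per second, the client's 
data generation is probably not the bottleneck; if it is close to 300, 
look there first.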

--Kyle



