Durable writes and parallel reads

Erik Søe Sørensen ess at trifork.com
Tue Nov 1 10:53:12 EDT 2011

On 31-10-2011 17:38, Erik Søe Sørensen wrote:
> Parallel Reads.
> ---------------
> Within a vnode, bitcask read operations happen in serial.
> Is there any reason for reads not happening in parallel?

I've made a small test of this - just to check that my intuition isn't 
off track.

In the test, I
- create a 2GB file
- clear the disk caches
- read 1000 randomly-placed 1KB blocks from the file, from Erlang.
The last two steps are repeated for each read strategy.
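
For readers who want to try something similar, the steps above can be sketched roughly as follows. This is a Python stand-in, not the Erlang code used for the numbers in this mail; the names random_offsets and timed_reads are made up for illustration, and os.pread is only available on Unix-like systems.

```python
import os
import random
import time

BLOCK_SIZE = 1024  # 1KB blocks, as in the test described above


def random_offsets(file_size, count, seed=None):
    """Pick `count` random block start positions that fit inside the file."""
    rng = random.Random(seed)
    return [rng.randrange(0, file_size - BLOCK_SIZE + 1) for _ in range(count)]


def timed_reads(path, offsets):
    """Read one block at each offset, in the order given, from one descriptor.

    Returns (average seconds per block, total bytes read).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        total = 0
        for pos in offsets:
            total += len(os.pread(fd, BLOCK_SIZE, pos))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / len(offsets), total
```

Passing sorted(offsets) instead of offsets gives the "sorted by position" variant. The disk caches have to be cleared between runs for the timings to mean anything; on Linux this can be done as root by running sync and then writing 3 to /proc/sys/vm/drop_caches.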

On my setup (Ubuntu laptop), I get the following read timings (per block):
- Calling file:pread/3 in one process:  8.2ms
- Same, but sort the reads by position: 5.7ms
- Calling file:pread/3 from separate processes (limited to 20
  simultaneous outstanding reads):  5.8ms
- Calling file:pread/3 from separate processes (limited to 50
  simultaneous outstanding reads):  5.4ms

(NB: The parallel strategy only helps if a separate file descriptor is
used for each read; otherwise no improvement is observed.)
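
The parallel strategy, with a bounded number of outstanding reads and one descriptor per read, might look something like the sketch below. Again this is Python rather than the Erlang used in the test: threads stand in for Erlang processes, and read_block, parallel_reads and max_outstanding are hypothetical names. The sketch simply follows the note above by opening a private descriptor for every read; it makes no claim about whether Python would show the same shared-descriptor effect.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1024  # 1KB blocks, as in the test described above


def read_block(path, pos):
    """Open a private descriptor, read one block at `pos`, and close it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, BLOCK_SIZE, pos)
    finally:
        os.close(fd)


def parallel_reads(path, offsets, max_outstanding=20):
    """Read all blocks with at most `max_outstanding` reads in flight.

    Results come back in the same order as `offsets`.
    """
    with ThreadPoolExecutor(max_workers=max_outstanding) as pool:
        return list(pool.map(lambda pos: read_block(path, pos), offsets))
```

With many reads in flight, the kernel's I/O scheduler gets the chance to reorder them by disk position, which is presumably where the gain over the single-process case comes from.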

This means that read ordering really does matter - and that the
potential throughput gains may be as much as 50% (8.2ms down to 5.4ms
per block, i.e. significant).

As to whether this also holds in a Riak context, I've tried starting 
multiple simultaneous instances of these strategies, each working on 
different files (simulating multiple vnodes working from the same disk), 
and observed similar improvements (30-45% for three instances).

(For completeness, I must add that this may be highly I/O system
dependent. The above numbers are from the 'anticipatory' I/O scheduler
strategy for Linux; switching to the 'CFQ' strategy reduces the benefits
a lot - and also makes the absolute numbers worse.)
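
To check which scheduler a disk is using before comparing numbers, Linux exposes it in sysfs, with the active one shown in square brackets. A small Python sketch, assuming the disk is called sda (adjust the device name for your system; both function names are made up here):

```python
def active_scheduler(sysfs_text):
    """Return the bracketed (active) entry from a sysfs scheduler line."""
    for name in sysfs_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def read_scheduler(device="sda"):
    """Read the active I/O scheduler for a block device (Linux only)."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return active_scheduler(f.read())
```

Which schedulers are listed depends on the kernel version and configuration.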

Erik Søe Sørensen
Trifork A/S

[Code is available on request.]

More information about the riak-users mailing list