Race condition reading objects
fearsome.lucidity at gmail.com
Sun Oct 30 19:58:07 EDT 2011
I am finding that there appears to be some sort of race condition when
reading recently written objects (as in concurrently). I am using Riak
1.0.0 with the leveldb backend through the multi backend in a 3 node
cluster. Writes are done with W=2 and reads with R=2. The client is using
the riak client Ruby gem.
The issue cropped up while working on a data loading script. The script
loads data from a file and inserts it into the cluster. It attempts to do so
in parallel, with configurable concurrency. This data is largely
non-repetitive: an object is usually written only once, and that has worked
without major issue. I recently changed the script to collect statistics on some
of the data being inserted, and insert the stats into a different bucket.
The stats are written in JSON and keyed by a value in the data being
loaded. The script will attempt to fetch the stats object for the key
currently under consideration; if it finds one, it merges in the new stats,
and then it stores the new or updated object.
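
To make the fetch, merge, and store cycle concrete, here is a minimal sketch
in Ruby. It is not the actual script: STORE is a hypothetical in-memory
stand-in for the stats bucket, merge_stats and update_stats are illustrative
names, and the stats are assumed to be flat hashes of numeric counters.

```ruby
require 'json'

# Hypothetical in-memory stand-in for the stats bucket; the real script
# fetches and stores through the Riak client instead.
STORE = {}

# Merge two stats hashes by summing counters key by key.
# The flat counter shape is an assumption made for illustration.
def merge_stats(existing, incoming)
  existing.merge(incoming) { |_key, old, new| old + new }
end

# Fetch the stats object for a key (if any), merge in the new stats,
# and store the new or updated object as JSON.
def update_stats(key, incoming)
  raw = STORE[key]
  current = raw ? JSON.parse(raw) : {}
  STORE[key] = JSON.generate(merge_stats(current, incoming))
end

update_stats('user-1', { 'rows' => 2 })
update_stats('user-1', { 'rows' => 3, 'errors' => 1 })
```

With several workers running this cycle on the same key at once, a fetch can
land while another worker's store is still in flight, which is the overlap
in question here.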
Once some of the stats objects started to grow into the KB range, the
reading of some existing stat objects started to fail. Upon examination it
seems the data in the object was being truncated, and thus the Riak client
failed to deserialize the object as it was no longer valid JSON. But when I
fetched the object manually, it was returned complete. I added a loop to the
script to retry such truncated fetches, and I found that they would succeed
after a few tries.
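
The retry workaround looks roughly like the sketch below. This is a
simplified illustration rather than the real code: fetch_with_retry is a
hypothetical helper, the block stands in for the actual fetch through the
Riak client, and the truncated bodies are simulated with hard-coded strings.

```ruby
require 'json'

# Retry a fetch until its body parses as JSON, up to max_tries attempts.
# The block stands in for the real fetch call (hypothetical).
def fetch_with_retry(max_tries: 5, delay: 0.1)
  tries = 0
  begin
    tries += 1
    JSON.parse(yield)
  rescue JSON::ParserError
    # Give up and re-raise once the attempt budget is spent.
    raise if tries >= max_tries
    sleep(delay)
    retry
  end
end

# Simulated fetch: a truncated body twice, then the complete object.
responses = ['{"rows": 1', '{"rows": 12, "err', '{"rows": 12, "errors": 3}']
stats = fetch_with_retry(delay: 0) { responses.shift }
```

This only hides the symptom. It does not explain why a truncated body is
returned in the first place.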
It would thus appear that Riak is making the new object available to be
fetched before its data is fully stored, leading to the apparently
truncated return. The issue only becomes visible once the object is large
enough to introduce enough delay in processing for store and fetch
operations to overlap. Using W=2 and R=2 probably has no effect as only
the vclocks are compared, not the actual data stored. Not sure if this is
an issue with the new leveldb backend or the KV code.
Has anyone seen this? Is it expected behavior? Shouldn't the new object only
be exposed after it has been completely received and stored?