Data loads

Pinney Colton pinney.colton at bitwisedata.com
Thu Aug 30 10:59:20 EDT 2012


Hi all -

This is my first post to the list.  I'm a relative Riak newbie, though I
have some experience working with multi-terabyte datasets on other
platforms.  Last night, I kicked off my first "large" load of data as a
test of the platform with about 1,000,000 json objects being loaded into a
bucket.  I had a couple performance issues, so I'm wondering if someone on
the list could be kind enough to answer a few questions that will help me
troubleshoot.

a) I haven't analyzed all of my load log data yet, but it looks like writes
went from about 0.02 seconds per object to a couple of minutes per object!
This is the typical "dev" setup from the tutorials, and I forgot to divide
available RAM by 4 to arrive at a number per node. Is this likely the
result of a memory constraint, or should I be looking elsewhere, beyond
just bumping the memory on my VM?  I looked at the logs, but I'm not sure
what I should be looking for.
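
To pin down where in the load the slowdown begins, per-object write times
could be summarized along these lines. This is only a rough sketch: `put` is
a hypothetical stand-in for whatever client call stores one object, and the
window size and slowdown factor are arbitrary assumptions, not Riak settings.

```python
import time
from statistics import median


def timed_load(put, objects, window=1000):
    """Call put(obj) for every object and return the median write
    latency (in seconds) for each consecutive window of writes.

    `put` is a hypothetical callable that stores a single object.
    """
    medians = []
    batch = []
    for obj in objects:
        start = time.perf_counter()
        put(obj)
        batch.append(time.perf_counter() - start)
        if len(batch) == window:
            medians.append(median(batch))
            batch = []
    if batch:  # final, partially filled window
        medians.append(median(batch))
    return medians


def first_slow_window(medians, factor=10.0):
    """Return the index of the first window whose median latency exceeds
    `factor` times the first window's median, or None if none does."""
    if not medians:
        return None
    baseline = medians[0]
    for i, m in enumerate(medians):
        if m > factor * baseline:
            return i
    return None
```

Multiplying the returned window index by the window size gives an
approximate object count at which writes started to degrade, which can then
be lined up against memory usage on the VM at that point in the load.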

b) I am using protocol buffers, and I saw similar initial performance when
running the load from a separate machine vs. having the data on the Riak
machine itself.  Would you recommend loading from a separate machine?  I'm
wondering if there is any hard-and-fast rule re: CPU/memory contention on
the Riak machine vs. network performance when loading from a different
machine.

c) I'm using a sha256 hash as my bucket name.  I read that buckets and keys
are concatenated internally and that all objects have just one "bucketkey".
Am I putting significantly more pressure on memory by using such a long
bucket name?  Or is Riak managing that for me via some sort of compression?
If that long hash is being replicated for each of those million objects, I
can see where my memory estimates would have been low.  I can always use an
integer ID for my bucket name; the hash just existed elsewhere in my
application, so I used it without thinking about it too much.
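
As a back-of-the-envelope check on that worry, here is a small sketch of the
worst case. It assumes the bucket name is the 64-character hex form of the
hash, that the name is held in memory uncompressed once per stored copy of
each object, and that each object is stored 3 times. The key length and the
short bucket-name length are made-up illustrative values. Whether Riak
actually behaves this way is exactly the open question.

```python
def key_overhead_bytes(bucket_len, key_len, num_objects, replicas=3):
    """Bytes spent on bucket+key names if every replica of every object
    keeps its own uncompressed copy (an assumption, not a confirmed
    description of Riak's behavior)."""
    return (bucket_len + key_len) * num_objects * replicas


SHA256_HEX_LEN = 64   # hex-encoded SHA-256 digest
INT_ID_LEN = 4        # hypothetical short integer bucket name
KEY_LEN = 16          # hypothetical key length
N = 1_000_000         # objects in the test load

long_name = key_overhead_bytes(SHA256_HEX_LEN, KEY_LEN, N)
short_name = key_overhead_bytes(INT_ID_LEN, KEY_LEN, N)

# Extra memory attributable to the long bucket name alone, in MiB.
extra_mib = (long_name - short_name) / 2**20
print(f"extra memory from long bucket name: {extra_mib:.1f} MiB")
```

Under these assumptions the long name costs on the order of 170 MiB extra,
and on a "dev" setup all of it comes out of the same VM's RAM.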

Thanks in advance for your help!  Loving Riak so far, in spite of these
trivial hurdles.

Regards,
Pinney