Importing data to Riak
sharmanitishdutt at gmail.com
Tue Nov 15 12:08:12 EST 2011
I tried importing the data using the Python library (with protocol buffers).
After storing several objects, I get a thread exception with timeout errors.
The following is the traceback:
File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
File "/usr/lib/python2.7/threading.py", line 505, in run
File "python_load_data.py", line 23, in worker
line 296, in store
Result = t.put(self, w, dw, return_body)
line 188, in put
msg_code, resp = self.recv_msg()
line 370, in recv_msg
The cluster consists of 3 nodes (Ubuntu 10.04).
On Mon, Nov 14, 2011 at 2:20 PM, Andres Jaan Tack <andres.jaan.tack at eesti.ee> wrote:
> I was able to achieve similar results. I wrote a Ruby process that would
> keep at most n (I think n = 10) things at once and reached 2,500ish req/s
> on my MacBook Pro.
> I loaded data to a cluster of six Riak nodes by running several of these
> processes at once and attaching each to a different Riak node, and I hit
> 18,000 req/s. I'm not sure whether loading different nodes affected the
> speed or not, now that I think of it.
> 2011/11/14 Russell Brown <russelldb at basho.com>
>> On 14 Nov 2011, at 11:47, Nitish Sharma wrote:
>> > Hi,
>> > This is more of a discussion than a question. I am just trying to
>> see the trend in how users import their data to Riak.
>> > For the data I am using, I was able to achieve almost 150
>> records/second with the PHP library, and 400 records/second with node.js
>> (I am fairly new to node, and was hitting a memory wall when trying to
>> import 1 million records).
>> > What are some hacks/tricks/tweaks to import a large amount of data to Riak?
>> New keys, new data, straight in for the first time, no fetch before
>> store? I've had reasonable results creating a *number* of threads and using
>> the Java Raw PB client to write.
>> For example, maybe have one or a couple of threads that read data (from
>> Oracle, a file, what-have-you) and put it on a queue, and have a bunch of
>> threads that pull data off the queue, create a Riak object and store it.
>> From my laptop I've got up to 2500 writes a second like this, and it was
>> just ad hoc, throwaway code with 4 threads against a small 3-node cluster
>> (running on desktops).
>> I imagine others on the list have more direct, real-world examples?
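
The reader/worker layout described above can be sketched in Python roughly
as follows. This is only a minimal sketch, not the code anyone in this thread
actually ran: `load`, `num_workers`, `max_queued` and the `store` callable are
hypothetical names, and `store(key, value)` is assumed to wrap whatever client
call performs the write. The queue is bounded so the reader cannot run far
ahead of the writers, and a failed write (for example a timeout) is recorded
instead of killing the worker thread.

```python
import queue
import threading

_DONE = object()  # sentinel that tells a worker thread to exit


def load(records, store, num_workers=4, max_queued=1000):
    """Feed (key, value) records to worker threads through a bounded queue.

    `store` is a hypothetical callable (key, value) -> None wrapping the
    real client write; it is an assumption, not part of any Riak library.
    Returns (number_stored, list_of_(key, exception)_pairs).
    """
    q = queue.Queue(maxsize=max_queued)
    lock = threading.Lock()
    stored = 0
    errors = []

    def worker():
        nonlocal stored
        while True:
            item = q.get()
            if item is _DONE:
                break
            key, value = item
            try:
                store(key, value)
                with lock:
                    stored += 1
            except Exception as exc:
                # Record the failure and keep going, so one timeout
                # does not take the whole worker down.
                with lock:
                    errors.append((key, exc))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # Reader side: put() blocks while the queue is full.
    for record in records:
        q.put(record)

    # One sentinel per worker, then wait for all of them to finish.
    for _ in threads:
        q.put(_DONE)
    for t in threads:
        t.join()

    return stored, errors
```

Calling `load(rows, store)` with an iterable of (key, value) pairs returns the
count of successful writes plus the keys that failed, so the failed keys can
be retried in a second pass.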
>> > Cheers
>> > Nitish
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users at lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com