Importing data to Riak

Nitish Sharma sharmanitishdutt at gmail.com
Tue Nov 15 12:08:12 EST 2011


Hi,
I tried importing the data using the Python library (with protocol buffers).
After storing several objects, I get a thread exception with a timeout error.
Here is the traceback:

  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "python_load_data.py", line 23, in worker
    new_obj.store()
  File
"/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/riak_object.py",
line 296, in store
    Result = t.put(self, w, dw, return_body)
  File
"/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/transports/pbc.py",
line 188, in put
    msg_code, resp = self.recv_msg()
  File
"/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/transports/pbc.py",
line 370, in recv_msg
    raise Exception(msg.errmsg)
Exception: timeout
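
(A minimal retry sketch, assuming the riak 1.3.0 Python client from the
traceback, where a server-side timeout surfaces as a bare
Exception("timeout"); the retry count and backoff schedule here are
illustrative, not part of the original loader script:)

    import time

    MAX_RETRIES = 5  # illustrative; tune for your load

    def store_with_retry(obj):
        # Retry obj.store() with exponential backoff on PB timeouts.
        for attempt in range(MAX_RETRIES):
            try:
                return obj.store()
            except Exception as e:
                # The PB transport raises a bare Exception('timeout');
                # re-raise anything else, and give up after MAX_RETRIES.
                if 'timeout' not in str(e) or attempt == MAX_RETRIES - 1:
                    raise
                time.sleep(0.1 * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...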

The cluster consists of 3 nodes (Ubuntu 10.04).
Any suggestions?

Cheers
Nitish

On Mon, Nov 14, 2011 at 2:20 PM, Andres Jaan Tack
<andres.jaan.tack at eesti.ee> wrote:

> I was able to achieve similar results. I wrote a Ruby process that would
> keep at most n requests (I think n = 10) in flight at once, and it reached
> roughly 2,500 req/s on my MacBook Pro.
>
> I loaded data into a cluster of six Riak nodes by running several of these
> processes at once and attaching each to a different Riak node, and I hit
> 18,000 req/s. Now that I think of it, though, I'm not sure whether
> spreading the load across nodes affected the speed.
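>
> (A rough Python sketch of the same "at most n in flight" idea; the
> original was a Ruby script, and the names below are illustrative, not
> the actual code:)
>
>     import threading
>
>     MAX_IN_FLIGHT = 10  # the n above
>     in_flight = threading.BoundedSemaphore(MAX_IN_FLIGHT)
>
>     def put_async(obj):
>         # Store obj on a background thread; acquire() blocks whenever
>         # MAX_IN_FLIGHT stores are already outstanding.
>         in_flight.acquire()
>         def run():
>             try:
>                 obj.store()
>             finally:
>                 in_flight.release()
>         threading.Thread(target=run).start()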
>
>
> 2011/11/14 Russell Brown <russelldb at basho.com>
>
>>
>> On 14 Nov 2011, at 11:47, Nitish Sharma wrote:
>>
>> > Hi,
>> > This is more of a discussion than a question. I am just trying to see
>> > how users typically import their data into Riak.
>> > For the data I am using, I was able to achieve almost 150
>> > records/second with the PHP library, and 400 records/second with
>> > node.js (I am fairly new to node and was hitting a memory wall when
>> > trying to import 1 million records).
>> > What are some hacks/tricks/tweaks to import a large amount of data
>> > into Riak?
>>
>> New keys, new data, straight in for the first time, no fetch before the
>> store? I've had reasonable results creating a *number* of threads and
>> using the Java Raw PB client to write.
>>
>> For example, have one or a couple of threads that read data (from
>> Oracle, a file, what-have-you) and put it on a queue, and have a bunch of
>> threads that pull data off the queue, create a Riak object, and store it.
>> From my laptop I've got up to 2,500 writes a second like this, and it was
>> just ad hoc, throwaway code with 4 threads against a small 3-node cluster
>> (running on desktops).
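>>
>> (A minimal sketch of that reader/worker pattern, in Python rather than
>> Java, assuming the 1.3.0 Python PB client from earlier in the thread;
>> the bucket name and read_records() are hypothetical placeholders:)
>>
>>     import threading
>>     from Queue import Queue  # Python 2.x stdlib
>>
>>     from riak import RiakClient, RiakPbcTransport
>>
>>     q = Queue(maxsize=1000)  # bounded, so readers can't outrun writers
>>
>>     def writer():
>>         # One PB connection per worker thread.
>>         client = RiakClient(port=8087, transport_class=RiakPbcTransport)
>>         bucket = client.bucket('import')  # placeholder bucket name
>>         while True:
>>             key, value = q.get()
>>             bucket.new(key, data=value).store()
>>             q.task_done()
>>
>>     for _ in range(4):  # 4 writer threads, as in the ad hoc test above
>>         t = threading.Thread(target=writer)
>>         t.daemon = True
>>         t.start()
>>
>>     # read_records() stands in for whatever yields (key, value) pairs.
>>     for key, value in read_records():
>>         q.put((key, value))
>>     q.join()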
>>
>> I imagine others on the list have more direct, real-world examples?
>>
>> Cheers
>>
>> Russell
>>
>> >
>> > Cheers
>> > Nitish