Error while importing data

Nitish Sharma sharmanitishdutt at gmail.com
Wed Nov 23 05:50:10 EST 2011


So far I've figured out that this error has nothing to do with Python.
After a couple of million iterations, one of the nodes in the cluster
(not always the same one) crashes, and the Python threads then time out.
I am trying to make sense of the error and crash logs.

Cheers
Nitish

On Sat, Nov 19, 2011 at 10:16 PM, Erik Søe Sørensen <ess at trifork.com> wrote:

> A timeout... Do you know what the timeout threshold is? Have you tried
> increasing it (if possible; I don't know the Python client) or simply
> retrying once or twice on timeout?
> Also, what backend is Riak configured with? I believe eleveldb has
> occasional lower throughput/higher latency because of file compaction.
>
> ----- Reply message -----
> From: "Nitish Sharma" <sharmanitishdutt at gmail.com>
> Date: Sat, Nov 19, 2011 13:22
> Subject: Error while importing data
> To: "riak-users" <riak-users at lists.basho.com>
>
> Hi,
> To stress-test my Riak setup, I decided to import a large dataset
> (around 160 million records). Before importing the whole thing, I tested
> the Python import script (using protocol buffers) with 1 million records,
> which succeeded at ~2200 writes/sec. The script essentially puts the data
> into a queue, and a couple of threads get the data from the queue and
> store it in Riak.
> When I started it with the full dataset, after storing several million
> objects, I got thread exceptions with timeout errors.
> The traceback follows:
>
>  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
>    self.run()
>  File "/usr/lib/python2.7/threading.py", line 505, in run
>    self.__target(*self.__args, **self.__kwargs)
>  File "python_load_data.py", line 23, in worker
>    new_obj.store()
>  File "/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/riak_object.py", line 296, in store
>    Result = t.put(self, w, dw, return_body)
>  File "/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/transports/pbc.py", line 188, in put
>    msg_code, resp = self.recv_msg()
>  File "/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/transports/pbc.py", line 370, in recv_msg
>    raise Exception(msg.errmsg)
> Exception: timeout
>
> The cluster consists of 3 nodes (Ubuntu 10.04). The nodes have enough disk
> space; the number of file handles in use (~2500) is well within the limit
> (32768); and the number of concurrent ports is 32768. I can't figure out
> what else could be causing the exceptions.
>
> Any suggestions?
>
> Cheers
> Nitish
>
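
For reference, the queue-plus-worker-threads import described in the quoted
message, combined with the "retry once or twice on timeout" suggestion, could
be sketched roughly as below. This is only an illustration, not the actual
python_load_data.py: store_with_retry, worker, load and make_store are
hypothetical names, and make_store(record) is assumed to return a zero-argument
callable that performs the write (for example, the bound store method of a new
Riak object). The traceback shows the client raising a bare Exception with the
message "timeout", so the sketch matches on that text. Retrying can only smooth
over transient latency; it would not help if a node has actually crashed.

```python
import queue
import threading
import time


def store_with_retry(store, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call store(); retry with exponential backoff when it reports a timeout."""
    for attempt in range(retries + 1):
        try:
            return store()
        except Exception as exc:
            # Assumption: a timeout surfaces as a plain Exception whose
            # message contains "timeout", as in the traceback above.
            if "timeout" not in str(exc) or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


def worker(q, make_store, failed):
    """Take records off the queue until a None sentinel arrives."""
    while True:
        record = q.get()
        try:
            if record is None:
                return
            store_with_retry(make_store(record))
        except Exception:
            # Keep records that still fail so they can be re-imported later.
            failed.append(record)
        finally:
            q.task_done()


def load(records, make_store, num_threads=4, max_pending=1000):
    """Feed records to worker threads; return the records that failed."""
    q = queue.Queue(maxsize=max_pending)  # bounded, so the producer cannot run far ahead
    failed = []
    threads = [
        threading.Thread(target=worker, args=(q, make_store, failed))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for record in records:
        q.put(record)
    for _ in threads:
        q.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return failed
```

The bounded queue keeps memory use flat on a 160-million-record import, and
collecting failed records instead of letting the thread die means one timeout
does not silently shrink the worker pool.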