multiget in python client hangs after timeout

Jeff Peck jeffp at tnrglobal.com
Tue Jan 21 13:54:56 EST 2014


Hello,

We are using the multiget function from Riak's Python client, and we have noticed that our service sometimes hangs and needs to be restarted. After some searching, we traced the hang to multiget.

Consider the following simple example, where we use multiget to fetch a single object and pass a timeout of 1 millisecond.

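import riak, riak_pb

# Connect to a local node over protocol buffers.
riak_client = riak.RiakClient(host='localhost',
                              pb_port=8087, protocol='pbc')

for n in range(0, 10):
    # Fetch one object with a 1 ms timeout to provoke the timeout error.
    print riak_client.multiget([('test_bucket', '398eed5613da8d0918cd64b3cf1d44b2')],
                               timeout=1)  # ms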


A RiakError is raised, but the script hangs after that and can only be stopped with SIGKILL (signal 9):

Exception in thread riak.client.multiget-worker-0:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.6/site-packages/riak/client/multiget.py", line 130, in _fetcher
    **task.options)
  File "/usr/local/lib/python2.6/site-packages/riak/bucket.py", line 206, in get
    return obj.reload(r=r, pr=pr, timeout=timeout)
  File "/usr/local/lib/python2.6/site-packages/riak/riak_object.py", line 307, in reload
    self.client.get(self, r=r, pr=pr, timeout=timeout)
  File "/usr/local/lib/python2.6/site-packages/riak/client/transport.py", line 127, in wrapper
    return self._with_retries(pool, thunk)
  File "/usr/local/lib/python2.6/site-packages/riak/client/transport.py", line 69, in _with_retries
    return fn(transport)
  File "/usr/local/lib/python2.6/site-packages/riak/client/transport.py", line 125, in thunk
    return fn(self, transport, *args, **kwargs)
  File "/usr/local/lib/python2.6/site-packages/riak/client/operations.py", line 315, in get
    return transport.get(robj, r=r, pr=pr, timeout=timeout)
  File "/usr/local/lib/python2.6/site-packages/riak/transports/pbc/transport.py", line 146, in get
    MSG_CODE_GET_RESP)
  File "/usr/local/lib/python2.6/site-packages/riak/transports/pbc/connection.py", line 43, in _request
    return self._recv_msg(expect)
  File "/usr/local/lib/python2.6/site-packages/riak/transports/pbc/connection.py", line 55, in _recv_msg
    raise RiakError(err.errmsg)
RiakError: 'timeout'


We are not able to catch the exception because it is raised in a worker thread rather than in the calling thread. When this occurs, our service is unusable until it is restarted.

I would like to know whether anybody else has encountered this, and whether there is a known workaround or a patch for riak/client/multiget.py that propagates the timeout exception so that the main thread does not wait forever for a result that never arrives.
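
To make the request concrete, here is a minimal sketch of the behavior I am hoping for. It is not the code in riak/client/multiget.py and it does not use the Riak client at all; `fetch_all`, `fetch`, and `pool_size` are hypothetical names made up for illustration. The idea is only that each worker catches any exception and hands it back through the result queue, so the caller always receives one outcome per key instead of blocking forever.

```python
import threading

try:
    import queue             # Python 3
except ImportError:
    import Queue as queue    # Python 2


def fetch_all(keys, fetch, pool_size=4):
    """Call fetch(key) for every key on worker threads.

    Returns a list in key order where each entry is either the value
    returned by fetch or the exception it raised. Hypothetical helper,
    not part of the Riak client.
    """
    keys = list(keys)
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: no more work
                return
            index, key = item
            try:
                outcome = fetch(key)
            except Exception as err:
                # Hand the error to the caller instead of letting the
                # thread die without ever reporting a result.
                outcome = err
            results.put((index, outcome))

    threads = [threading.Thread(target=worker) for _ in range(pool_size)]
    for t in threads:
        t.daemon = True
        t.start()

    for item in enumerate(keys):
        tasks.put(item)

    # Exactly one outcome arrives per key, even when fetch raised.
    collected = [None] * len(keys)
    for _ in keys:
        index, outcome = results.get()
        collected[index] = outcome

    # Tell every worker to exit.
    for _ in threads:
        tasks.put(None)
    return collected
```

With something like this, the caller could check each entry with `isinstance(entry, Exception)` and decide whether to retry, re-raise, or skip that key.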

Thanks,
Jeff