Python client performance issue

Nico Meyer nico.meyer at adition.com
Tue Feb 15 06:09:07 EST 2011


Hi Andy.

I am not quite sure what you mean. Is the protobuf library included with
riak-python-client? Or are you talking about the version of the protobuf
compiler that was used to generate riakclient_pb2.py from
riakclient.proto?

Cheers,
Nico

On Tuesday, 15.02.2011, at 02:23 -0800, Andy Gross wrote:
> python-riak-client already uses version 2.3.0.  Adventurous types
> might want to check out https://github.com/Greplin/fast-python-pb,
> which wraps the C/C++ protocol buffers library. 
> 
> -- 
> Andy Gross
> Principal Architect
> Basho Technologies, Inc.
> 
> 
> On Tuesday, February 15, 2011 at 1:46 AM, Nico Meyer wrote:
> 
> > Hi Mike,
> > 
> > perhaps you can try to upgrade the protocol buffers library to at
> > least
> > version 2.3.0. This is from the changelog for that version:
> > 
> > Python
> > * 10-25 times faster than 2.2.0, still pure-Python.
> > 
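A quick way to check which protocol buffers runtime Python actually picks up (the distribution name "protobuf" is the usual PyPI name, which is an assumption about your setup):

```python
# Report the installed protocol buffers runtime, if any. The
# distribution name "protobuf" is the usual PyPI name; adjust it if
# your packaging differs.
from importlib import metadata

try:
    version = metadata.version("protobuf")
except metadata.PackageNotFoundError:
    version = None

print("protobuf runtime:", version or "not installed")
```

Anything older than 2.3.0 misses the pure-Python speedups mentioned in the changelog.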
> > 
> > Cheers,
> > Nico
> > 
> > > On Monday, 14.02.2011, at 19:35 -0500, Mike Stoddart wrote:
> > > Will do when I get time. Would the REST API be any faster?
> > > 
> > > Thanks
> > > Mike
> > > 
> > > On Mon, Feb 14, 2011 at 7:01 PM, Thomas Burdick
> > > <tburdick at wrightwoodtech.com> wrote:
> > > > I would highly recommend looking into the cProfile and pstats
> > > > modules and profiling the code that is going slow. If you're
> > > > using the protocol buffers client, it could be related to the
> > > > fact that the pure-Python protocol buffers implementation is well
> > > > known to be extraordinarily slow. Profile until proven guilty
> > > > though.
> > > > Tom Burdick
> > > > 
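As a concrete starting point for that advice, a harness along these lines prints the top offenders by cumulative time; `write_snapshot` here is a hypothetical stand-in for the real write path, not part of any client:

```python
import cProfile
import io
import pstats

# Hypothetical stand-in for the slow write path; in a real run, profile
# the actual entry.store() calls instead.
def write_snapshot():
    payload = {"obj%d" % i: {"field%d" % j: j for j in range(22)}
               for i in range(60)}
    return len(str(payload))

profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    write_snapshot()
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time
report = stream.getvalue()
print(report)
```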
> > > > On Mon, Feb 14, 2011 at 7:09 AM, Mike Stoddart
> > > > <stodge at gmail.com> wrote:
> > > > > 
> > > > > I added some code to my system to test writing data into Riak.
> > > > > I'm using the Python client library with protocol buffers. I'm
> > > > > writing a snapshot of my current data, which is one JSON object
> > > > > containing on average 60 individual JSON sub-objects. Each
> > > > > sub-object contains about 22 values.
> > > > > 
> > > > > # Archived entry. ts is a formatted timestamp.
> > > > > entry = self._bucket.new(ts, data=data)
> > > > > entry.store()
> > > > > 
> > > > > # Now write the current entry.
> > > > > entry = self._bucket.new("current", data=data)
> > > > > entry.store()
> > > > > 
> > > > > I'm writing the same data twice: the archived copy and the
> > > > > current copy, which I can easily retrieve later. Performance is
> > > > > lower than expected; top shows a constant CPU usage of 10-12%.
> > > > > 
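One cheap way to separate client-side overhead from everything else is to time the double write directly. This sketch fakes store() with plain JSON serialization, so it only bounds the encoding cost; swap in the real riak-python-client calls to measure the full path:

```python
import json
import time

# Hypothetical stand-in for entry.store(); replace with the real client
# call to measure the actual write path.
def store(data):
    return json.dumps(data)

# Roughly the shape described: ~60 sub-objects of ~22 values each.
data = {"sub%d" % i: {"v%d" % j: j for j in range(22)} for i in range(60)}

runs = 200
start = time.perf_counter()
for _ in range(runs):
    store(data)  # archived copy
    store(data)  # current copy
elapsed = time.perf_counter() - start
print("avg per double-write: %.3f ms" % (elapsed / runs * 1000.0))
```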
> > > > > I haven't decided to use Riak; this test is to help me decide.
> > > > > But for now, are there any optimisations I can do here? A
> > > > > similar test with MongoDB shows a steady CPU usage of 1%. The
> > > > > CPU usages are for my client, not Riak's own processes. The
> > > > > only difference in my test app is the code that writes the data
> > > > > to the database; otherwise all other code is 100% the same
> > > > > between the two test apps.
> > > > > 
> > > > > Any suggestions appreciated.
> > > > > Thanks
> > > > > Mike
> > > > > 
> > > > > _______________________________________________
> > > > > riak-users mailing list
> > > > > riak-users at lists.basho.com
> > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > > > 
> > > 
> > 
> > 
> > 
> 
> 