Python Driver Write Times
zapphutz at gmail.com
Sun Nov 28 15:41:18 EST 2010
From just the switch to the PBC transport, I see this kind of improvement:
Before (HTTP transport):
MakeRiakPerson took 40.139 ms
MakeRiakPerson took 40.472 ms
MakeRiakPerson took 40.651 ms
MakeRiakPerson took 51.630 ms
MakeRiakPerson took 36.733 ms
After (PBC transport):
MakeRiakPerson took 1.989 ms
MakeRiakPerson took 2.650 ms
MakeRiakPerson took 1.523 ms
MakeRiakPerson took 3.707 ms
MakeRiakPerson took 7.213 ms
MakeRiakPerson took 3.141 ms
That, in and of itself, almost pushes it down to the speeds I got with
MongoDB. My next step will be to replicate an environment where I
have at least 3 nodes up and running.
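
To put a rough number on the change, the timings above can be averaged per transport. This is only a sketch: it assumes the first five samples are the HTTP runs (they match the figures in the original post quoted below) and the last six are the PBC runs, and the variable names are made up for illustration.

```python
# Timings copied from the message above, in milliseconds.
# Assumption: the first five are HTTP runs, the last six are PBC runs.
http_ms = [40.139, 40.472, 40.651, 51.630, 36.733]
pbc_ms = [1.989, 2.650, 1.523, 3.707, 7.213, 3.141]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / float(len(values))


http_mean = mean(http_ms)
pbc_mean = mean(pbc_ms)
speedup = http_mean / pbc_mean

print("HTTP mean: %.3f ms" % http_mean)
print("PBC mean:  %.3f ms" % pbc_mean)
print("Speedup:   %.1fx" % speedup)
```

On these samples that works out to roughly 42 ms versus 3.4 ms per write, or about a 12x difference. With so few samples it is an indication, not a benchmark.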
Thanks for the assistance, everyone.
Again, thanks to everyone who lent a hand. You've made a Riak-Believer out
On Sun, Nov 28, 2010 at 3:12 PM, Derek Sanderson <zapphutz at gmail.com> wrote:
> About VM resources: I had suspected there would be a hit in this sense, but
> I wasn't aware of any actual numbers. Thanks for that.
> About 3 writes / the need for 2 more nodes: I had no idea about that (x# of
> writes per object). Riak is very unfamiliar territory for me. I'll read the
> guide that has been suggested and look at running my tests under a more
> "optimal" set up.
> Thanks to everyone who responded to this issue. If I get the numbers
> increased, I'll be sure to post a followup (in case anyone cares).
> On Sun, Nov 28, 2010 at 3:07 PM, Derek Sanderson <zapphutz at gmail.com> wrote:
>> I'm using the defaults for the python library, so that would be the HTTP
>> Rest interface. There is support for the PBC interface, which I'm looking
>> into using now.
>> I had suspected that since I wasn't really using Riak in such a way as to
>> let it shine (i.e., in a cluster of nodes), that might be part of my problem.
>> Thanks so much for the detailed response.
>> On Sun, Nov 28, 2010 at 12:10 PM, Greg Steffensen <
>> greg.steffensen at gmail.com> wrote:
>>> This is due to two factors:
>>> 1) Durability. MongoDB stores writes in RAM and flushes them to disk
>>> periodically (by default, every 60 seconds, according to this page:
>>> http://www.mongodb.org/display/DOCS/Durability+and+Repair). This means
>>> that its writes can seem very, very fast, but if the machine goes down, you
>>> could lose up to 60 seconds of data. Riak writes don't return until the
>>> data has actually been persisted to disk. Cassandra takes the same approach
>>> as MongoDB, with the same trade-off.
>>> 2) Parallelism. This test isn't taking advantage of Riak's distributed
>>> nature. Riak really shines when it's run on a cluster of machines: you can
>>> make your write throughput almost arbitrarily fast, as long as you're
>>> willing to add enough machines to the cluster.
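
The durability trade-off in point 1 can be felt without either database. The sketch below is not how Riak or MongoDB are implemented; it only contrasts writes left in a buffer with writes forced to disk via `os.fsync` after each record. The function name `avg_write_ms` and the 512-byte payload are invented for the illustration.

```python
import os
import tempfile
import time


def avg_write_ms(n_writes, durable, payload=b"x" * 512):
    """Average milliseconds per write to a temp file.

    If durable is True, every write is flushed and fsync'ed before the
    next one starts; otherwise writes are left to the normal buffering.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.time()
            for _ in range(n_writes):
                f.write(payload)
                if durable:
                    f.flush()             # push Python's buffer to the OS
                    os.fsync(f.fileno())  # ask the OS to push it to disk
            elapsed = time.time() - start
    finally:
        os.remove(path)
    return elapsed * 1000.0 / n_writes


if __name__ == "__main__":
    print("buffered: %.3f ms/write" % avg_write_ms(200, durable=False))
    print("fsync'ed: %.3f ms/write" % avg_write_ms(200, durable=True))
```

On most disks the fsync'ed figure should come out noticeably higher, though the size of the gap depends heavily on the storage and on whether it runs inside a VM.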
>>> I doubt that you'll be able to get single-node Riak to write as fast as
>>> Mongo, but I'd guess that the numbers will get a little closer if you do
>>> several writes simultaneously in both by multi-threading using Python's
>>> threading module. Also, be sure that you're using Riak's protocol buffers
>>> interface instead of the REST (HTTP) one, which adds a lot of overhead. I
>>> believe the Python client supports both.
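
The multi-threading suggestion above might look something like the following. This is a minimal sketch using only the standard `threading` module; `fake_store` is a stand-in that sleeps to imitate one blocking write, and both it and `run_writes` are hypothetical names, not part of any driver. In a real test the `store` callable would wrap the actual store call, and each worker would probably want its own client connection (an assumption to check against the driver's documentation).

```python
import threading
import time


def fake_store(key, latency_s=0.005):
    """Stand-in for one blocking datastore write (hypothetical)."""
    time.sleep(latency_s)


def run_writes(keys, num_threads, store=fake_store):
    """Split keys across num_threads workers; return elapsed seconds."""
    # Round-robin split so each worker gets a similar share of the keys.
    chunks = [keys[i::num_threads] for i in range(num_threads)]

    def worker(chunk):
        for key in chunk:
            store(key)

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == "__main__":
    keys = ["p%d" % i for i in range(200)]
    for n in (1, 4, 8):
        print("%d thread(s): %.3f s" % (n, run_writes(keys, n)))
```

Because the simulated write spends its time waiting rather than computing, the threads overlap well despite the GIL; a network-bound write should behave similarly, but that is worth measuring rather than assuming.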
>>> On Sun, Nov 28, 2010 at 11:48 AM, Derek Sanderson <zapphutz at gmail.com> wrote:
>>>> I've recently started to explore using Riak (v0.13.0-2) from Python
>>>> (v2.6.5) as a datastore, and I've run into a performance issue that I'm
>>>> unsure of the true origin of, and would like some input from users who have
>>>> been working with Riak and its Python drivers.
>>>> I have 2 tests set up, one for Riak and another for MongoDB, both using
>>>> their respectively provided Python drivers. I'm constructing chunks of JSON
>>>> data consisting of a Person, who has an Address, and a purchase history
>>>> which contains 1 to 20 line items with some data about the item name, cost,
>>>> # purchased, etc. A very simple mockup of a purchase history. It does this
>>>> for 1 million "people" (my initial goal was to see how lookups fared when
>>>> you reach 1m+ records)
>>>> When using MongoDB, the speed of inserts is incredibly fast. When using
>>>> Riak, however, there is a very noticeable lag after each insert. So much so
>>>> that when running side by side, the MongoDB test breaks into the 10,000s
>>>> before Riak hits its first 1k.
>>>> My main PC is a Windows7 i7 quad core, with 8 gigs of ram, on which I'm
>>>> running Ubuntu64 v10.04 on a VM, which has 2GB of memory allotted. On this
>>>> VM, I have Riak and MongoDB running concurrently.
>>>> Here is a sample of how I'm using the Riak driver:
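>>>> from riak import RiakClient
>>>>
>>>> riak_conn = RiakClient()
>>>> bucket = riak_conn.bucket("peopledb")
>>>> for i in range(1,1000000):
>>>>     try:
>>>>         new_obj = bucket.new("p" + str(i), MakePerson())
>>>>         new_obj.store()  # one blocking write per object
>>>>     except Exception as e:
>>>>         print e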
>>>> I'm wondering if there is something blatantly wrong I'm doing. I didn't
>>>> see any kind of batch-store method on the bucket (instead of calling store
>>>> on each object, simply persist the entirety of the bucket itself), and I
>>>> wasn't sure if this was an issue with my particular setup (maybe the
>>>> specifics of my VM are somehow throttling its performance), or maybe just a
>>>> known limitation that I wasn't aware of.
>>>> To shed some light on the disparity, I refactored my persistence into
>>>> separate methods, and used a wrapper to pull out the execution times. Here
>>>> is a very condensed list of run times. The method in question, for both
>>>> datastores, simply creates a new "Person" and stores it. Nothing else.
>>>> MakeRiakPerson took 40.139 ms
>>>> MakeRiakPerson took 40.472 ms
>>>> MakeRiakPerson took 40.651 ms
>>>> MakeRiakPerson took 51.630 ms
>>>> MakeRiakPerson took 36.733 ms
>>>> MakeMongoPerson took 1.810 ms
>>>> MakeMongoPerson took 3.619 ms
>>>> MakeMongoPerson took 1.036 ms
>>>> MakeMongoPerson took 1.275 ms
>>>> MakeMongoPerson took 3.656 ms
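
The timing wrapper itself was not posted, so the following is only a guess at its shape: a decorator that prints lines in the same "X took N ms" format as the output above. `timed` and `MakeExamplePerson` are hypothetical names used for illustration.

```python
import functools
import time


def timed(func):
    """Print how long each call to func takes, in milliseconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        elapsed_ms = (time.time() - start) * 1000.0
        print("%s took %.3f ms" % (func.__name__, elapsed_ms))
        return result
    return wrapper


@timed
def MakeExamplePerson():
    # Placeholder body; the real methods would build a Person and store it.
    return {"name": "example"}


if __name__ == "__main__":
    MakeExamplePerson()
```

Note that the measurement includes the time to build the object as well as to store it, so a fair comparison needs both datastore methods to do the same construction work.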
>>>> Thank you in advance for any help that can be offered here. I'm
>>>> incredibly new to Riak as a whole, as well as very inexperienced when it
>>>> comes to working in a *nix environment, so I imagine there are countless
>>>> ways I could have shot myself in the foot without realizing it.
>>>> riak-users mailing list
>>>> riak-users at lists.basho.com