Tune Riak for fast inserts - populate DB

Russell Brown russell.brown at me.com
Wed Feb 13 04:52:05 EST 2013


On 13 Feb 2013, at 09:44, Bogdan Flueras <flueras.bogdan at gmail.com> wrote:

> Ok, so I've done something like this:
> Bucket bucket = client.createBucket("foo"); // lastWriteWins(true) doesn't work for Protobuf
> 
> when I insert I have:
> bucket.store(someKey, someValue).withoutFetch().pw(1).execute();
> 
> It looks like it's 20% faster than before. Is there something I could further tweak ?

The .pw(1) is redundant. All Riak inserts are pw(1) by default. Setting last write wins will get you a speed gain, since it means the Riak node will not attempt a read before it writes your data. Maybe use the HTTP client to set this property?
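
As a rough sketch of the HTTP route, the bucket property can be set with a plain PUT to the bucket's props resource. This assumes the node's HTTP interface listens on the default port 8098 and that the /buckets/<bucket>/props URL form is available on your version. The class and method names are made up for illustration, and the host and bucket are placeholders.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Illustrative helper (not part of any Riak client library).
class LastWriteWinsProps {

    // JSON body for the bucket-properties resource.
    static String propsBody(boolean lastWriteWins) {
        return "{\"props\":{\"last_write_wins\":" + lastWriteWins + "}}";
    }

    // Builds the props URL. The bucket name is not URL-encoded here,
    // so this sketch only suits simple bucket names.
    static String propsUrl(String host, int port, String bucket) {
        return "http://" + host + ":" + port + "/buckets/" + bucket + "/props";
    }

    // Sends the PUT and returns the HTTP status code
    // (a 2xx status is assumed to mean the property was accepted).
    static int setLastWriteWins(String host, int port, String bucket) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(propsUrl(host, port, bucket)).toURL().openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(propsBody(true).getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws Exception {
        // Contacts a node only when a host and a bucket are passed in.
        if (args.length == 2) {
            System.out.println(setLastWriteWins(args[0], 8098, args[1]));
        } else {
            System.out.println(propsBody(true));
        }
    }
}
```

The property only needs to be set once per bucket, before the bulk load starts; the Protobuf inserts can then carry on unchanged.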

How many threads are you using? How are you getting the data to be written to the writing processes?
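
To make those questions concrete, here is a minimal sketch of one way to feed several writer threads, with each thread owning its own handle so that load can be spread across the nodes. RecordWriter is a hypothetical stand-in for whatever wraps your bucket.store(key, value).withoutFetch().execute() call; it is not a class from the Java client. The round-robin split is just one possible way to hand out the data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative loader: one thread per writer, records dealt out round-robin.
class ParallelLoader {

    // Hypothetical abstraction over a per-thread client/bucket handle.
    interface RecordWriter {
        void store(String key, String value) throws Exception;
    }

    // Stores every {key, value} pair using all writers in parallel and
    // returns the number of records stored. Fails fast if a writer throws.
    static long load(List<String[]> records, List<RecordWriter> writers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(writers.size());
        AtomicLong stored = new AtomicLong();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int w = 0; w < writers.size(); w++) {
                final int offset = w;
                final RecordWriter writer = writers.get(w);
                pending.add(pool.submit(() -> {
                    // Writer w handles records w, w + n, w + 2n, ...
                    for (int i = offset; i < records.size(); i += writers.size()) {
                        String[] kv = records.get(i);
                        writer.store(kv[0], kv[1]);
                        stored.incrementAndGet();
                    }
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows any failure from a writer thread
            }
        } finally {
            pool.shutdown();
        }
        return stored.get();
    }
}
```

In a real load, each RecordWriter would wrap a client connected to a different node, and the records would be streamed from the source rather than held in a list.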

Cheers

Russell

> 
> ing. Bogdan Flueras
> 
> 
> 
> On Wed, Feb 13, 2013 at 10:19 AM, Bogdan Flueras <flueras.bogdan at gmail.com> wrote:
> Each thread has its own bucket instance (pointing to the same location) and I don't re-fetch the bucket per insert.
> Thank you very much!
> 
> ing. Bogdan Flueras
> 
> 
> 
> On Wed, Feb 13, 2013 at 10:14 AM, Russell Brown <russell.brown at me.com> wrote:
> 
> On 13 Feb 2013, at 08:07, Bogdan Flueras <flueras.bogdan at gmail.com> wrote:
> 
> > How to set the bucket to last write? Is it in the builder?
> 
> Something like:
> 
>     Bucket b = client.createBucket("my_bucket").lastWriteWins(true).execute();
> 
> Also, after you've created the bucket, do you use it from all threads? You don't re-fetch the bucket per-insert operation, do you?
> 
> But the "withoutFetch()" option is probably going to be the biggest performance increase, and it is safe if you are only doing inserts.
> 
> Cheers
> 
> Russell
> 
> > I'll have a look..
> > Yes, I use more threads and the bucket is configured to spread the load across all nodes.
> >
> > Thanks, I'll have a deeper look into the API and let you know about my results.
> >
> > ing. Bogdan Flueras
> >
> >
> >
> > On Wed, Feb 13, 2013 at 10:02 AM, Russell Brown <russell.brown at me.com> wrote:
> > Hi,
> >
> > On 13 Feb 2013, at 07:37, Bogdan Flueras <flueras.bogdan at gmail.com> wrote:
> >
> > > Hello all,
> > > I've got a 5 node cluster with Riak 1.2.1, all machines are multicore,
> > > with min 4GB RAM.
> > >
> > > I want to insert something like 50 million records in Riak with the Java client (Protobuf used) with default settings. I've also tried the HTTP protocol and set w = 1, but got some problems.
> > >
> > > However, the process is very slow: it doesn't write more than 6 GB/hour or approx. 280 KB/second.
> > > To have all my data filled in, it would take approx. 2 days!
> > >
> > > What can I do to have the data filled into Riak ASAP?
> > > How should I configure the cluster (vm.args / app.config)? I don't care so much about consistency at this point.
> >
> > If you are certain that you are only inserting new data, setting your bucket(s) to last write wins will speed things up. Also, are you using multiple threads for the Java client insert? Spreading the load across all five nodes? Are you using the "withoutFetch()" option on the Java client?
> >
> > Cheers
> >
> > Russell
> >
> > >
> > > Thank you,
> > > ing. Bogdan Flueras
> > >
> > > _______________________________________________
> > > riak-users mailing list
> > > riak-users at lists.basho.com
> > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
> 
> 
> 