Multiple keys to fetch in a Java client

Russell Brown russelldb at basho.com
Tue Nov 29 08:05:24 EST 2011


Hi Suresh,


On 29 Nov 2011, at 12:10, suresh chandran wrote:

> Thanks Russell for the reply.
> 
> After going through all the APIs I reached the same conclusion: 1) simultaneous multiple fetches and 2) MapReduce. However, simultaneous access would make the fetch much slower, and it doesn't look efficient (since a thread is spawned for each key, fetching, say, 10K records might be an overhead). Let me know if I am missing something here and should not be concerned. My worry is not the procedure to spawn so many worker threads, but the number of connections they open and close and the performance overhead, thereby

Right, 10k is probably too many keys to fetch in parallel, though you wouldn't open and close 10k connections: there is a connection pool, and you can set an upper bound on it when you configure your client (more on that below). But you could easily use a producer/consumer setup to fetch many objects in parallel. It depends on what you are fetching the data for, how often, and what you're going to do with it.
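
A minimal sketch of that producer/consumer idea, using only java.util.concurrent. It is not part of the Riak Java client: the class name BoundedParallelFetch, the fetchAll method, and the fetchOne function are all hypothetical, and fetchOne stands in for whatever single-key fetch you already call. The fixed-size pool caps how many fetches are in flight, so 10k keys never means 10k threads.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of the Riak Java client.
final class BoundedParallelFetch {

    // Fetches every key with at most maxThreads fetches running at once.
    // fetchOne is an assumed stand-in for your existing single-key fetch.
    static <V> Map<String, V> fetchAll(Collection<String> keys,
                                       Function<String, V> fetchOne,
                                       int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            // Producer side: queue one task per key. Tasks beyond
            // maxThreads wait in the pool's internal queue.
            List<String> ordered = new ArrayList<>(keys);
            List<Future<V>> futures = new ArrayList<>(ordered.size());
            for (String key : ordered) {
                Callable<V> task = () -> fetchOne.apply(key);
                futures.add(pool.submit(task));
            }
            // Consumer side: collect results in the order the keys were given.
            Map<String, V> results = new LinkedHashMap<>();
            for (int i = 0; i < ordered.size(); i++) {
                results.put(ordered.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("fetch failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

You would call it as BoundedParallelFetch.fetchAll(deviceIds, key -> yourFetch("windows", key), 8), where yourFetch is your own wrapper around the client. Keeping maxThreads at or below the connection pool's upper bound is probably sensible, so worker threads are not left waiting for a connection.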

> 
> I tried the map reduce in a similar way to the one you mentioned. For some weird reason, it throws a Java heap memory exception and dies.
> MapReduceResult result =
>     client.mapReduce()
>         .addInput("goog", "2010-01-04")
>         .addInput("goog", "2010-01-05")
>         .addInput("goog", "2010-01-06")
>         .addInput("goog", "2010-01-07")
>         .addInput("goog", "2010-01-08")
>         .addMapPhase(new NamedJSFunction("Riak.mapValuesJson"), true)
>         .execute();
> 
>   
> I used the same format as shown above, taken from the https://github.com/basho/riak-java-client page. I tried with and without the addMapPhase method, but in vain. Am I missing something here? Is there a prerequisite to use this code?

Curious. The code above is part of the integration test suite; it is run every time we build and doesn't usually throw an OOM exception. Did you call *addInput* 10k times in a loop before executing the map/reduce (can I see a snippet of code, ideally in a gist or pastebin)? Can I have a stack trace (a gist or pastebin is best for this also, please)? What version of the client did you use?

> 
> I also see that the above methods work only for IRiakClient. Until now I have been using RiakClient.

Which one? Which version, etc.? There are two classes called RiakClient, both legacy (one PB, one HTTP). You can use them, but they are likely to be deprecated and removed in future versions. I recommend the IRiakClient interface.

> I am using IRiakClient only to access these methods. Does that have any significance? Is there a particular way to access/configure IRiakClient/MapReduce?

Like a lot of Java libraries, you configure/acquire a client through a factory. There is an example in the README on the repo home page (https://github.com/basho/riak-java-client) under the heading *Configuration*; it is lower down that page. Maybe the README needs re-organising a bit.

> This is the only place I need the map reduce, 
> and throughout the rest of my code, I just store and fetch key values.  
> 
> I don't seem to find the answers in the forums or documentation. If there are any, can you please point me to them? Thanks again.

The README, the integration tests, and this mailing list are the current best sources of help. I would like to spend some time writing some more detailed documentation and tutorials (ha, actually, _no_, I wouldn't, but I really *should*!)

Hopefully that helps, and if you get me that stack trace I'll try and figure out what is going on with the map/reduce error.

Cheers

Russell

> 
> Suresh C Nair  
> 
> From: Russell Brown <russell.brown at me.com>
> To: suresh chandran <sureshcnair at yahoo.com> 
> Cc: "riak-users at lists.basho.com" <riak-users at lists.basho.com> 
> Sent: Monday, November 28, 2011 7:14 PM
> Subject: Re: Multiple keys to fetch in a Java client
> 
> 
> On 28 Nov 2011, at 20:16, suresh chandran wrote:
> 
>> Maybe I can make my statement clearer :) I store my devices' values in buckets named after the OS (say windows/linux) with the device ids as keys. I want to get the devices for a list of device ids. From what I see, the fetch API signature takes in a bucket name and a key (Client.fetch("windows", "D1")). Is there a way I can call Client.fetch("windows", <Collection of keys>)?
> 
> There is no multi-fetch from Riak right now. You can either fetch your individual keys (serially or in parallel) or run a Map/Reduce. In most cases the Map/Reduce will be slower (though the API is simpler, I suppose).
> 
> Use IRiakClient.mapReduce().addInput(bucket, key).addInput(bucket, key2)…(repeat!). I guess this is ripe for a convenience method that takes (as you suggest) addInputs(bucket, Collection<String> keys), so I'll add that to the features list.
> 
> Or just fetch all your keys using fetch. 
> 
> I guess it would be possible to add a bulk fetch API to the Riak Java Client that spawns a configurable number of threads to fetch a bunch of keys in parallel, but right now that is left as an exercise for the reader :)
> 
> Cheers
> 
> Russell
> 
> 
>> 
>> Thanks
>> Suresh C Nair
>> From: suresh chandran <sureshcnair at yahoo.com>
>> To: "riak-users at lists.basho.com" <riak-users at lists.basho.com> 
>> Sent: Monday, November 28, 2011 2:53 PM
>> Subject: Multiple keys to fetch in a Java client
>> 
>> Hi,
>> 
>> I am storing the values of devices under a particular OS. I want to fetch the list of all values, based on a key list. What I see is that I can fetch the values one by one using FETCH. I am using the HttpClient in Java. How do I go about this? Assume I am trying to get the same behavior as a query that has "CONTAINS(key1, key2...)". Is there a way to do this, or can we fetch the values only one by one? I do see that Map-reduce makes use of a list of keys, but I am not sure if it is applicable in my case, as it is a simple fetch with a collection/list of keys.
>> 
>> Thanks
>> Suresh C Nair
>> 
>> _______________________________________________
>> riak-users mailing list
>> riak-users at lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
