Multiple keys to fetch in a Java client

Russell Brown russelldb at basho.com
Tue Nov 29 09:56:37 EST 2011


Oh, you replied already.

Please can you use a gist or pastebin for larger blocks of code? They're much easier to read.

On 29 Nov 2011, at 14:49, suresh chandran wrote:

> Hi Russell,
> 
> After your mail, I altered the code as
> 
>  public static boolean fetchAll(String bucket, Collection<String> keys,
> StringBuilder response)
> {
> Iterator<String> valuesIter = keys.iterator();
> try
> {
> IRiakClient iriakClient = RiakFactory.httpClient("http://127.0.0.1:8091/riak");
> BucketKeyMapReduce reduce = iriakClient.mapReduce();
> while (valuesIter.hasNext())
> {
> String value = valuesIter.next();
> reduce.addInput(bucket, value);
> }
> reduce.addMapPhase(new NamedJSFunction("Riak.mapValuesJson"));
> MapReduceResult result =  reduce.execute();
> }
> ...
> }
> 
> But I get the below exception.
>  
> com.basho.riak.client.RiakException: java.io.IOException: {"lineno":477,"message":"JSON.parse","source":"unknown"}

Looks like Riak's mapValuesJson js function is throwing an error parsing your data.

> [INFO]    [11/29/11 9:42 AM] [akka:event-driven:dispatcher:global-9] [GetMaster] Error getting data in TEST  // ****** TEST is the bucket name that I use.*******
> at com.basho.riak.client.query.MapReduce.execute(MapReduce.java:81)
> at com.persistence.RiakConnector.fetchAll(RiakConnector.java:125)// ******* line number points to reduce.execute() method *********
> 
> Thanks
> Suresh C Nair
> 
> ________________________________
> From: suresh chandran <sureshcnair at yahoo.com>
> To: Russell Brown <russelldb at basho.com> 
> Cc: "riak-users at lists.basho.com" <riak-users at lists.basho.com> 
> Sent: Tuesday, November 29, 2011 9:38 AM
> Subject: Re: Multiple keys to fetch in a Java client
> 
> 
> Hi Russell,
> 
> I am using a static method to get the values, which is like
> public static boolean fetchAll(String bucket, Collection<String> keys,
> StringBuilder response)
> {
> PBClientConfig conf = new PBClientConfig.Builder()
>         .withHost("127.0.0.1")
>          .withPort(8091)
>        .build();
> try
> {
> Iterator<String> valuesIter = keys.iterator();
> IRiakClient iriakClient = RiakFactory.newClient(conf);
> BucketKeyMapReduce reduce = iriakClient.mapReduce();
> while (valuesIter.hasNext())
> {
> String value = valuesIter.next();
> reduce.addInput(bucket, value);
> }
> MapReduceResult result = reduce.execute();
> .... // code to send the result to caller 
> }....
> I am sending only 6 strings in the collection. Whether I pass the namespace or not, I get the exception below.
> 
> java.lang.OutOfMemoryError: Java heap space
> at com.basho.riak.pbc.RiakConnection.receive(RiakConnection.java:82)
> at com.basho.riak.pbc.MapReduceResponseSource.get_next_response(MapReduceResponseSource.java:86)
> at com.basho.riak.pbc.MapReduceResponseSource.<init>(MapReduceResponseSource.java:47)
> at com.basho.riak.pbc.RiakClient.mapReduce(RiakClient.java:588)
> at com.basho.riak.pbc.RiakClient.mapReduce(RiakClient.java:572)
> at com.basho.riak.client.raw.pbc.PBClientAdapter.mapReduce(PBClientAdapter.java:413)
> at com.basho.riak.client.query.MapReduce.execute(MapReduce.java:77)
> 
> I had been using the HTTP RiakClient and assumed that was the one to use. I shall use IRiakClient from here on. The IRiakClient I use is from the path basho-riak-java-client-5f8359c/target/riak-client-1.0.2-SNAPSHOT.jar. javap shows the versions as
> 
> public interface com.basho.riak.client.IRiakClient
>   SourceFile: "IRiakClient.java"
>   minor version: 0
>   major version: 50
> 
> public class com.basho.riak.client.RiakFactory extends java.lang.Object
> 
>   SourceFile: "RiakFactory.java"
>   minor version: 0
>   major version: 50
> 
> Let me know if you need anything in specific.
> 
> Thanks
> Suresh C Nair
> 
> ________________________________
> From: Russell Brown <russelldb at basho.com>
> To: suresh chandran <sureshcnair at yahoo.com> 
> Cc: "riak-users at lists.basho.com" <riak-users at lists.basho.com> 
> Sent: Tuesday, November 29, 2011 8:05 AM
> Subject: Re: Multiple keys to fetch in a Java client
> 
> 
> Hi Suresh,
> 
> 
> 
> On 29 Nov 2011, at 12:10, suresh chandran wrote:
> 
>> Thanks Russell for the reply.
>> 
>> 
>> After going through all the APIs I reached the same conclusion: 1) simultaneous multiple fetches and 2) MapReduce. However, simultaneous access would make the fetch much slower, and it doesn't look efficient (since a thread is spawned for each key, fetching, say, 10K records might be an overhead). Let me know if I am missing something here and should not be concerned. My worry is not the procedure of spawning so many worker threads, but the number of connections they open and close and the performance overhead, thereby
>> 
> 
> Right, 10k is probably too many keys to fetch in parallel. Though you wouldn't open and close 10k connections since there is a connection pool, and you can set an upper bound on it when you configure your client (more on that below.) But you could easily use a producer/consumer set up to fetch many objects in parallel. It depends on what you are fetching the data for, how often, and what you're going to do with it.
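A producer/consumer setup like the one described above might look roughly like the following. This is only a sketch using the standard library: `ProducerConsumerFetch` and `fetchOne` are hypothetical names, and `fetchOne` stands in for whatever single-key fetch you already have (for example a small wrapper around your Riak client). The worker count is the upper bound on concurrent fetches, so it should not exceed the size of the client's connection pool.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Sketch: one producer queues keys, a fixed number of workers fetch them. */
final class ProducerConsumerFetch {
    // Sentinel that tells a worker to stop. It is compared by identity,
    // so it cannot collide with a real key that has the same characters.
    private static final String POISON = new String("__stop__");

    /**
     * fetchOne is a hypothetical stand-in for a single-key fetch; it is
     * not part of any Riak client API. Keys are assumed to be non-null.
     */
    static Map<String, String> fetchAll(List<String> keys, int workers,
                                        Function<String, String> fetchOne) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Map<String, String> results = new ConcurrentHashMap<>();
        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    // Consume keys until the sentinel arrives.
                    for (String key = queue.take(); key != POISON; key = queue.take()) {
                        String value = fetchOne.apply(key);
                        if (value != null) {   // keys with no value are skipped
                            results.put(key, value);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }
        try {
            for (String key : keys) {
                queue.put(key);                // produce the work items
            }
            for (int i = 0; i < workers; i++) {
                queue.put(POISON);             // one sentinel per worker
            }
            for (Thread t : pool) {
                t.join();                      // wait for all workers to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        }
        return results;
    }
}
```

Whether this beats a serial loop depends on how many keys you fetch and how often, so treat the worker count as something to measure rather than guess.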
> 
> 
>> 
>> I tried the map reduce in a similar way to what you mentioned. For some odd reason, it throws a Java heap space error and dies.
>> MapReduceResult result = client.mapReduce()
>>     .addInput("goog","2010-01-04")
>>     .addInput("goog","2010-01-05")
>>     .addInput("goog","2010-01-06")
>>     .addInput("goog","2010-01-07")
>>     .addInput("goog","2010-01-08")
>>     .addMapPhase(new NamedJSFunction("Riak.mapValuesJson"), true)
>>     .execute();
>> 
>> 
>> I used the same format as above, as shown on the https://github.com/basho/riak-java-client page. I tried with and without the addMapPhase method, but in vain. Am I missing something here? Is there a prerequisite to use this code?
>> 
> 
> Curious. The code above is part of the integration test suite; it is run every time we build and doesn't usually throw an OOM exception. Did you call *addInput* 10k times in a loop before executing the map/reduce (can I see a snippet of code maybe, in a gist or pastebin, ideally)? Can I have a stack trace (a gist or pastebin is best for this also, please)? What version of the client did you use?
> 
> 
>> I also see that the above methods work only for IRiakClient. Until now I have been using RiakClient.
> 
> Which one? Which version etc? There are two classes called RiakClient (both legacy classes, one PB, one HTTP). You can use them, but they are likely going to be deprecated and removed in future versions. I recommend the IRiakClient interface.
> 
>> I am using IRiakClient only to access these methods. Does that have any significance? Is there a way to access/configure IRiakClient/MapReduce?
> 
> Like a lot of Java libraries you configure/acquire a client through a factory. There is an example in the README on the repo home page (https://github.com/basho/riak-java-client) under the heading *Configuration*, it is lower down that page. Maybe the README needs re-organising a bit.
> 
>> This is the only place I need the map reduce, and throughout my code, I just store and fetch key values.
> 
>> 
>> I don't seem to find the answers in the forums or documents. If there are any, can you please point me to them? Thanks again.
> 
> The README, the integration tests, and this mailing list are the current best sources of help. I would like to spend some time writing some more detailed documentation and tutorials (ha, actually, _no_, I wouldn't, but I really *should*!)
> 
> Hopefully that helps, and if you get me that stack trace I'll try and figure out what is going on with the map/reduce error.
> 
> Cheers
> 
> Russell
> 
> 
>> 
>> Suresh C Nair  
>> 
>> 
>> 
>> ________________________________
>> From: Russell Brown <russell.brown at me.com>
>> To: suresh chandran <sureshcnair at yahoo.com> 
>> Cc: "riak-users at lists.basho.com" <riak-users at lists.basho.com> 
>> Sent: Monday, November 28, 2011 7:14 PM
>> Subject: Re: Multiple keys to fetch in a Java client
>> 
>> 
>> 
>> 
>> On 28 Nov 2011, at 20:16, suresh chandran wrote:
>> 
>>> Maybe I can make my statement clearer :) I store my device values in buckets named after the OS (say windows/linux) with the device IDs as keys. I want to get the devices for a list of device IDs. From what I see, the fetch API signature takes a bucket name and a key (Client.fetch("windows", "D1")). Is there a way to call Client.fetch("windows", <Collection of keys>)?
>>> 
>> 
>> 
>> There is no multi-fetch from Riak right now. You can either fetch your individual keys (serially or in parallel) or run a Map/Reduce. In most cases the Map/Reduce will be slower (though the API is simpler, I suppose).
>> 
>> 
>> Use IRiakClient.mapReduce().addInput(bucket, key).addInput(bucket, key2)…(repeat!) I guess this is ripe for a convenience method that takes (as you suggest) addInputs(bucket, Collection<String> keys), so I'll add that to the features list.
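Until such a convenience method exists, the repeated addInput calls can be wrapped in a small helper. The sketch below is hypothetical and is not part of the Riak Java client: `MapReduceInputs.addInputs` takes the "add one input" operation as a `BiConsumer`, so it compiles with the standard library alone. With the client on the classpath, the assumption is that you would pass a method reference to the job's existing `addInput(bucket, key)` method.

```java
import java.util.Collection;
import java.util.function.BiConsumer;

/** Hypothetical helper; not part of the Riak Java client. */
final class MapReduceInputs {
    /**
     * Calls addInput once per key, all against the same bucket.
     * addInput is expected to be something like job::addInput, where job
     * is the map/reduce job being built (an assumption, see above).
     */
    static void addInputs(String bucket, Collection<String> keys,
                          BiConsumer<String, String> addInput) {
        for (String key : keys) {
            addInput.accept(bucket, key);
        }
    }
}
```

Assuming `job` is the object returned by `IRiakClient.mapReduce()`, usage would be along the lines of `MapReduceInputs.addInputs("windows", deviceIds, job::addInput);` followed by the usual `addMapPhase(...)` and `execute()` calls.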
>> 
>> 
>> Or just fetch all your keys using fetch. 
>> 
>> 
>> I guess it would be possible to add a bulk fetch API to the Riak Java Client that spawns a configurable number of threads to fetch a bunch of keys in parallel, but right now that is left as an exercise for the reader :)
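One way to approach that exercise is a fixed-size thread pool that submits one task per key. This is a rough sketch, not an API of the Riak Java client: `BulkFetch` and `fetchOne` are hypothetical names, and `fetchOne` stands in for a single-key fetch that returns null when the key is missing.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Sketch of a bulk fetch that uses a configurable number of threads. */
final class BulkFetch {
    /**
     * fetchOne is a hypothetical single-key fetch (for example a wrapper
     * around your client); threads caps how many fetches run at once.
     */
    static Map<String, String> fetchAll(Collection<String> keys, int threads,
                                        Function<String, String> fetchOne) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per key; the pool runs at most `threads` at a time.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String key : keys) {
                Callable<String> task = () -> fetchOne.apply(key);
                pending.put(key, pool.submit(task));
            }
            // Collect results in submission order, skipping missing keys.
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                String value = entry.getValue().get();
                if (value != null) {
                    results.put(entry.getKey(), value);
                }
            }
            return results;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("fetch failed", ex.getCause());
        } finally {
            pool.shutdownNow();   // always release the worker threads
        }
    }
}
```

This version fails the whole batch on the first fetch error. Collecting per-key failures instead would be a reasonable variation, depending on what the caller needs.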
>> 
>> 
>> Cheers
>> 
>> 
>> Russell
>> 
>> 
>> 
>> 
>>> Thanks
>>> Suresh C Nair
>>> 
>>> 
>>> ________________________________
>>> From: suresh chandran <sureshcnair at yahoo.com>
>>> To: "riak-users at lists.basho.com" <riak-users at lists.basho.com> 
>>> Sent: Monday, November 28, 2011 2:53 PM
>>> Subject: Multiple keys to fetch in a Java client
>>> 
>>> 
>>> Hi,
>>> 
>>> 
>>> I am storing values of devices under a particular OS. I want to fetch the list of all values, based on a list of keys. What I see is that I can fetch the values one by one using FETCH. I am using the HttpClient in Java. How do I go about this? Assume I am trying to get behavior like a query that has "CONTAINS(key1, key2...)". Is there a way to do this, or can we fetch the values only one by one? I do see that MapReduce makes use of a list of keys, but I am not sure if it is applicable in my case, as it is a simple fetch with a collection/list of keys.
>>> 
>>> 
>>> Thanks
>>> Suresh C Nair
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




