http based queries

Kevin Smith ksmith at basho.com
Wed Feb 17 17:45:34 EST 2010


I think it can be. Right now you have complete access to Riak from within your M/R jobs, so you could run a daily M/R job to crunch through the stats and store the result back inside Riak. The easiest place to do this is the final reduce phase of the job. If you don't care about receiving the job's actual output, you can also turn accumulation off for all steps, which will save some memory and speed things along.

Accessing Riak from inside M/R jobs only works for functions written in Erlang. This is a limitation we're aware of, and one I'm trying to fix in our Javascript support. I'd like to say that I'll have this done soon, but there's a lot to do to make it work correctly, so I can't offer an exact timeline except to say "when it's done".
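
A daily job along these lines might be sketched as follows. This is only an illustration: it assumes the per-phase "keep" flag of the JSON job format is what turns accumulation off, and the bucket name call_records and the Erlang module and function names (call_stats, map_record, reduce_and_store) are hypothetical placeholders for code you would write and deploy yourself. Both phases are Erlang here because only Erlang functions can currently write back to Riak.

```python
import json


def build_daily_job(bucket):
    """Build a /mapred job description whose phases all discard their output."""
    return {
        "inputs": bucket,
        "query": [
            # Hypothetical Erlang map function that extracts per-record stats.
            {"map": {"language": "erlang", "module": "call_stats",
                     "function": "map_record", "keep": False}},
            # Hypothetical Erlang reduce function that aggregates the stats
            # and stores the summary back into Riak as a side effect.
            {"reduce": {"language": "erlang", "module": "call_stats",
                        "function": "reduce_and_store", "keep": False}},
        ],
    }


job = build_daily_job("call_records")
body = json.dumps(job)  # this string would be the body POSTed to /mapred
```

With every phase set to discard its output, the HTTP response carries no aggregated data; the summary written by the final reduce phase is the job's only product.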

--Kevin
On Feb 17, 2010, at 5:19 PM, Richard Bucker wrote:

> Kevin,
> 
> this was a good email with lots of good information. I have a use case that I need to evaluate with Riak, and I'm hoping you can offer me some insight.
> 
> My Asterisk switches are generating 1M records a day. I need to aggregate (reduce) the data for reporting. Can this be done efficiently, so that I do not have to reread all 1M records for every day I need to report on?
> 
> /r
> 
> On Feb 17, 2010, at 10:34 AM, Richard Bucker wrote:
> 
> > Nice answer... but it raises a number of related questions:
> >
> >  - what is the actual syntax of the /mapred call
> 
> The JSON document is the syntax. It is fully documented in doc/js-mapreduce.org in the Riak code tree. I've also attached a plain ASCII version of the org-mode file, in case you don't have Emacs or org-mode configured. The name of the file is a little misleading, as the JSON syntax works for M/R jobs written in both Erlang and Javascript. The basics of Riak's core map/reduce machinery -- with an Erlang flavor -- are documented in doc/basic-mapreduce.txt, also in the code tree.
> 
> I've also recorded a short screencast illustrating how to use the new HTTP M/R interface and Javascript support. You can view the video here: http://vimeo.com/9188550. Some people have reported the video gets a little fuzzy towards the end, so apologies if that's the case.
> 
> >  - when do I get my results
> 
> You get your results when the job completes. However, the /mapred endpoint also knows how to stream job results. This is useful for reducing latency and memory consumption for jobs which return lots of data. You can enable job streaming by posting to the URL /mapred?chunked=true. The chunked query param tells the Riak HTTP interface to return results as multipart-encoded JSON.
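
A request for streamed results could be prepared roughly like this. It is a sketch under assumptions: the base URL http://localhost:8098 and the one-phase example job are placeholders, and the request is only built here, not sent. Reading the multipart response from a live node is left out.

```python
import json
import urllib.request


def build_mapred_request(base_url, job, chunked=False):
    """Prepare (but do not send) a POST to the /mapred endpoint."""
    url = base_url.rstrip("/") + "/mapred"
    if chunked:
        # Ask the HTTP interface to stream results as multipart-encoded JSON.
        url += "?chunked=true"
    return urllib.request.Request(
        url,
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder job: a single Javascript map phase that returns each object's key.
job = {
    "inputs": "foo",
    "query": [{"map": {"language": "javascript",
                       "source": "function(v) { return [v.key]; }"}}],
}
request = build_mapred_request("http://localhost:8098", job, chunked=True)
# Against a running node: urllib.request.urlopen(request)
```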
> 
> >  - what happens when there are long running queries
> 
> The HTTP interface currently enforces a hard limit of 2 minutes for M/R queries. We are going to extend the JSON query format to allow users to specify their own timeout values per M/R job in the next release.
> 
> >  - where do I specify indexes
> 
> M/R jobs only understand inputs of a single bucket name or a list of bucket/key pairs.
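
The two accepted input forms can be illustrated with a small helper. The helper name make_inputs and the sample keys are hypothetical; the only thing taken from the text above is that "inputs" is either one bucket name or a list of bucket/key pairs.

```python
def make_inputs(bucket, keys=None):
    """Return the "inputs" value for an M/R job.

    With no keys, the whole bucket is the input. With keys, the input is
    a list of [bucket, key] pairs naming individual objects.
    """
    if keys is None:
        return bucket
    return [[bucket, key] for key in keys]


whole_bucket = make_inputs("emails")
selected = make_inputs("emails", ["msg-1", "msg-2"])  # hypothetical keys
```

Any narrowing beyond these two forms, such as matching on a field inside the document, would have to be done by the map function itself.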
> 
> >  - looking at the gist you attached below; where is it documented? what APIs are documented?
> 
> See above.
> 
> --Kevin
> 
> 
> 
> >
> > /r
> >
> > Richard -
> >
> > You can do this using Riak's map/reduce feature.
> >
> > Assuming the documents were contained in a bucket named 'foo', you could craft a JSON document like the one in this gist: https://gist.github.com/434e7d1bb596bbc214e4. Posting the document to your Riak server's /mapred URL will run the query and return the results as JSON-encoded data.
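
For the "created on Jan 1, 2010" question, a job document could look something like the sketch below. It is not a copy of the gist. It assumes the stored values are valid JSON (double-quoted keys and strings), that the object passed to a Javascript map function exposes its stored data as value.values[0].data, and that JSON.parse is available to Javascript map functions.

```python
import json

# Javascript map function; the %s is replaced by a quoted date literal.
MAP_TEMPLATE = """function(value, keyData, arg) {
  var doc = JSON.parse(value.values[0].data);
  return doc.created === %s ? [doc] : [];
}"""


def build_created_on_job(bucket, created):
    """Build a job that returns only documents whose 'created' field matches."""
    source = MAP_TEMPLATE % json.dumps(created)
    return {
        "inputs": bucket,
        "query": [{"map": {"language": "javascript", "source": source}}],
    }


job = build_created_on_job("emails", "2010-01-01")
body = json.dumps(job)  # POST this to the server's /mapred URL
```

Because the whole bucket is the input, every object in it is read and passed through the map function; the filter only limits what comes back.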
> >
> > --Kevin
> > On Feb 17, 2010, at 9:31 AM, Richard Bucker wrote:
> >
> > > I must be missing something in the docs. Is there a way to GET documents from the DB using some search terms? For example, I have some document: /raw/emails/<some uuid> and the contents might be {'created':'2010-01-01','subject':'hello', 'message':'hello world'}
> > >
> > > Using HTTP, how would I find all the documents in the 'emails' bucket that were created on Jan 1, 2010?
> > >
> > > /r _______________________________________________
> > > riak-users mailing list
> > > riak-users at lists.basho.com
> > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
> 
> 
> 
> 
> <js-mapreduce.txt>



