Riak and Distributed Image Processing

Nate Lawson nate at root.org
Mon Nov 7 16:57:31 EST 2011

On Nov 7, 2011, at 1:23 PM, andrew cooke wrote:

> Apologies if this is a dumb idea, or I am asking in the wrong place.  I'm
> muddling around trying to understand various bits of technology while piecing
> together a possible project.  So feel free to tell me I'm wrong :o)
>
> I am considering how best to design a system that processes data from
> telescopes.  A typical "step" in the processing might involve combining a
> small number of calibration images with a (possibly large) set of observation
> images in some way and then adding the result.  To do this in a distributed
> manner you would have the observations on various machines, broadcast the
> calibrations, then do a map (the per-observation processing) followed by a
> reduce (the summing).
>
> So, in very vague terms, this fits roughly into map-reduce territory.  What I
> am doing now is seeing how the details work out with various "nosql" systems.
> So my basic question is: how would the above fit with Riak?  Alternatively,
> what else should I consider?
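
For concreteness, the broadcast / map / reduce shape described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the calibration step (subtract a dark frame, divide by a flat field), the names DARK, FLAT, calibrate, add_images and process, and the tiny 2x2 "images" are all hypothetical and do not come from the original question, which only says the images are combined "in some way".

```python
from functools import reduce

# Hypothetical calibration frames; in a distributed run these would be
# broadcast to every machine that holds observations.
DARK = [[1.0, 1.0], [1.0, 1.0]]
FLAT = [[1.0, 2.0], [2.0, 1.0]]


def calibrate(observation, dark=DARK, flat=FLAT):
    """Map step (assumed form): (observation - dark) / flat, pixel by pixel."""
    return [
        [(pixel - d) / f for pixel, d, f in zip(obs_row, dark_row, flat_row)]
        for obs_row, dark_row, flat_row in zip(observation, dark, flat)
    ]


def add_images(a, b):
    """Reduce step: pixel-wise sum of two images of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def process(observations):
    """Apply the map to every observation, then fold the results together."""
    return reduce(add_images, map(calibrate, observations))


# Two made-up 2x2 observations standing in for telescope images.
observations = [
    [[3.0, 5.0], [5.0, 3.0]],
    [[2.0, 3.0], [3.0, 2.0]],
]
stacked = process(observations)
```

Because pixel-wise addition is associative, each machine could sum its own calibrated observations first and ship only one partial image to the final reduce.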

Riak isn't good for computationally expensive, long-running processing. You generally want to avoid M-R queries that take more than a couple of seconds. I think your image-processing job is better handled by something like Hadoop.

I think of Riak as a standard K-V store that lets you customize query results with additional Erlang processing. M-R is almost a misnomer for this feature.
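
To make the "K-V store plus a bit of Erlang post-processing" picture concrete, here is a minimal sketch of the kind of small job description that Riak's HTTP MapReduce endpoint (/mapred) accepts. It only builds the JSON and does not contact a server. The bucket name "observations", the helper name build_sum_job, and the assumption that each stored value is a decimal integer string are hypothetical. The phase functions are named from memory of the riak_kv_mapreduce module, so check them against the documentation for the Riak version in use.

```python
import json


def build_sum_job(bucket, timeout_ms=60000):
    """Describe a short M-R query: fetch values, parse them, sum them.

    Assumes every object in the bucket holds a decimal integer string.
    """
    def erlang_phase(kind, function):
        # Each phase names a function in the riak_kv_mapreduce module.
        return {kind: {"language": "erlang",
                       "module": "riak_kv_mapreduce",
                       "function": function}}

    return {
        "inputs": bucket,  # a bare bucket name means every key in it
        "query": [
            erlang_phase("map", "map_object_value"),
            erlang_phase("reduce", "reduce_string_to_integer"),
            erlang_phase("reduce", "reduce_sum"),
        ],
        "timeout": timeout_ms,  # milliseconds
    }


job = build_sum_job("observations")
payload = json.dumps(job)  # body one would POST to /mapred
```

Note how little work each phase does: the query reshapes values already in the store, which is a very different workload from running a per-image calibration inside the database.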


More information about the riak-users mailing list