newbie data modeling questions

Davis Ford davisford at
Mon Apr 30 16:11:28 EDT 2012

I've been evaluating Riak for about a week now, just playing around with
it, writing some sample apps, and reading all the docs.  I've also read
Mathias Meyer's excellent book (highly recommended).
I've been trying to devour all the information as fast as I can consume
it, but my brain is getting a little older, and sometimes I need to read
things more than once :)

A friend & I manufacture small, low-power networked devices that capture
sensor data and report it to a server via HTTP.  I'm looking at using Riak
to store this data.  The data can grow infinitely, and the number of
devices may grow to become quite large.  Each device posts all of its
sensor readings in one shot periodically on a configurable interval, or
because of an individual sensor interrupt.  It looks a bit like:

{ friendly_name: "Basement Device",
  mac: "00:11:22:AA:BB:CC",
  v1: 13523,
  v2: 35235,
  timestamp: 1335816375022 }

where each v[x] represents the value of an individual sensor.  The data
never needs to be modified once written.  I need to query it by timestamp
range and per device (all readings for device id=foo and between timestamp
bar and baz).  I'd also like to support m/r jobs to gather interesting
stats on the sensor data.  Devices are also attached to users in the system
(hence each user will want to view their own device sensor data).
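
For the m/r stats, a minimal sketch of a JavaScript map/reduce pair might look like the following. It assumes each reading is stored as a JSON document shaped like the example above, and that the map function is handed a Riak object whose `values[0].data` holds the raw JSON string. The names `mapV1` and `reduceStats`, and the choice of the `v1` field, are illustrative only. The reduce step merges partial results, so it can safely be re-applied to its own output.

```javascript
// Map phase: emit one partial-stats record per stored reading.
// Assumes the stored value is JSON with a numeric `v1` field.
function mapV1(riakObject) {
  var reading = JSON.parse(riakObject.values[0].data);
  return [{ count: 1, sum: reading.v1, min: reading.v1, max: reading.v1 }];
}

// Reduce phase: merge any number of partial-stats records into one.
// Returns an empty list when there is nothing to merge.
function reduceStats(partials) {
  if (partials.length === 0) return [];
  var acc = { count: 0, sum: 0, min: Infinity, max: -Infinity };
  partials.forEach(function (p) {
    acc.count += p.count;
    acc.sum += p.sum;
    acc.min = Math.min(acc.min, p.min);
    acc.max = Math.max(acc.max, p.max);
  });
  return [acc];
}
```

The mean can be derived on the client as `sum / count`, which keeps the reduce step mergeable.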

That's sort of how the data looks and how I want to interact with it.  This
is how I was thinking of applying it via Riak:

Leveldb as the storage option b/c it avoids the need to keep all keys in
memory (infinite growth) and supports 2i indexing.

Each device has its own bucket with the bucket name being the mac addr +
some guid (INCR via Redis).  E.g. 001122AABBCC-1 -- this allows
differentiation if a user changes the mac addr, posing a mac addr conflict.
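
A small sketch of the bucket naming scheme, assuming the counter value has already been obtained elsewhere (for example from a Redis INCR) and is passed in as `seq`. The helper name `bucketName` is hypothetical.

```javascript
// Build a per-device bucket name: the MAC address with separators removed,
// upper-cased, followed by "-" and a counter value supplied by the caller.
function bucketName(mac, seq) {
  var hex = mac.replace(/[^0-9a-fA-F]/g, "").toUpperCase();
  if (hex.length !== 12) {
    throw new Error("expected a 48-bit MAC address, got: " + mac);
  }
  return hex + "-" + seq;
}
```

With the sample device above, `bucketName("00:11:22:AA:BB:CC", 1)` gives `001122AABBCC-1`.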

Object keys will just be the timestamp -- avoids any conflict resolution
and provides time series data that is already sorted.
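
As a sketch of the key scheme, the functions below derive the object key from a reading's timestamp and build the path a store request would use against Riak's HTTP interface (`/buckets/<bucket>/keys/<key>`). The helper names `readingKey` and `storePath` are hypothetical. One assumption worth stating: two readings from the same device in the same millisecond would map to the same key.

```javascript
// The object key is simply the millisecond timestamp as a decimal string.
function readingKey(reading) {
  return String(reading.timestamp);
}

// Path for storing a reading under its device bucket via the HTTP interface.
function storePath(bucket, reading) {
  return "/buckets/" + encodeURIComponent(bucket) +
         "/keys/" + encodeURIComponent(readingKey(reading));
}
```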

2i indexes on mac and friendly name
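
A sketch of how the indexes and the per-device time-range query could be expressed over the HTTP interface. It assumes binary (`_bin`) 2i fields named `mac` and `friendly_name`, set through `x-riak-index-*` headers on the store request, and it assumes the time range can be served by a range query on the special `$key` index because the keys are timestamps with the same number of digits. The helper names are hypothetical.

```javascript
// Headers for a store request: content type plus one 2i entry per indexed field.
function indexHeaders(reading) {
  return {
    "Content-Type": "application/json",
    "x-riak-index-mac_bin": reading.mac,
    "x-riak-index-friendly_name_bin": reading.friendly_name
  };
}

// Path for a 2i range query over the keys (timestamps) of one device bucket.
function rangeQueryPath(bucket, fromMs, toMs) {
  return "/buckets/" + encodeURIComponent(bucket) +
         "/index/$key/" + fromMs + "/" + toMs;
}
```

Since each device already has its own bucket, the range query alone covers "all readings for one device between two timestamps"; the `mac` index would only matter for lookups that span buckets.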

User objects also stored in Riak with a link to a list of bucket names
where they claim ownership to the device.

Does this data model seem sane for what I'm trying to do?  Any hesitations?

Is there a way to cap a bucket to a fixed size, evicting the oldest entries
first (FIFO), or does the app have to do that?
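
If the cap does end up being enforced by the app, the selection step could be sketched as below. It assumes the app can obtain the bucket's keys (timestamps as strings) and that "capping" means deleting the oldest readings once the bucket exceeds `maxSize`. The helper name `keysToEvict` is hypothetical, and the actual delete calls are left out.

```javascript
// Given all keys in a device bucket (timestamps as strings) and a maximum
// size, return the oldest keys that would need to be deleted to meet the cap.
function keysToEvict(keys, maxSize) {
  var sorted = keys.slice().sort(function (a, b) {
    return Number(a) - Number(b); // oldest timestamp first
  });
  var excess = Math.max(0, sorted.length - maxSize);
  return sorted.slice(0, excess);
}
```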

Finally, I'm building the application in node.js and I've been playing with
the excellent riak-js client library.  I have some reservations, though.
The protobuf support is still experimental, and I saw frank06 indicate that
he probably won't have time to continue dev. b/c of other commitments.  Is
anyone else using riak-js in production?  Is anyone on the list thinking of
taking over dev. of a riak js client, or are there any other riak js
clients out there?  (riak-js was the only one I noticed.)

Thanks in advance for any feedback.

More information about the riak-users mailing list