High volume data series storage and queries

Paul O pcotec at gmail.com
Tue Aug 9 21:32:01 EDT 2011


Alexander, the whole batching strategy I described in my initial post is
trying to help the problem map better to a KV store such as Riak. The plan
is for each batch of MaxN events to be stored under a single key, hence
avoiding the problem of storing too many tiny values. I'm still surprised by
a 450-byte-per-value overhead.
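
For illustration, a minimal sketch of that batching plan in Python (the
bucket name, key scheme, and MaxN value are made up, and the store call
assumes the official Python client's basic API):

import riak

MAX_N = 600  # hypothetical: one minute of 10 Hz readings per batch

client = riak.RiakClient()
bucket = client.bucket('sensor_batches')  # bucket name assumed

buffer = []

def record(source_id, ts, value):
    # Accumulate readings; flush one opaque value per MAX_N events.
    buffer.append((ts, value))
    if len(buffer) >= MAX_N:
        # One key per batch, e.g. "<source>:<first timestamp>".
        key = '%s:%d' % (source_id, buffer[0][0])
        bucket.new(key, data=list(buffer)).store()
        del buffer[:]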

I do appreciate your reasoning around this, though.

In the end, Ciprian's earlier suggestion (Riak core + custom storage) would
seem to win the day. I'm not sure the tradeoff would pay off immediately,
though, so I might end up with a simpler strategy sized for the initial
volume expectations, plus a migration plan to a more advanced solution
sometime down the road.

Regards,

Paul

On Tue, Aug 9, 2011 at 10:43 AM, Alexander Sicular <siculars at gmail.com> wrote:

> A couple of thoughts:
>
> -disk io
> -total keys versus memory
> -data on disk overhead
>
> As Jeremiah noted, disk io is crucial. Thankfully, Riak's distributed mesh
> gives you access to a number of spindles limited only by your budget. I
> think that is a critical bonus of a distributed system like Riak that is
> often not fully appreciated. Here Riak is a win for you.
>
> Bitcask needs all keys to fit in memory. We are talking something like:
>
> (key length + overhead) * number of keys * replicas < cluster max available
> ram.
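>
> As a rough feel for that inequality, a throwaway calculation (the ~40-byte
> per-key keydir overhead is a commonly cited Bitcask ballpark, and the other
> numbers are invented for illustration):
>
> KEY_LEN = 30              # bytes per key (assumed)
> OVERHEAD = 40             # approximate Bitcask keydir overhead per key
> N_VAL = 3                 # Riak's default replica count
> keys = 1000 * 10 * 86400  # 1000 sensors at 10 Hz for one day, unbatched
>
> ram_bytes = (KEY_LEN + OVERHEAD) * keys * N_VAL
> print(ram_bytes / 2.0**30)  # ~169 GiB of cluster RAM for a single day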
>
> There is a tool on the wiki which should help figure this out. What that
> basically means for you is that you will have to batch your data by some
> sensor/time granularity metric. Let's say every minute. At 10 Hz that is a
> 600x reduction in total keys. Of course, this doesn't come for free: your
> application middleware will have to accommodate the batching, and you could
> lose up to whatever your time granularity batch is. I.e., you could lose a
> minute of sensor data should your application fail. Here Riak is neutral to
> negative.
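>
> A quick sanity check on that reduction (sketch only; the sensor rate is
> taken from the 10 Hz example above):
>
> events_per_sec = 10
> batch_window = 60                     # seconds per batch key
> print(events_per_sec * batch_window)  # 600 events collapse into one key
> print(86400 * events_per_sec)         # 864,000 keys/day per sensor, raw
> print(86400 // batch_window)          # 1,440 keys/day per sensor, batched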
>
> Riak's data structures are not friendly toward small values. Sensors
> generally spit out integers or other small data tuples. If you search the
> list archives you will find a magnificent data overhead writeup. IIRC, it
> was something on the order of 450 bytes. What that basically tells you is
> that you can't use bitcask for small values if disk space is a concern, as
> I imagine it to be in this case. Also, sensor data is generally write-only,
> i.e., never deleted or modified, so compaction should not be a concern when
> using bitcask. Here Riak is a strong negative.
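>
> To see how batching changes the disk-overhead picture (450 bytes is the
> figure from that writeup; the payload sizes are assumed):
>
> OVERHEAD = 450           # per stored value, per the archived writeup
> reading = 8              # bytes for one timestamped integer (assumed)
> batch = 600 * reading    # one minute of 10 Hz readings in one value
>
> print(OVERHEAD / float(OVERHEAD + reading))  # ~0.98: 98% overhead, raw
> print(OVERHEAD / float(OVERHEAD + batch))    # ~0.09: under 9%, batched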
>
> Data retrieval issues aside (which between Riak Search/secondary
> indexes/third party indexes should not be a major concern), I am of the
> opinion that Riak is not a good fit for high frequency sensor data
> applications.
>
> Cheers,
> Alexander
>
> Sent from my rotary phone.
> On Aug 8, 2011 9:40 PM, "Paul O" <pcotec at gmail.com> wrote:
> > Quite a few interesting points, thanks!
> >
> > On Mon, Aug 8, 2011 at 5:53 PM, Jeremiah Peschka
> > <jeremiah.peschka at gmail.com> wrote:
> >
> >> Responses inline
> >>
> >> On Aug 8, 2011, at 1:25 PM, Paul O wrote:
> >>
> >> Will any existing data be imported? If this is totally greenfield, then
> >> you're free to do whatever zany things you want!
> >
> >
> > Almost totally greenfield, yes. Some data will need to be imported but
> > it's already in the format described.
> >
> >> Ah, so you need IOPS throughput, not storage capacity. On the hardware
> >> side, make sure your storage subsystem can keep up - don't cheap out on
> >> disks just because you have a lot of nodes. A single rotational HDD can
> >> only handle about 180 IOPS on average. There's a lot you can do on the
> >> storage backend to make sure you're able to keep up there.
> >>
> >
> > Indeed, storage capacity is also an issue but IOPS would be important,
> > too. I assume that sending batches to Riak (opaque blobs) would help a
> > lot with the quantity of writes, but it's still a very important point.
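> >
> > As a back-of-the-envelope on the write load (sensor count and replica
> > count are assumptions; the 180 IOPS figure is from above):
> >
> > sensors = 1000
> > hz = 10
> > n_val = 3                            # each write lands on n_val replicas
> > raw_writes = sensors * hz * n_val    # 30,000 writes/sec, unbatched
> > batched = sensors * n_val / 60.0     # 50 writes/sec, one key per minute
> > print(raw_writes / 180.0)            # ~167 disks' worth of IOPS
> > print(batched / 180.0)               # well under one disk's worth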
> >
> >> You may want to look into ways to force Riak to clean up the bitcask
> >> files. I don't entirely remember how it's going to handle cleaning up
> >> deleted records, but you might run into some tricky situations where
> >> compactions aren't occurring.
> >>
> >
> > Hm, any references regarding that? It would be a major snag in the whole
> > schema if Riak doesn't properly reclaim space for deleted records.
> >
> >> Riak is pretty constant time for Bitcask. The tricky part with the
> >> amount of data you're describing is that Bitcask requires (I think) that
> >> all keys fit into memory. As your data volume increases, you'll need to
> >> do a combination of scaling up and scaling out. Scale up RAM in the
> >> nodes and then add additional nodes to handle load. RAM will help with
> >> data volume, more nodes will help with write throughput.
> >>
> >
> > Indeed, for high-frequency sources that would create lots of bundles,
> > even the MaxN-to-1 reduction in keys might still generate loads of keys.
> > Any idea how much RAM Riak requires per record, or a reference that would
> > point me to it?
> >
> >> Since you're searching on time series, mostly, you could build time
> >> indexes in your RDBMS. The nice thing is that querying temporal data is
> >> well documented in the relational world, especially in the data
> >> warehousing world. In your case, I'd create a dates table and have a
> >> foreign key relating to my RDBMS index table to make it easy to search
> >> for dates. Querying your time table will be fast, which reduces the need
> >> for scans in your index table.
> >>
> >> EXAMPLE:
> >>
> >> CREATE TABLE timeseries (
> >>     time_key INT,
> >>     date TIMESTAMP,
> >>     datestring VARCHAR(30),
> >>     year SMALLINT,
> >>     month TINYINT,
> >>     day TINYINT,
> >>     day_of_week TINYINT
> >>     -- etc
> >> );
> >>
> >> CREATE TABLE riak_index (
> >>     id INT NOT NULL,
> >>     time_key INT NOT NULL REFERENCES timeseries(time_key),
> >>     riak_key VARCHAR(100) NOT NULL
> >> );
> >>
> >>
> >> SELECT ri.riak_key
> >> FROM timeseries ts
> >> JOIN riak_index ri ON ts.time_key = ri.time_key
> >> WHERE ts.date BETWEEN '20090702' AND '20100702';
> >>
> >
> > My plan was to have the riak_index contain something like: (id,
> > start_time, end_time, source_id, record_count).
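> >
> > Roughly, the lookup path that implies (a sketch only: the columns follow
> > my description above, the key scheme assumes the Riak key is derivable
> > from source_id and start_time, which I haven't pinned down, and the fetch
> > uses the Python client's basic get):
> >
> > import sqlite3
> > import riak
> >
> > db = sqlite3.connect('index.db')
> > bucket = riak.RiakClient().bucket('sensor_batches')
> >
> > def fetch_range(source_id, t0, t1):
> >     # RDBMS narrows to the overlapping batches; Riak serves the blobs.
> >     rows = db.execute(
> >         "SELECT source_id, start_time FROM riak_index "
> >         "WHERE source_id = ? AND start_time <= ? AND end_time >= ?",
> >         (source_id, t1, t0))
> >     return [bucket.get('%s:%d' % row).data for row in rows]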
> >
> > Without going too much into RDBMS fun, this pattern can get your RDBMS
> >> running pretty quickly and then you can combine that with Riak's
> performance
> >> and have a really good idea of how quick any query will be.
> >
> >
> > That's roughly the plan, thanks again for your contributions to the
> > discussion!
> >
> > Paul
>