Data modelling questions

Christopher Meiklejohn cmeiklejohn at basho.com
Sat Feb 21 11:44:14 EST 2015


> On Feb 20, 2015, at 5:35 PM, AM <ams.fwd at gmail.com> wrote:
> 
> Hi All.
> 
> I am currently looking at using Riak as a data store for time series data. Currently we get about 1.5T of data in JSON format that I intend to persist in Riak. I am having some difficulty figuring out how to model it such that I can fulfill the use cases I have been handed.
> 
> The data is provided in several types of log formats with some common fields:
> 
> - timestamp
> - geo
> - s/w build #
> - location #
> 
> - .... whole bunch of other key value pairs.
> 
> For the most part I will need to provide aggregated views based on geo. There are some views based on s/w build # and location #. The aggregation will be on an hourly basis.
> 
> The model that I came up with:
> 
> <log-format-type>[<hour>][<timestamp>-<msg-id>]: <json-body>

Hi AM, 

It would also be great if you could describe how you plan to query both the original and the aggregated values. Querying is usually the hardest part to get right in Riak, and your query pattern will largely determine the best way to lay this data out on disk.
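
To make the question concrete, here is a minimal Python sketch of one possible reading of the layout quoted above. It assumes that <log-format-type> and <hour> together name a bucket and that <timestamp>-<msg-id> is the key inside it. That split, the helper names (bucket_name, object_key, count_by_geo) and the sample values are all hypothetical, and no Riak client calls are shown.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_name(log_type: str, ts: datetime) -> str:
    """Hypothetical bucket: <log-format-type> plus the UTC hour of the entry."""
    hour = ts.astimezone(timezone.utc).strftime("%Y%m%d%H")
    return f"{log_type}-{hour}"


def object_key(ts: datetime, msg_id: str) -> str:
    """Hypothetical key: <timestamp>-<msg-id>, timestamp as epoch seconds."""
    return f"{int(ts.timestamp())}-{msg_id}"


def count_by_geo(entries: list[dict]) -> Counter:
    """Hourly aggregate by geo: every entry in the hour is read and grouped."""
    return Counter(entry["geo"] for entry in entries)


# Sample values, made up for illustration only.
ts = datetime(2015, 2, 20, 17, 35, tzinfo=timezone.utc)
bucket = bucket_name("access", ts)
key = object_key(ts, "m1")
hourly = count_by_geo([{"geo": "us"}, {"geo": "eu"}, {"geo": "us"}])
```

Under this reading, an hourly view by geo has to touch every object stored for that hour, because geo is not part of the bucket or the key. Whether that is acceptable depends on the query pattern asked about above.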

- Chris

Christopher Meiklejohn
Senior Software Engineer
Basho Technologies, Inc.
cmeiklejohn at basho.com




