How to process batch of events in N seconds after latest
ivanov.maxim at gmail.com
Tue May 15 08:56:10 EDT 2012
What's the best approach to processing a batch of events N seconds after
the latest event in a group happens? Events are grouped by key.
I am thinking about the following scheme:
1) events are recorded in such a way that every write creates a new
sibling, to avoid multiple read/write cycles per event
2) with every write, a new secondary index is created with value =
"sweep_at_$current_time + N"
3) every second, a process queries Riak for secondary keys with values <=
the current time
4) for every item returned, it queries all its siblings:
- if there are siblings, merge them into 1 record, then calculate and
write a new secondary index "sweep_at_$latest_sibling_time + N". Go to
the next substep if the newly calculated timeout value is <= current time.
- if there are no siblings, process the events and remove the key from Riak
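
To make the four steps concrete, here is a minimal sketch of the scheme in Python. It does not use any Riak client: FakeStore is a hypothetical in-memory stand-in for a bucket with siblings and a single "sweep_at" index per key, and the names record, due_keys, sweep and process are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class FakeStore:
    """Hypothetical in-memory stand-in for a bucket with siblings and one index."""
    siblings: dict = field(default_factory=dict)  # key -> list of (timestamp, [events])
    sweep_at: dict = field(default_factory=dict)  # key -> due time (the secondary index)

    def record(self, key, event, now, n):
        # Steps 1-2: a blind write adds a sibling and sets sweep_at = now + N.
        self.siblings.setdefault(key, []).append((now, [event]))
        self.sweep_at[key] = now + n

    def due_keys(self, now):
        # Step 3: range query on the index for values <= now.
        return [k for k, t in self.sweep_at.items() if t <= now]


def sweep(store, now, n, process):
    """Step 4: merge siblings, then process and delete keys that are due."""
    for key in store.due_keys(now):
        versions = store.siblings[key]
        if len(versions) > 1:
            # Merge all siblings into one record, ordered by write time.
            latest = max(ts for ts, _ in versions)
            ordered = sorted(versions, key=lambda v: v[0])
            merged = [e for _, events in ordered for e in events]
            store.siblings[key] = [(latest, merged)]
            store.sweep_at[key] = latest + n
            if store.sweep_at[key] > now:
                continue  # not due yet; a later sweep will pick it up
        # A single record that is due: process the batch and remove the key.
        _, events = store.siblings.pop(key)[0]
        del store.sweep_at[key]
        process(key, events)
```

Calling sweep(store, now, n, process) once per second corresponds to step 3. In this sketch the index always holds the latest write time plus N, so a merge is immediately followed by processing; a real store with concurrent writers would not guarantee that.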
Therefore, for every batch of N events on average (given that 99% of
event batches span less than N seconds), there will be:
N+1 writes, 2 secondary index seeks, and 2 reads
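
As a rough illustration of that count, the sketch below tallies the operations per batch. The sibling-scheme figures are taken directly from the estimate above; the read-modify-write baseline (one read plus one write per event, ignoring any sweep cost) is an assumption added only for comparison, and the function names are hypothetical. Note that batch_size here is the number of events in a batch, not the N-second delay.

```python
def sibling_scheme_ops(batch_size):
    # Figures as stated above: N+1 writes, 2 index seeks, 2 reads per batch.
    return {"writes": batch_size + 1, "index_seeks": 2, "reads": 2}


def read_modify_write_ops(batch_size):
    # Assumed baseline: every event costs one read plus one write.
    return {"writes": batch_size, "index_seeks": 0, "reads": batch_size}


def total(ops):
    # Total number of store operations for one batch.
    return sum(ops.values())
```

For a batch of 10 events, total(sibling_scheme_ops(10)) gives 15 operations against 20 for total(read_modify_write_ops(10)), and the gap widens as batches grow.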
Is this the correct approach for Riak? It could be improved further by
carefully setting the secondary index in step 2 so that the merge of all
siblings is immediately followed by processing of the event batch,
but right now I am more interested in whether it fits nicely with Riak.