Feedback for GSoC project - Riak Destination for Syslog-ng
algernon at madhouse-project.org
Wed May 6 08:59:23 EDT 2015
I'm the mentor for the Riak destination for syslog-ng project, please
allow me to answer the questions below:
>>>>> "Fred" == Fred Dushin <fdushin at basho.com> writes:
Fred> As far as I understand, you're talking about a mapping from keys to
Fred> sets, but I'm unclear on a few things.
The idea is to map a set of log messages to a Riak Set, where both the
key used for the set and its contents are configurable by the
user. There are no plans for a default at this time.
There are many ways to configure a syslog-ng=>Riak setup with a
destination like the one planned. One is to turn each log message (after
parsing) into a Riak Map, and push those maps into a Riak Set. Another way
is to format the parsed log messages (with all the extracted fields, if
any) as JSON, and push those into a set.
So, for example, given the following syslog line:
May 6 14:42:18 eowyn avahi-daemon: Invalid response packet from host fe80::5d0f:d53a:7b6:3680.
We'd end up with a JSON like this:
"message": "Invalid response packet from host fe80::5d0f:d53a:7b6:3680.",
"message": "Invalid response packet",
We could either add that to a Riak set as-is, or turn it into a Riak map
Fred> What are the keys you are thinking about? Time stamps? If
Fred> timestamps, these are presumably the timestamps of the syslog
Whatever the user configures. They may be time stamps (rounded, for
predictable keys), or a combination of program name + current date (day
Fred> Just a word of warning, if so. You might find a lot of
Fred> variation in timestamp formats and granularity. Perhaps you
Fred> can get something reliable out of syslog-ng,
We get something sensible out of syslog-ng. But in the end, it is up to
the user to configure the template used for keys. There may - and
probably will - be examples, but no default.
Fred> but that won't help you in the case where syslog-ng is
Fred> functioning as a syslog relay, and you want to preserve the
Fred> timestamp of the originator, which you should, if you want to
Fred> preserve integrity of the logs (e.g, for compliance).
In case of syslog-ng, we actually have access to a few kinds of
timestamps: the timestamp from the log message (if any), the timestamp
of receipt, and the current time. The granularity of timestamps is
configurable to some extent.
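
To illustrate the choice between timestamps and the rounding mentioned earlier, here is a minimal sketch in Python. It is not syslog-ng code: the function names `pick_timestamp` and `round_down` are hypothetical, and the fallback rule (prefer the originator's timestamp, otherwise use the time of receipt) is only one possible policy a user might configure.

```python
from datetime import datetime, timezone

def pick_timestamp(from_message, received):
    """Prefer the originator's timestamp; fall back to the time of receipt.

    Hypothetical helper: this policy is an assumption, not syslog-ng behavior.
    """
    return from_message if from_message is not None else received

def round_down(ts, bucket_seconds):
    """Round a timezone-aware datetime down to a multiple of bucket_seconds.

    Returns integral seconds since the Unix epoch, so every message in the
    same interval yields the same (predictable) key.
    """
    epoch = int(ts.timestamp())
    return epoch - (epoch % bucket_seconds)

# Example values for the log line above, assumed to be in UTC.
stamp = datetime(2015, 5, 6, 14, 42, 18, tzinfo=timezone.utc)
received = datetime(2015, 5, 6, 14, 42, 19, tzinfo=timezone.utc)

chosen = pick_timestamp(stamp, received)
minute_bucket = round_down(chosen, 60)    # all messages in 14:42 share this key
hour_bucket = round_down(chosen, 3600)    # all messages in the 14:00 hour share this key
```

A relay that wants to preserve the originator's time would pass the message timestamp first; a message that carries no timestamp falls back to the time of receipt.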
Fred> Or are you talking about a key being a (coarse-grained)
Fred> timestamp, say, an integral value in UTC seconds, for example?
Fred> And the value(s) being all logs in that interval? Is that your
Fred> motivation for sets?
That's one way, yes. One could also use something like
$PROGRAM/$YEAR-$MONTH-$DAY as key, if the program doesn't produce more
than a megabyte of logs a day. So with the example above, the key for
that log message would be avahi-daemon/2015-05-06, and the message would
be an element of the set stored under that key.
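
The key scheme above can be sketched roughly as follows. This is illustrative Python, not the planned plugin: the in-memory `defaultdict(set)` merely stands in for Riak Sets, and `expand_key`, `store_message`, and the field names in `message` are hypothetical. Template expansion uses Python's `string.Template`, which happens to share the `$NAME` syntax but is not syslog-ng's template engine.

```python
import json
from collections import defaultdict
from datetime import datetime
from string import Template

# Hypothetical parsed message; field names are illustrative only.
message = {
    "PROGRAM": "avahi-daemon",
    "HOST": "eowyn",
    "MESSAGE": "Invalid response packet from host fe80::5d0f:d53a:7b6:3680.",
    "timestamp": datetime(2015, 5, 6, 14, 42, 18),
}

def expand_key(template, msg):
    """Expand a user-configured key template against one message."""
    ts = msg["timestamp"]
    values = {
        "PROGRAM": msg["PROGRAM"],
        "YEAR": f"{ts.year:04d}",
        "MONTH": f"{ts.month:02d}",
        "DAY": f"{ts.day:02d}",
    }
    return Template(template).substitute(values)

def store_message(store, template, msg):
    """Serialize the message as JSON and add it to the set under its key."""
    key = expand_key(template, msg)
    element = json.dumps(
        {"host": msg["HOST"], "program": msg["PROGRAM"], "message": msg["MESSAGE"]},
        sort_keys=True,  # stable serialization, so identical messages compare equal
    )
    store[key].add(element)
    return key

store = defaultdict(set)  # in-memory stand-in for Riak Sets, keyed by string
key = store_message(store, "$PROGRAM/$YEAR-$MONTH-$DAY", message)
```

Because the value is a set, adding the same serialized message twice leaves a single element under the key.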
Fred> How much of the syslog payload are you planning to parse?
The destination itself is not going to do any parsing. Other parts of
syslog-ng do that, and it is up to the user to set up a pipeline that
feeds the destination. The source may be syslog, HTTP logs, the Journal,
or any of the other sources syslog-ng supports. How much parsing is
done, and what gets extracted, is no concern to the destination plugin.
Fred> Another interesting problem is that the STRUCTURED-DATA element of
Fred> 5424 uses OIDs to discriminate different data types that are encoded
Fred> in the header. And while there is a kind of loosely coupled authority
Fred> for OIDs, there is no infrastructure for determining a parsing
Fred> strategy for these fields. They could really be anything, in the worst
As far as I remember, syslog-ng treats all STRUCTURED-DATA elements as
strings. There are tools within syslog-ng for converting them to
other data types, but that must be done explicitly.
Fred> But regardless of the deeply structured data, you could get some very
Fred> interesting traction by just taking standard headers and indexing them
Fred> through Yokozuna. Certainly, indexing the body of a syslog message is
Fred> a great idea, as these messages are generally unstructured and fodder
Fred> for lucene. This is something that Logstash/ElasticSearch can do
Fred> pretty effectively today, and it would be cool to see the same in Riak
Fred> + some syslog provider.
Yep! When I proposed the idea, using Yokozuna is something I had in
mind. Combine the parsing abilities of syslog-ng, Riak for archival
purposes, and Yokozuna for searching. That sounds like a match made in heaven.
Fred> Finally, it would be really nice if you could structure your plugin in
Fred> such a way that they could eventually be ported to rsyslog . The
Fred> rsyslogd daemon is deployed by default on certain Linux flavors and
Fred> enjoys fairly widespread distribution. You might be able to get it
Fred> supported in that community, as well.
Part of the project is writing a small library to send data to Riak
from C, with just enough functionality for syslog-ng's needs. That
library could be used by rsyslog, too (just as the MongoDB library
originally written for syslog-ng's purposes is used by rsyslog). But
sharing more code than that is not practical: the two daemons work in
very different ways.