Correlating log messages with syslog-ng

January 26, 2011

This article was contributed by Robert Fekete

Correlating log messages to get deeper insight into the actual events happening on a network or server is an important element of IT security. Being able to do so is mandated by several security compliance standards, best practices, and also common sense. However, many common log analysis and correlation engines cannot handle high message rates in real time, requiring administrators to filter the input of the analysis engine. Proprietary solutions are often licensed based on the number of processed messages, which limits their usefulness. The syslog-ng project aims to provide a flexible, real-time correlation solution that scales well even to extreme performance requirements.

Syslog-ng is an advanced system logging tool, which can be a replacement for the standard syslogd and rsyslog daemons. The syslog-ng pattern database, introduced almost two years ago, allows for real-time message identification and classification by comparing the incoming log messages to a set of message patterns. The classification engine of syslog-ng is much faster and more scalable than using regular expressions to identify messages, and also permits the administrator to extract relevant information from the message body or to add custom metadata (for example, tags) to log messages. We looked at message classification in syslog-ng just over a year ago.

The new message correlation feature extends the syslog-ng pattern database to make it possible to associate related log messages, and to treat the information from those messages as if they were a single event.

Message correlation is one of the foundations of log analysis and reporting, because log messages tend to be fragmented, and often split important information about an event across several log messages. For example, the Postfix e-mail server logs the sender and recipient addresses into separate log messages. For OpenSSH, if there is an unsuccessful login attempt, the server sends a log message about the authentication failure with the reason for the failure in the next message. What is interesting, though, is the event and its exact details, not necessarily the individual log messages. Being able to collect information as events rather than messages can therefore be a boon for every system administrator.

How correlation works in syslog-ng

Message correlation in syslog-ng operates on the log messages successfully identified by syslog-ng's pattern database: you can extend the rules describing message patterns with instructions on how to correlate the matching messages.

Correlating log messages involves collecting the messages into message groups called contexts. A context consists of a series of log messages that are related to each other in some way, for example, the log messages of an SSH session can belong to the same context. Messages may be added to a context as they are processed. The context of a log message can be specified using simple static strings or with macros and dynamic values. For example, you can group messages received from the same host ($HOST), application ($HOST$PROGRAM), or process ($HOST$PROGRAM$PID).
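The grouping described above can be illustrated outside of syslog-ng with a short Python sketch. This is not how syslog-ng implements contexts; the `SCOPES` table, the function names, and the sample messages are assumptions made for illustration, and only the field names HOST, PROGRAM, and PID come from the macros mentioned above.

```python
# Hypothetical sketch: group parsed log messages into contexts.
from collections import defaultdict

# Each scope lists the message fields that form the context key,
# mirroring the $HOST, $HOST$PROGRAM and $HOST$PROGRAM$PID examples.
SCOPES = {
    "host": ("HOST",),
    "program": ("HOST", "PROGRAM"),
    "process": ("HOST", "PROGRAM", "PID"),
}


def context_key(message, scope):
    """Build the context key for a message under the given scope."""
    return tuple(message[field] for field in SCOPES[scope])


def group_by_context(messages, scope):
    """Collect messages into contexts, preserving arrival order."""
    contexts = defaultdict(list)
    for message in messages:
        contexts[context_key(message, scope)].append(message)
    return dict(contexts)


# Made-up sample data: two sshd processes on the same host.
messages = [
    {"HOST": "web1", "PROGRAM": "sshd", "PID": "101", "MESSAGE": "session opened"},
    {"HOST": "web1", "PROGRAM": "sshd", "PID": "202", "MESSAGE": "session opened"},
    {"HOST": "web1", "PROGRAM": "sshd", "PID": "101", "MESSAGE": "session closed"},
]
by_process = group_by_context(messages, "process")
by_host = group_by_context(messages, "host")
```

With the process scope, the two messages of PID 101 end up in one context and the message of PID 202 in another; with the host scope, all three share a single context.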

Messages belonging to the same context are correlated, and can be processed in a number of ways. It is possible to include the information contained in an earlier message of the context in messages that are added later. For example, if a mail server application sends separate log messages about every recipient of an e-mail (like Postfix), you can merge the recipient addresses into the previous log message. Another option is to generate a completely new log message that contains all the important information that was stored previously in the context, for example, the login and logout (or timeout) times of an authenticated session (like SSH or telnet), and so on.
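As a rough illustration of the mail server case, the following Python sketch merges sender and recipient information into one event per message. It assumes simplified Postfix-style lines in which a queue ID prefixes `from=<...>` and `to=<...>` fields; the regular expressions, the function name, and the sample lines are illustrative assumptions, and real Postfix lines carry more fields.

```python
import re

# Assumed, simplified shapes of Postfix log message bodies.
FROM_RE = re.compile(r"^(?P<qid>[0-9A-F]+): from=<(?P<sender>[^>]*)>")
TO_RE = re.compile(r"^(?P<qid>[0-9A-F]+): to=<(?P<recipient>[^>]*)>")


def correlate_mail(lines):
    """Return one event per queue ID holding the sender and all recipients."""
    events = {}
    for line in lines:
        match = FROM_RE.match(line)
        if match:
            event = events.setdefault(match["qid"], {"sender": None, "recipients": []})
            event["sender"] = match["sender"]
            continue
        match = TO_RE.match(line)
        if match:
            event = events.setdefault(match["qid"], {"sender": None, "recipients": []})
            event["recipients"].append(match["recipient"])
    return events


# Made-up sample lines; the queue ID acts as the context.
sample = [
    "4F2A1B9C01: from=<alice@example.com>, size=1024, nrcpt=2 (queue active)",
    "4F2A1B9C01: to=<bob@example.org>, status=sent",
    "4F2A1B9C01: to=<carol@example.net>, status=sent",
]
mail_events = correlate_mail(sample)
```

Here the queue ID plays the role of the context: every line that shares it contributes to the same event, so the sender and both recipients end up in a single record.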

To ensure that a context handles only log messages of related events, a timeout value can be assigned to a context, which determines how long the context accepts related messages. If the timeout expires, the context is closed.
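A context timeout could be modeled along the following lines. This is a minimal sketch, not syslog-ng's implementation: the `ContextStore` class and its methods are hypothetical, and it assumes that the timeout is measured from the first message of a context, which the text above does not specify.

```python
class ContextStore:
    """Keep open contexts and close those whose timeout has elapsed."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds a context keeps accepting messages
        self.open = {}          # key -> (start time, list of messages)
        self.closed = []        # (key, list of messages) for expired contexts

    def expire(self, now):
        """Close every context that was opened more than `timeout` seconds ago."""
        expired = [key for key, (started, _) in self.open.items()
                   if now - started > self.timeout]
        for key in expired:
            self.closed.append((key, self.open.pop(key)[1]))

    def add(self, key, timestamp, text):
        """Add a message to its context, opening a new context if needed."""
        self.expire(timestamp)
        _, messages = self.open.setdefault(key, (timestamp, []))
        messages.append(text)


store = ContextStore(timeout=60)
store.add("web1/sshd/101", 0, "session opened")
store.add("web1/sshd/101", 30, "still active")
store.add("web1/sshd/101", 100, "new session opened")  # the first context has expired
```

After the third call, the first two messages sit in a closed context and the third message starts a fresh one under the same key. The timestamps are passed in explicitly rather than read from the system clock, so the same code can be driven by the times recorded in the messages themselves.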

Triggering new messages and external actions

In syslog-ng Open Source Edition (OSE) 3.2, you can automatically generate new messages when a particular message is recognized, or the correlation timeout of a context expires. The generated messages can be configured within the pattern database rules, meaning that if needed, a new message can be generated for every incoming log message. Obviously this is not necessary, unless you take log normalization really seriously.

When used together with message correlation, you can also refer to fields and values of earlier messages of the context. For example, patterns such as:

    pam_unix(sshd:session): session closed for user @ESTRING:SSH_USERNAME: @

could be used to match OpenSSH's log messages. Then the action:

    <value name="MESSAGE">
        An SSH session for $SSH_USERNAME from ${SSH_CLIENT_ADDRESS}@1 \
        closed. Session lasted from ${DATE}@1 to $DATE.
    </value>

would put out a correlated message that included information from both log messages. The above is just a snippet; consult the full XML rules for all the gory details.
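The effect of such a rule can be imitated in a few lines of Python to show what the correlated output looks like. This sketch does not use the pattern database at all: the regular expressions stand in for the patterns, the login pattern assumes OpenSSH's usual "Accepted ... for ... from ... port ..." message, and keying the context on host and PID is an assumption made for illustration.

```python
import re

# Stand-ins for pattern database rules (assumed message shapes).
LOGIN_RE = re.compile(r"^Accepted \S+ for (?P<user>\S+) from (?P<addr>\S+) port \d+")
LOGOUT_RE = re.compile(r"^pam_unix\(sshd:session\): session closed for user (?P<user>\S+)")


def correlate_ssh(messages):
    """Yield one summary message per closed SSH session.

    `messages` is an iterable of (host, pid, date, text) tuples; the
    context is keyed on host and PID, that is, on one sshd process.
    """
    contexts = {}
    for host, pid, date, text in messages:
        key = (host, pid)
        login = LOGIN_RE.match(text)
        if login:
            # Remember the values that the later message will refer back to.
            contexts[key] = {"addr": login["addr"], "date": date}
            continue
        logout = LOGOUT_RE.match(text)
        if logout and key in contexts:
            first = contexts.pop(key)
            yield (f"An SSH session for {logout['user']} from {first['addr']} "
                   f"closed. Session lasted from {first['date']} to {date}.")


# Made-up sample messages from a single sshd process.
ssh_log = [
    ("web1", "101", "Jan 26 10:00:00",
     "Accepted password for alice from 192.0.2.10 port 51234 ssh2"),
    ("web1", "101", "Jan 26 10:30:00",
     "pam_unix(sshd:session): session closed for user alice"),
]
summaries = list(correlate_ssh(ssh_log))
```

The client address and the login time are taken from the earlier message of the context, much as `${SSH_CLIENT_ADDRESS}@1` and `${DATE}@1` do in the action above.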

Sending alerts directly from syslog-ng is currently not supported, but would be a welcome addition to future versions. However, it is reasonably simple to pass the selected messages to an external script that sends out alerts by e-mail or SNMP. And since completely new messages can be created from the information extracted from the correlated messages, all the script has to do is to send out the alerts, for example using sendmail or snmptrap.
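Such a script can be very small. The following Python sketch assumes that the selected messages arrive one per line on standard input (for example, from syslog-ng's program() destination) and that a sendmail binary is installed at /usr/sbin/sendmail; the recipient address, the subject line, and all function names are hypothetical.

```python
import subprocess
import sys
from email.message import EmailMessage

ALERT_TO = "admin@example.com"  # hypothetical recipient address


def build_alert(log_line):
    """Wrap one log message in a plain-text e-mail."""
    msg = EmailMessage()
    msg["To"] = ALERT_TO
    msg["Subject"] = "syslog-ng alert"
    msg.set_content(log_line)
    return msg


def send_with_sendmail(msg):
    """Hand the message to sendmail; -t takes the recipients from the headers."""
    subprocess.run(["/usr/sbin/sendmail", "-t"], input=msg.as_bytes(), check=True)


def run(stream=sys.stdin, send=send_with_sendmail):
    """Send one alert per non-empty line and return the number of alerts sent."""
    count = 0
    for line in stream:
        line = line.strip()
        if line:
            send(build_alert(line))
            count += 1
    return count
```

Calling `run()` without arguments reads from standard input until it is closed. The `send` parameter exists so that the delivery step can be swapped out, for instance for a function that calls snmptrap or for a stub during testing.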

To process already collected log messages, syslog-ng also allows for correlating log messages from log files. For this reason, the time elapsed between two log messages is calculated from the actual timestamps of the log messages instead of using the system time.

Beyond syslog-ng 3.2

Work on syslog-ng OSE 3.3 has already started, and focuses on improving the support for multicore and multithreaded operations to increase the performance of syslog-ng and make it even more suitable for high-message-rate environments. Transforming the internal representation of log messages to other, non-syslog output formats like JSON or WELF is also on the roadmap.

As correlating log messages becomes increasingly important for companies and organizations, it is good to see that open source tools are also focusing on solving this problem. Although the syslog-ng project has had a sometimes rocky relationship with the open source community in the past, its OSE is under active development. In fact, the message correlation feature, among others, is currently available only in the OSE.

[ The author is a technical writer for BalaBit, which developed syslog-ng. ]

Brief items

Quotes of the week

As more groups warm to the beauty that is embodied in Qt, I hope that the message of working together (rather than dictating, for life or otherwise) also spreads. That mode of operation is what got Qt and KDE Platform, as high quality developer tools, to where they are today. It is what motivates us to look at the development platforms we build for application developers and ask ourselves, "How can we make this as painless as possible for the developer while giving them access to as many platforms as seamlessly as possible?" It's a way of thinking that helps create a superior result, and we're always looking for new ways to expand the benefits it brings.
-- Aaron Seigo

...but this caught my eye.
Care to guess what that does?
-- "adamcecc"

KDE 4.6 released

KDE.News announces the release of KDE 4.6, including KDE Plasma Workspaces, updated KDE applications, and the mobile platform.

LibreOffice 3.3 released

The Document Foundation has announced the release of LibreOffice 3.3, which is the first stable release of the OpenOffice.org fork. "LibreOffice 3.3 brings several unique new features. The 10 most-popular among community members are, in no particular order: the ability to import and work with SVG files; an easy way to format title pages and their numbering in Writer; a more-helpful Navigator Tool for Writer; improved ergonomics in Calc for sheet and cell management; and Microsoft Works and Lotus Word Pro document import filters. In addition, many great extensions are now bundled, providing PDF import, a slide-show presenter console, a much improved report builder, and more besides. A more-complete and detailed list of all the new features offered by LibreOffice 3.3 is viewable on the following web page: http://www.libreoffice.org/download/new-features-and-fixes/".

OpenOffice.org 3.3.0 final released (The H)

The H looks at the OpenOffice.org 3.3 release. "OpenOffice.org 3.3.0 features an updated, easier to use, Extension Manager user interface (UI) and several improvements to Calc spreadsheets, such as an increase in the number of rows supported from 65,536 to 1,048,576. The print system has been restructured, the thesaurus dialogue has been redesigned for better usability and slide layout handling has been improved in the presentation application, Impress." More information can be found in the OOo New Features page and the release notes.

OpenSSH 5.7 released

OpenSSH 5.7 has been released. Some new features in this release include Elliptic Curve Cryptography modes for key exchange (ECDH) and host/user keys (ECDSA), a protocol extension to support a hard link operation added to sftp, new options for scp and ssh, and more. The announcement contains additional information.

Sala 1.0 released

Sala is a command-line tool for the management of an encrypted password database. Actual passwords are stored in their own file, making the use of tab-completion for lookups possible. The 1.0 release is available now.

Newsletters and articles

Will it Blend? A Look at Blender's New User Interface (Linux.com)

Nathan Willis looks at Blender's new UI over at Linux.com. "And as with every new Blender release, there are indeed new tools in 2.56a. For example, the Solidify tool allows you to select a thin, planar object and automatically extrude thickness into it. There is a new paintbrush system, which lets you modify any brush's size, strength, texture, and low-level behavior curves. Sculpt Mode, in which you modify objects by whittling or squishing them around, was also rewritten, making it easier to do multi-resolution sculpting (for example, sculpting at a rough resolution to define a character's body, but working with much finer detail on its face)."

Barnes: Debugging display problems

On his blog, Jesse Barnes has a nice description of how computer displays work in terms of the memory organization and timings, along with some tips on debugging display problems (with photos and links to videos). "There are several variables that apply: bits per pixel, indexed or not, tiling format, and color format (in the Intel case, RGB or YUV), and stride or pitch. Bits per pixel is as simple as it sounds, it simply defines how large each pixel is in bits. Indexed planes, rather than encoding the color directly in the bits for the pixel, use the value as an index into a palette table which contains a value for the color to be displayed. The tiling mode indicates the surface organization of the plane. Tiled surfaces allow for much more efficient rendering, and allowing planes to use them directly can save copies from tiled rendering targets to an un-tiled display plane. Finally, the color format defines what values the pixels represent."

