diff --git a/plugins/inputs/logparser/README.md b/plugins/inputs/logparser/README.md
index 5973d9f42..177d77a98 100644
--- a/plugins/inputs/logparser/README.md
+++ b/plugins/inputs/logparser/README.md
@@ -1,6 +1,6 @@
-# logparser Input Plugin
+# Logparser Input Plugin

-The logparser plugin streams and parses the given logfiles. Currently it only
+The `logparser` plugin streams and parses the given logfiles. Currently it
 has the capability of parsing "grok" patterns from logfiles, which also supports
 regex patterns.

@@ -37,35 +37,28 @@ regex patterns.
     '''
 ```

-## Grok Parser
-
-The grok parser uses a slightly modified version of logstash "grok" patterns,
-with the format
-
-```
-%{<capture_syntax>[:<semantic_name>][:<modifier>]}
-```
-
-Telegraf has many of it's own
-[built-in patterns](https://github.com/influxdata/telegraf/blob/master/plugins/inputs/logparser/grok/patterns/influx-patterns),
-as well as supporting
-[logstash's builtin patterns](https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns).
-
+### Grok Parser

 The best way to get acquainted with grok patterns is to read the logstash docs,
 which are available here:
   https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html

+The Telegraf grok parser uses a slightly modified version of logstash "grok"
+patterns, with the format

-If you need help building patterns to match your logs,
-you will find the http://grokdebug.herokuapp.com application quite useful!
+```
+%{<capture_syntax>[:<semantic_name>][:<modifier>]}
+```

+The `capture_syntax` defines the grok pattern that's used to parse the input
+line and the `semantic_name` is used to name the field or tag. The extension
+`modifier` controls the data type that the parsed item is converted to or
+other special handling.

 By default all named captures are converted into string fields.
-Modifiers can be used to convert captures to other types or tags.
 Timestamp modifiers can be used to convert captures to the timestamp of the
- parsed metric.
-
+parsed metric. If no timestamp is parsed the metric will be created using the
+current time.

 - Available modifiers:
   - string (default if nothing is specified)
@@ -91,7 +84,112 @@ Timestamp modifiers can be used to convert captures to the timestamp of the
   - ts-epochnano (nanoseconds since unix epoch)
   - ts-"CUSTOM"

-
 CUSTOM time layouts must be within quotes and be the representation of the
 "reference time", which is `Mon Jan 2 15:04:05 -0700 MST 2006`
 See https://golang.org/pkg/time/#Parse for more details.
+
+Telegraf has many of its own
+[built-in patterns](./grok/patterns/influx-patterns),
+as well as supporting
+[logstash's builtin patterns](https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns).
+
+If you need help building patterns to match your logs,
+you will find the https://grokdebug.herokuapp.com application quite useful!
+
+#### Timestamp Examples
+
+This example input and config parses a file using a custom timestamp conversion:
+
+```
+2017-02-21 13:10:34 value=42
+```
+
+```toml
+[[inputs.logparser]]
+  [inputs.logparser.grok]
+    patterns = ['%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"} value=%{NUMBER:value:int}']
+```
+
+This example parses a file using a built-in conversion and a custom pattern:
+
+```
+Wed Apr 12 13:10:34 PST 2017 value=42
+```
+
+```toml
+[[inputs.logparser]]
+  [inputs.logparser.grok]
+    patterns = ["%{TS_UNIX:timestamp:ts-unix} value=%{NUMBER:value:int}"]
+    custom_patterns = '''
+      TS_UNIX %{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND} %{TZ} %{YEAR}
+    '''
+```
+
+#### TOML Escaping
+
+When saving patterns to the configuration file, keep in mind the different TOML
+[string](https://github.com/toml-lang/toml#string) types and the escaping
+rules for each. These escaping rules must be applied in addition to the
+escaping required by the grok syntax. Using the Multi-line line literal
+syntax with `'''` may be useful.
+
+The following config examples will parse this input file:
+
+```
+|42|\uD83D\uDC2F|'telegraf'|
+```
+
+Since `|` is a special character in the grok language, we must escape it to
+get a literal `|`. With a basic TOML string, special characters such as
+backslash must be escaped, requiring us to escape the backslash a second time.
+
+```toml
+[[inputs.logparser]]
+  [inputs.logparser.grok]
+    patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
+    custom_patterns = "UNICODE_ESCAPE (?:\\\\u[0-9A-F]{4})+"
+```
+
+We cannot use a literal TOML string for the pattern, because we cannot match a
+`'` within it. However, it works well for the custom pattern.
+```toml
+[[inputs.logparser]]
+  [inputs.logparser.grok]
+    patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
+    custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
+```
+
+A multi-line literal string allows us to encode the pattern:
+```toml
+[[inputs.logparser]]
+  [inputs.logparser.grok]
+    patterns = ['''
+      \|%{NUMBER:value:int}\|%{UNICODE_ESCAPE:escape}\|'%{WORD:name}'\|
+    ''']
+    custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
+```
+
+### Tips for creating patterns
+
+Writing complex patterns can be difficult, here is some advice for writing a
+new pattern or testing a pattern developed [online](https://grokdebug.herokuapp.com).
+
+Create a file output that writes to stdout, and disable other outputs while
+testing. This will allow you to see the captured metrics. Keep in mind that
+the file output will only print once per `flush_interval`.
+
+```toml
+[[outputs.file]]
+  files = ["stdout"]
+```
+
+- Start with a file containing only a single line of your input.
+- Remove all but the first token or piece of the line.
+- Add the section of your pattern to match this piece to your configuration file.
+- Verify that the metric is parsed successfully by running Telegraf.
+- If successful, add the next token, update the pattern and retest.
+- Continue one token at a time until the entire line is successfully parsed.
+
+### Additional Resources
+
+- https://www.influxdata.com/telegraf-correlate-log-metrics-data-performance-bottlenecks/
diff --git a/plugins/inputs/logparser/grok/grok.go b/plugins/inputs/logparser/grok/grok.go
index 7131b8249..f684e9339 100644
--- a/plugins/inputs/logparser/grok/grok.go
+++ b/plugins/inputs/logparser/grok/grok.go
@@ -168,6 +168,7 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
 	}

 	if len(values) == 0 {
+		log.Printf("D! Grok no match found for: %q", line)
 		return nil, nil
 	}
