Improve logparser README (#2664)
This commit is contained in:
parent
a12e082dbe
commit
b90a5b48a1
|
@ -1,6 +1,6 @@
|
||||||
# logparser Input Plugin
|
# Logparser Input Plugin
|
||||||
|
|
||||||
The logparser plugin streams and parses the given logfiles. Currently it only
|
The `logparser` plugin streams and parses the given logfiles. Currently it
|
||||||
has the capability of parsing "grok" patterns from logfiles, which also supports
|
has the capability of parsing "grok" patterns from logfiles, which also supports
|
||||||
regex patterns.
|
regex patterns.
|
||||||
|
|
||||||
|
@ -37,35 +37,28 @@ regex patterns.
|
||||||
'''
|
'''
|
||||||
```
|
```
|
||||||
|
|
||||||
## Grok Parser
|
### Grok Parser
|
||||||
|
|
||||||
The grok parser uses a slightly modified version of logstash "grok" patterns,
|
|
||||||
with the format
|
|
||||||
|
|
||||||
```
|
|
||||||
%{<capture_syntax>[:<semantic_name>][:<modifier>]}
|
|
||||||
```
|
|
||||||
|
|
||||||
Telegraf has many of it's own
|
|
||||||
[built-in patterns](https://github.com/influxdata/telegraf/blob/master/plugins/inputs/logparser/grok/patterns/influx-patterns),
|
|
||||||
as well as supporting
|
|
||||||
[logstash's builtin patterns](https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns).
|
|
||||||
|
|
||||||
|
|
||||||
The best way to get acquainted with grok patterns is to read the logstash docs,
|
The best way to get acquainted with grok patterns is to read the logstash docs,
|
||||||
which are available here:
|
which are available here:
|
||||||
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
|
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
|
||||||
|
|
||||||
|
The Telegraf grok parser uses a slightly modified version of logstash "grok"
|
||||||
|
patterns, with the format
|
||||||
|
|
||||||
If you need help building patterns to match your logs,
|
```
|
||||||
you will find the http://grokdebug.herokuapp.com application quite useful!
|
%{<capture_syntax>[:<semantic_name>][:<modifier>]}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `capture_syntax` defines the grok pattern that's used to parse the input
|
||||||
|
line and the `semantic_name` is used to name the field or tag. The extension
|
||||||
|
`modifier` controls the data type that the parsed item is converted to or
|
||||||
|
other special handling.
|
||||||
|
|
||||||
By default all named captures are converted into string fields.
|
By default all named captures are converted into string fields.
|
||||||
Modifiers can be used to convert captures to other types or tags.
|
|
||||||
Timestamp modifiers can be used to convert captures to the timestamp of the
|
Timestamp modifiers can be used to convert captures to the timestamp of the
|
||||||
parsed metric.
|
parsed metric. If no timestamp is parsed the metric will be created using the
|
||||||
|
current time.
|
||||||
|
|
||||||
- Available modifiers:
|
- Available modifiers:
|
||||||
- string (default if nothing is specified)
|
- string (default if nothing is specified)
|
||||||
|
@ -91,7 +84,112 @@ Timestamp modifiers can be used to convert captures to the timestamp of the
|
||||||
- ts-epochnano (nanoseconds since unix epoch)
|
- ts-epochnano (nanoseconds since unix epoch)
|
||||||
- ts-"CUSTOM"
|
- ts-"CUSTOM"
|
||||||
|
|
||||||
|
|
||||||
CUSTOM time layouts must be within quotes and be the representation of the
|
CUSTOM time layouts must be within quotes and be the representation of the
|
||||||
"reference time", which is `Mon Jan 2 15:04:05 -0700 MST 2006`
|
"reference time", which is `Mon Jan 2 15:04:05 -0700 MST 2006`
|
||||||
See https://golang.org/pkg/time/#Parse for more details.
|
See https://golang.org/pkg/time/#Parse for more details.
|
||||||
|
|
||||||
|
Telegraf has many of its own
|
||||||
|
[built-in patterns](./grok/patterns/influx-patterns),
|
||||||
|
as well as supporting
|
||||||
|
[logstash's builtin patterns](https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns).
|
||||||
|
|
||||||
|
If you need help building patterns to match your logs,
|
||||||
|
you will find the https://grokdebug.herokuapp.com application quite useful!
|
||||||
|
|
||||||
|
#### Timestamp Examples
|
||||||
|
|
||||||
|
This example input and config parses a file using a custom timestamp conversion:
|
||||||
|
|
||||||
|
```
|
||||||
|
2017-02-21 13:10:34 value=42
|
||||||
|
```
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[[inputs.logparser]]
|
||||||
|
[inputs.logparser.grok]
|
||||||
|
patterns = ['%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"} value=%{NUMBER:value:int}']
|
||||||
|
```
|
||||||
|
|
||||||
|
This example parses a file using a built-in conversion and a custom pattern:
|
||||||
|
|
||||||
|
```
|
||||||
|
Wed Apr 12 13:10:34 PST 2017 value=42
|
||||||
|
```
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[[inputs.logparser]]
|
||||||
|
[inputs.logparser.grok]
|
||||||
|
patterns = ["%{TS_UNIX:timestamp:ts-unix} value=%{NUMBER:value:int}"]
|
||||||
|
custom_patterns = '''
|
||||||
|
TS_UNIX %{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND} %{TZ} %{YEAR}
|
||||||
|
'''
|
||||||
|
```
|
||||||
|
|
||||||
|
#### TOML Escaping
|
||||||
|
|
||||||
|
When saving patterns to the configuration file, keep in mind the different TOML
|
||||||
|
[string](https://github.com/toml-lang/toml#string) types and the escaping
|
||||||
|
rules for each. These escaping rules must be applied in addition to the
|
||||||
|
escaping required by the grok syntax. Using the Multi-line line literal
|
||||||
|
syntax with `'''` may be useful.
|
||||||
|
|
||||||
|
The following config examples will parse this input file:
|
||||||
|
|
||||||
|
```
|
||||||
|
|42|\uD83D\uDC2F|'telegraf'|
|
||||||
|
```
|
||||||
|
|
||||||
|
Since `|` is a special character in the grok language, we must escape it to
|
||||||
|
get a literal `|`. With a basic TOML string, special characters such as
|
||||||
|
backslash must be escaped, requiring us to escape the backslash a second time.
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[[inputs.logparser]]
|
||||||
|
[inputs.logparser.grok]
|
||||||
|
patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
|
||||||
|
custom_patterns = "UNICODE_ESCAPE (?:\\\\u[0-9A-F]{4})+"
|
||||||
|
```
|
||||||
|
|
||||||
|
We cannot use a literal TOML string for the pattern, because we cannot match a
|
||||||
|
`'` within it. However, it works well for the custom pattern.
|
||||||
|
```toml
|
||||||
|
[[inputs.logparser]]
|
||||||
|
[inputs.logparser.grok]
|
||||||
|
patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
|
||||||
|
custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
|
||||||
|
```
|
||||||
|
|
||||||
|
A multi-line literal string allows us to encode the pattern:
|
||||||
|
```toml
|
||||||
|
[[inputs.logparser]]
|
||||||
|
[inputs.logparser.grok]
|
||||||
|
patterns = ['''
|
||||||
|
\|%{NUMBER:value:int}\|%{UNICODE_ESCAPE:escape}\|'%{WORD:name}'\|
|
||||||
|
''']
|
||||||
|
custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
|
||||||
|
```
|
||||||
|
|
||||||
|
### Tips for creating patterns
|
||||||
|
|
||||||
|
Writing complex patterns can be difficult, here is some advice for writing a
|
||||||
|
new pattern or testing a pattern developed [online](https://grokdebug.herokuapp.com).
|
||||||
|
|
||||||
|
Create a file output that writes to stdout, and disable other outputs while
|
||||||
|
testing. This will allow you to see the captured metrics. Keep in mind that
|
||||||
|
the file output will only print once per `flush_interval`.
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[[outputs.file]]
|
||||||
|
files = ["stdout"]
|
||||||
|
```
|
||||||
|
|
||||||
|
- Start with a file containing only a single line of your input.
|
||||||
|
- Remove all but the first token or piece of the line.
|
||||||
|
- Add the section of your pattern to match this piece to your configuration file.
|
||||||
|
- Verify that the metric is parsed successfully by running Telegraf.
|
||||||
|
- If successful, add the next token, update the pattern and retest.
|
||||||
|
- Continue one token at a time until the entire line is successfully parsed.
|
||||||
|
|
||||||
|
### Additional Resources
|
||||||
|
|
||||||
|
- https://www.influxdata.com/telegraf-correlate-log-metrics-data-performance-bottlenecks/
|
||||||
|
|
|
@ -168,6 +168,7 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
|
||||||
}
|
}
|
||||||
|
|
||||||
if len(values) == 0 {
|
if len(values) == 0 {
|
||||||
|
log.Printf("D! Grok no match found for: %q", line)
|
||||||
return nil, nil
|
return nil, nil
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
Loading…
Reference in New Issue