Add name, time, path and string field options to JSON parser (#4351)

This commit is contained in:
maxunt
2018-08-22 19:26:48 -07:00
committed by Daniel Nelson
parent d6d6539e26
commit 2729378b7f
15 changed files with 672 additions and 166 deletions

View File

@@ -107,9 +107,31 @@ but can be overridden using the `name_override` config option.
#### JSON Configuration:
The JSON data format supports specifying "tag keys". If specified, keys
will be searched for in the root-level of the JSON blob. If the key(s) exist,
they will be applied as tags to the Telegraf metrics.
The JSON data format supports specifying "tag_keys", "string_keys", and "json_query".
If specified, keys in "tag_keys" and "string_keys" will be searched for in the root-level
and any nested lists of the JSON blob. All int and float values are added to fields by default.
If the key(s) exist, they will be applied as tags or fields to the Telegraf metrics.
If "string_keys" is specified, the string will be added as a field.
The "json_query" configuration is a gjson path to an JSON object or
list of JSON objects. If this path leads to an array of values or
single data point an error will be thrown. If this configuration
is specified, only the result of the query will be parsed and returned as metrics.
The "json_name_key" configuration specifies the key of the field whos value will be
added as the metric name.
Object paths are specified using gjson path format, which is denoted by object keys
concatenated with "." to go deeper in nested JSON objects.
Additional information on gjson paths can be found here: https://github.com/tidwall/gjson#path-syntax
The JSON data format also supports extracting time values through the
config "json_time_key" and "json_time_format". If "json_time_key" is set,
"json_time_format" must be specified. The "json_time_key" describes the
name of the field containing time information. The "json_time_format"
must be a recognized Go time format.
If there is no year provided, the metrics will have the current year.
More info on time formats can be found here: https://golang.org/pkg/time/#Parse
For example, if you had this configuration:
@@ -127,11 +149,28 @@ For example, if you had this configuration:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "json"
## List of tag names to extract from top-level of JSON server response
## List of tag names to extract from JSON server response
tag_keys = [
"my_tag_1",
"my_tag_2"
]
## The json path specifying where to extract the metric name from
# json_name_key = ""
## List of field names to extract from JSON and add as string fields
# json_string_fields = []
## gjson query path to specify a specific chunk of JSON to be parsed with
## the above configuration. If not specified, the whole file will be parsed.
## gjson query paths are described here: https://github.com/tidwall/gjson#path-syntax
# json_query = ""
## holds the name of the tag of timestamp
# json_time_key = ""
## holds the format of timestamp to be parsed
# json_time_format = ""
```
with this JSON output from a command:
@@ -152,8 +191,9 @@ Your Telegraf metrics would get tagged with "my_tag_1"
exec_mycollector,my_tag_1=foo a=5,b_c=6
```
If the JSON data is an array, then each element of the array is parsed with the configured settings.
Each resulting metric will be output with the same timestamp.
If the JSON data is an array, then each element of the array is
parsed with the configured settings. Each resulting metric will
be output with the same timestamp.
For example, if the following configuration:
@@ -176,6 +216,19 @@ For example, if the following configuration:
"my_tag_1",
"my_tag_2"
]
## List of field names to extract from JSON and add as string fields
# string_fields = []
## gjson query path to specify a specific chunk of JSON to be parsed with
## the above configuration. If not specified, the whole file will be parsed
# json_query = ""
## holds the name of the tag of timestamp
json_time_key = "b_time"
## holds the format of timestamp to be parsed
json_time_format = "02 Jan 06 15:04 MST"
```
with this JSON output from a command:
@@ -185,7 +238,8 @@ with this JSON output from a command:
{
"a": 5,
"b": {
"c": 6
"c": 6,
"time":"04 Jan 06 15:04 MST"
},
"my_tag_1": "foo",
"my_tag_2": "baz"
@@ -193,7 +247,8 @@ with this JSON output from a command:
{
"a": 7,
"b": {
"c": 8
"c": 8,
"time":"11 Jan 07 15:04 MST"
},
"my_tag_1": "bar",
"my_tag_2": "baz"
@@ -201,11 +256,71 @@ with this JSON output from a command:
]
```
Your Telegraf metrics would get tagged with "my_tag_1" and "my_tag_2"
Your Telegraf metrics would get tagged with "my_tag_1" and "my_tag_2" and fielded with "b_c"
The metric's time will be a time.Time object, as specified by "b_time"
```
exec_mycollector,my_tag_1=foo,my_tag_2=baz a=5,b_c=6
exec_mycollector,my_tag_1=bar,my_tag_2=baz a=7,b_c=8
exec_mycollector,my_tag_1=foo,my_tag_2=baz b_c=6 1136387040000000000
exec_mycollector,my_tag_1=bar,my_tag_2=baz b_c=8 1168527840000000000
```
If you want to only use a specific portion of your JSON, use the "json_query"
configuration to specify a path to a JSON object.
For example, with the following config:
```toml
[[inputs.exec]]
## Commands array
commands = ["/usr/bin/mycollector --foo=bar"]
## measurement name suffix (for separating different commands)
name_suffix = "_mycollector"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "json"
## List of tag names to extract from top-level of JSON server response
tag_keys = ["first"]
## List of field names to extract from JSON and add as string fields
string_fields = ["last"]
## gjson query path to specify a specific chunk of JSON to be parsed with
## the above configuration. If not specified, the whole file will be parsed
json_query = "obj.friends"
## holds the name of the tag of timestamp
# json_time_key = ""
## holds the format of timestamp to be parsed
# json_time_format = ""
```
with this JSON as input:
```json
{
"obj": {
"name": {"first": "Tom", "last": "Anderson"},
"age":37,
"children": ["Sara","Alex","Jack"],
"fav.movie": "Deer Hunter",
"friends": [
{"first": "Dale", "last": "Murphy", "age": 44},
{"first": "Roger", "last": "Craig", "age": 68},
{"first": "Jane", "last": "Murphy", "age": 47}
]
}
}
```
You would recieve 3 metrics tagged with "first", and fielded with "last" and "age"
```
exec_mycollector, "first":"Dale" "last":"Murphy","age":44
exec_mycollector, "first":"Roger" "last":"Craig","age":68
exec_mycollector, "first":"Jane" "last":"Murphy","age":47
```
# Value: