Rewrite configuration documentation (#5227)

This commit is contained in:
Daniel Nelson 2019-01-07 14:31:10 -08:00 committed by GitHub
parent 3621bcf5a6
commit 0ceb10e017
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 301 additions and 235 deletions

View File

@ -1,26 +1,37 @@
# Configuration
Telegraf's configuration file is written using
[TOML](https://github.com/toml-lang/toml#toml).
Telegraf's configuration file is written using [TOML][] and is composed of
three sections: [global tags][], [agent][] settings, and [plugins][].
[View the telegraf.conf config file with all available
plugins](/etc/telegraf.conf).
View the default [telegraf.conf][] config file with all available plugins.
## Generating a Configuration File
### Generating a Configuration File
A default config file can be generated by telegraf:
```
```sh
telegraf config > telegraf.conf
```
To generate a file with specific inputs and outputs, you can use the
--input-filter and --output-filter flags:
```
```sh
telegraf --input-filter cpu:mem:net:swap --output-filter influxdb:kafka config
```
### Configuration Loading
The location of the configuration file can be set via the `--config` command
line flag.
When the `--config-directory` command line flag is used files ending with
`.conf` in the specified directory will also be included in the Telegraf
configuration.
On most systems, the default locations are `/etc/telegraf/telegraf.conf` for
the main configuration file and `/etc/telegraf/telegraf.d` for the directory of
configuration files.
### Environment Variables
Environment variables can be used anywhere in the config file, simply prepend
@ -66,81 +77,164 @@ parsed:
password = "monkey123"
```
### Configuration file locations
### Intervals
The location of the configuration file can be set via the `--config` command
line flag.
When the `--config-directory` command line flag is used files ending with
`.conf` in the specified directory will also be included in the Telegraf
configuration.
On most systems, the default locations are `/etc/telegraf/telegraf.conf` for
the main configuration file and `/etc/telegraf/telegraf.d` for the directory of
configuration files.
Intervals are durations of time and can be specified for supporting settings by
combining an integer value and time unit as a string value. Valid time units are
`ns`, `us` (or `µs`), `ms`, `s`, `m`, `h`.
```toml
[agent]
interval = "10s"
```
### Global Tags
Global tags can be specified in the `[global_tags]` section of the config file
in key="value" format. All metrics being gathered on this host will be tagged
with the tags specified here.
Global tags can be specified in the `[global_tags]` table in key="value"
format. All metrics that are gathered will be tagged with the tags specified.
### Agent Configuration
```toml
[global_tags]
dc = "us-east-1"
```
Telegraf has a few options you can configure under the `[agent]` section of the
config.
### Agent
* **interval**: Default data collection interval for all inputs
* **round_interval**: Rounds collection interval to 'interval'
ie, if interval="10s" then always collect on :00, :10, :20, etc.
* **metric_batch_size**: Telegraf will send metrics to output in batch of at
most metric_batch_size metrics.
* **metric_buffer_limit**: Telegraf will cache metric_buffer_limit metrics
for each output, and will flush this buffer on a successful write.
This should be a multiple of metric_batch_size and could not be less
than 2 times metric_batch_size.
* **collection_jitter**: Collection jitter is used to jitter
the collection by a random amount.
Each plugin will sleep for a random time within jitter before collecting.
This can be used to avoid many plugins querying things like sysfs at the
same time, which can have a measurable effect on the system.
* **flush_interval**: Default data flushing interval for all outputs.
You should not set this below
interval. Maximum flush_interval will be flush_interval + flush_jitter
* **flush_jitter**: Jitter the flush interval by a random amount.
This is primarily to avoid
large write spikes for users running a large number of telegraf instances.
ie, a jitter of 5s and flush_interval 10s means flushes will happen every 10-15s.
* **precision**:
By default or when set to "0s", precision will be set to the same
timestamp order as the collection interval, with the maximum being 1s.
Precision will NOT be used for service inputs. It is up to each individual
service input to set the timestamp at the appropriate precision.
Valid time units are "ns", "us" (or "µs"), "ms", "s".
The agent table configures Telegraf and the defaults used across all plugins.
* **logfile**: Specify the log file name. The empty string means to log to stderr.
* **debug**: Run telegraf in debug mode.
* **quiet**: Run telegraf in quiet mode (error messages only).
* **hostname**: Override default hostname, if empty use os.Hostname().
* **omit_hostname**: If true, do no set the "host" tag in the telegraf agent.
- **interval**: Default data collection interval for all inputs.
### Input Configuration
- **round_interval**: Rounds collection interval to 'interval'
ie, if interval="10s" then always collect on :00, :10, :20, etc.
The following config parameters are available for all inputs:
- **metric_batch_size**:
Telegraf will send metrics to outputs in batches of at most
metric_batch_size metrics.
This controls the size of writes that Telegraf sends to output plugins.
* **interval**: How often to gather this metric. Normal plugins use a single
global interval, but if one particular input should be run less or more often,
you can configure that here.
* **name_override**: Override the base name of the measurement.
(Default is the name of the input).
* **name_prefix**: Specifies a prefix to attach to the measurement name.
* **name_suffix**: Specifies a suffix to attach to the measurement name.
* **tags**: A map of tags to apply to a specific input's measurements.
- **metric_buffer_limit**:
For failed writes, telegraf will cache metric_buffer_limit metrics for each
output, and will flush this buffer on a successful write. Oldest metrics
are dropped first when this buffer fills.
This buffer only fills when writes fail to output plugin(s).
The [metric filtering](#metric-filtering) parameters can be used to limit what metrics are
- **collection_jitter**:
Collection jitter is used to jitter the collection by a random amount.
Each plugin will sleep for a random time within jitter before collecting.
This can be used to avoid many plugins querying things like sysfs at the
same time, which can have a measurable effect on the system.
- **flush_interval**:
Default flushing interval for all outputs. Maximum flush_interval will be
flush_interval + flush_jitter
- **flush_jitter**:
Jitter the flush interval by a random amount. This is primarily to avoid
large write spikes for users running a large number of telegraf instances.
ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
- **precision**:
Collected metrics are rounded to the precision specified as a duration.
Precision will NOT be used for service inputs. It is up to each individual
service input to set the timestamp at the appropriate precision.
- **debug**:
Run telegraf with debug log messages.
- **quiet**:
Run telegraf in quiet mode (error log messages only).
- **logfile**:
Specify the log file name. The empty string means to log to stderr.
- **hostname**:
Override default hostname, if empty use os.Hostname()
- **omit_hostname**:
If set to true, do no set the "host" tag in the telegraf agent.
### Plugins
Telegraf plugins are divided into 4 types: [inputs][], [outputs][],
[processors][], and [aggregators][].
Unlike the `global_tags` and `agent` tables, any plugin can be defined
multiple times and each instance will run independantly. This allows you to
have plugins defined with differing configurations as needed within a single
Telegraf process.
Each plugin has a unique set of configuration options, reference the
sample configuration for details. Additionally, several options are available
on any plugin depending on its type.
### Input Plugins
Input plugins gather and create metrics. They support both polling and event
driven operation.
Parameters that can be used with any input plugin:
- **interval**: How often to gather this metric. Normal plugins use a single
global interval, but if one particular input should be run less or more
often, you can configure that here.
- **name_override**: Override the base name of the measurement. (Default is
the name of the input).
- **name_prefix**: Specifies a prefix to attach to the measurement name.
- **name_suffix**: Specifies a suffix to attach to the measurement name.
- **tags**: A map of tags to apply to a specific input's measurements.
The [metric filtering][] parameters can be used to limit what metrics are
emitted from the input plugin.
### Output Configuration
#### Examples
Use the name_suffix parameter to emit measurements with the name `cpu_total`:
```toml
[[inputs.cpu]]
name_suffix = "_total"
percpu = false
totalcpu = true
```
Use the name_override parameter to emit measurements with the name `foobar`:
```toml
[[inputs.cpu]]
name_override = "foobar"
percpu = false
totalcpu = true
```
Emit measurements with two additional tags: `tag1=foo` and `tag2=bar`
> **NOTE**: With TOML, order matters. Parameters belong to the last defined
> table header, place `[inputs.cpu.tags]` table at the _end_ of the plugin
> definition.
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[inputs.cpu.tags]
tag1 = "foo"
tag2 = "bar"
```
Utilize `name_override`, `name_prefix`, or `name_suffix` config options to
avoid measurement collisions when defining multiple plugins:
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[[inputs.cpu]]
percpu = true
totalcpu = false
name_override = "percpu_usage"
fielddrop = ["cpu_time*"]
```
### Output Plugins
Output plugins write metrics to a location. Outputs commonly write to
databases, network services, and messaging systems.
Parameters that can be used with any output plugin:
- **flush_interval**: The maximum time between flushes. Use this setting to
override the agent `flush_interval` on a per plugin basis.
@ -150,42 +244,121 @@ emitted from the input plugin.
Use this setting to override the agent `metric_buffer_limit` on a per plugin
basis.
The [metric filtering](#metric-filtering) parameters can be used to limit what metrics are
The [metric filtering][] parameters can be used to limit what metrics are
emitted from the output plugin.
### Aggregator Configuration
#### Examples
The following config parameters are available for all aggregators:
Override flush parameters for a single output:
```toml
[agent]
flush_interval = "10s"
metric_batch_size = 1000
* **period**: The period on which to flush & clear each aggregator. All metrics
that are sent with timestamps outside of this period will be ignored by the
aggregator.
* **delay**: The delay before each aggregator is flushed. This is to control
how long for aggregators to wait before receiving metrics from input plugins,
in the case that aggregators are flushing and inputs are gathering on the
same interval.
* **drop_original**: If true, the original metric will be dropped by the
aggregator and will not get sent to the output plugins.
* **name_override**: Override the base name of the measurement.
(Default is the name of the input).
* **name_prefix**: Specifies a prefix to attach to the measurement name.
* **name_suffix**: Specifies a suffix to attach to the measurement name.
* **tags**: A map of tags to apply to a specific input's measurements.
[[outputs.influxdb]]
urls = [ "http://example.org:8086" ]
database = "telegraf"
The [metric filtering](#metric-filtering) parameters can be used to limit what metrics are
[[outputs.file]]
files = [ "stdout" ]
flush_interval = "1s"
metric_batch_size = 10
```
### Processor Plugins
Processor plugins perform processing tasks on metrics and are commonly used to
rename or apply transformations to metrics. Processors are applied after the
input plugins and before any aggregator plugins.
Parameters that can be used with any processor plugin:
- **order**: The order in which the processor(s) are executed. If this is not
specified then processor execution order will be random.
The [metric filtering][] parameters can be used to limit what metrics are
handled by the processor. Excluded metrics are passed downstream to the next
processor.
#### Examples
If the order processors are applied matters you must set order on all involved
processors:
```toml
[[processors.rename]]
order = 1
[[processors.rename.replace]]
tag = "path"
dest = "resource"
[[processors.strings]]
order = 2
[[processors.strings.trim_prefix]]
tag = "resource"
prefix = "/api/"
```
### Aggregator Plugins
Aggregator plugins produce new metrics after examining metrics over a time
period, as the name suggests they are commonly used to produce new aggregates
such as mean/max/min metrics. Aggregators operate on metrics after any
processors have been applied.
Parameters that can be used with any aggregator plugin:
- **period**: The period on which to flush & clear each aggregator. All
metrics that are sent with timestamps outside of this period will be ignored
by the aggregator.
- **delay**: The delay before each aggregator is flushed. This is to control
how long for aggregators to wait before receiving metrics from input
plugins, in the case that aggregators are flushing and inputs are gathering
on the same interval.
- **drop_original**: If true, the original metric will be dropped by the
aggregator and will not get sent to the output plugins.
- **name_override**: Override the base name of the measurement. (Default is
the name of the input).
- **name_prefix**: Specifies a prefix to attach to the measurement name.
- **name_suffix**: Specifies a suffix to attach to the measurement name.
- **tags**: A map of tags to apply to a specific input's measurements.
The [metric filtering][] parameters can be used to limit what metrics are
handled by the aggregator. Excluded metrics are passed downstream to the next
aggregator.
### Processor Configuration
#### Examples
The following config parameters are available for all processors:
Collect and emit the min/max of the system load1 metric every 30s, dropping
the originals.
```toml
[[inputs.system]]
fieldpass = ["load1"] # collects system load1 metric.
* **order**: This is the order in which the processor(s) get executed. If this
is not specified then processor execution order will be random.
[[aggregators.minmax]]
period = "30s" # send & clear the aggregate every 30s.
drop_original = true # drop the original metrics.
The [metric filtering](#metric-filtering) parameters can be used to limit what metrics are
handled by the processor. Excluded metrics are passed downstream to the next
processor.
[[outputs.file]]
files = ["stdout"]
```
Collect and emit the min/max of the swap metrics every 30s, dropping the
originals. The aggregator will not be applied to the system load metrics due
to the `namepass` parameter.
```toml
[[inputs.swap]]
[[inputs.system]]
fieldpass = ["load1"] # collects system load1 metric.
[[aggregators.minmax]]
period = "30s" # send & clear the aggregate every 30s.
drop_original = true # drop the original metrics.
namepass = ["swap"] # only "pass" swap metrics through the aggregator.
[[outputs.file]]
files = ["stdout"]
```
<a id="measurement-filtering"></a>
### Metric Filtering
@ -244,39 +417,9 @@ The inverse of `taginclude`. Tags with a tag key matching one of the patterns
will be discarded from the metric. Any tag can be filtered including global
tags and the agent `host` tag.
### Input Configuration Examples
This is a full working config that will output CPU data to an InfluxDB instance
at 192.168.59.103:8086, tagging measurements with dc="denver-1". It will output
measurements at a 10s interval and will collect per-cpu data, dropping any
fields which begin with `time_`.
```toml
[global_tags]
dc = "denver-1"
[agent]
interval = "10s"
# OUTPUTS
[[outputs.influxdb]]
url = "http://192.168.59.103:8086" # required.
database = "telegraf" # required.
# INPUTS
[[inputs.cpu]]
percpu = true
totalcpu = false
# filter all fields beginning with 'time_'
fielddrop = ["time_*"]
```
#### Input Config: tagpass and tagdrop
**NOTE** `tagpass` and `tagdrop` parameters must be defined at the _end_ of
the plugin definition, otherwise subsequent plugin config options will be
interpreted as part of the tagpass/tagdrop map.
##### Filtering Examples
Using tagpass and tagdrop:
```toml
[[inputs.cpu]]
percpu = true
@ -294,8 +437,7 @@ interpreted as part of the tagpass/tagdrop map.
fstype = [ "ext4", "xfs" ]
# Globs can also be used on the tag values
path = [ "/opt", "/home*" ]
[[inputs.win_perf_counters]]
[[inputs.win_perf_counters.object]]
ObjectName = "Network Interface"
@ -308,11 +450,9 @@ interpreted as part of the tagpass/tagdrop map.
# Don't send metrics where the Windows interface name (instance) begins with isatap or Local
[inputs.win_perf_counters.tagdrop]
instance = ["isatap*", "Local*"]
```
#### Input Config: fieldpass and fielddrop
Using fieldpass and fielddrop:
```toml
# Drop all metrics for guest & steal CPU usage
[[inputs.cpu]]
@ -325,8 +465,7 @@ interpreted as part of the tagpass/tagdrop map.
fieldpass = ["inodes*"]
```
#### Input Config: namepass and namedrop
Using namepass and namedrop:
```toml
# Drop all metrics about containers for kubelet
[[inputs.prometheus]]
@ -339,8 +478,7 @@ interpreted as part of the tagpass/tagdrop map.
namepass = ["rest_client_*"]
```
#### Input Config: taginclude and tagexclude
Using taginclude and tagexclude:
```toml
# Only include the "cpu" tag in the measurements for the cpu plugin.
[[inputs.cpu]]
@ -353,64 +491,7 @@ interpreted as part of the tagpass/tagdrop map.
tagexclude = ["fstype"]
```
#### Input config: prefix, suffix, and override
This plugin will emit measurements with the name `cpu_total`
```toml
[[inputs.cpu]]
name_suffix = "_total"
percpu = false
totalcpu = true
```
This will emit measurements with the name `foobar`
```toml
[[inputs.cpu]]
name_override = "foobar"
percpu = false
totalcpu = true
```
#### Input config: tags
This plugin will emit measurements with two additional tags: `tag1=foo` and
`tag2=bar`
NOTE: Order matters, the `[inputs.cpu.tags]` table must be at the _end_ of the
plugin definition.
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[inputs.cpu.tags]
tag1 = "foo"
tag2 = "bar"
```
#### Multiple inputs of the same type
Additional inputs (or outputs) of the same type can be specified,
just define more instances in the config file. It is highly recommended that
you utilize `name_override`, `name_prefix`, or `name_suffix` config options
to avoid measurement collisions:
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[[inputs.cpu]]
percpu = true
totalcpu = false
name_override = "percpu_usage"
fielddrop = ["cpu_time*"]
```
#### Output Configuration Examples:
Metrics can be routed to different outputs using the metric name and tags:
```toml
[[outputs.influxdb]]
urls = [ "http://localhost:8086" ]
@ -432,50 +513,35 @@ to avoid measurement collisions:
cpu = ["cpu0"]
```
#### Aggregator Configuration Examples:
This will collect and emit the min/max of the system load1 metric every
30s, dropping the originals.
Routing metrics to different outputs based on the input. Metrics are tagged
with `influxdb_database` in the input, which is then used to select the
output. The tag is removed in the outputs before writing.
```toml
[[inputs.system]]
fieldpass = ["load1"] # collects system load1 metric.
[[outputs.influxdb]]
urls = ["http://influxdb.example.com"]
database = "db_default"
[outputs.influxdb.tagdrop]
influxdb_database = ["*"]
[[aggregators.minmax]]
period = "30s" # send & clear the aggregate every 30s.
drop_original = true # drop the original metrics.
[[outputs.influxdb]]
urls = ["http://influxdb.example.com"]
database = "db_other"
tagexclude = ["influxdb_database"]
[ouputs.influxdb.tagpass]
influxdb_database = ["other"]
[[outputs.file]]
files = ["stdout"]
[[inputs.disk]]
[inputs.disk.tags]
influxdb_database = "other"
```
This will collect and emit the min/max of the swap metrics every
30s, dropping the originals. The aggregator will not be applied
to the system load metrics due to the `namepass` parameter.
```toml
[[inputs.swap]]
[[inputs.system]]
fieldpass = ["load1"] # collects system load1 metric.
[[aggregators.minmax]]
period = "30s" # send & clear the aggregate every 30s.
drop_original = true # drop the original metrics.
namepass = ["swap"] # only "pass" swap metrics through the aggregator.
[[outputs.file]]
files = ["stdout"]
```
#### Processor Configuration Examples:
Print only the metrics with `cpu` as the measurement name, all metrics are
passed to the output:
```toml
[[processors.printer]]
namepass = "cpu"
[[outputs.file]]
files = ["/tmp/metrics.out"]
```
[TOML]: https://github.com/toml-lang/toml#toml
[global tags]: #global-tags
[agent]: #agent
[plugins]: #plugins
[inputs]: #input-plugins
[outputs]: #output-plugins
[processors]: #processor-plugins
[aggregators]: #aggregator-plugins
[metric filtering]: #metric-filtering
[telegraf.conf]: /etc/telegraf.conf