telegraf/docs/CONFIGURATION.md

588 lines
18 KiB
Markdown

# Configuration
Telegraf's configuration file is written using [TOML][] and is composed of
three sections: [global tags][], [agent][] settings, and [plugins][].
View the default [telegraf.conf][] config file with all available plugins.
### Generating a Configuration File
A default config file can be generated by telegraf:
```sh
telegraf config > telegraf.conf
```
To generate a file with specific inputs and outputs, you can use the
--input-filter and --output-filter flags:
```sh
telegraf --input-filter cpu:mem:net:swap --output-filter influxdb:kafka config
```
### Configuration Loading
The location of the configuration file can be set via the `--config` command
line flag.
When the `--config-directory` command line flag is used files ending with
`.conf` in the specified directory will also be included in the Telegraf
configuration.
On most systems, the default locations are `/etc/telegraf/telegraf.conf` for
the main configuration file and `/etc/telegraf/telegraf.d` for the directory of
configuration files.
### Environment Variables
Environment variables can be used anywhere in the config file, simply surround
them with `${}`. Replacement occurs before file parsing. For strings
the variable must be within quotes, e.g., `"${STR_VAR}"`, for numbers and booleans
they should be unquoted, e.g., `${INT_VAR}`, `${BOOL_VAR}`.
When using the `.deb` or `.rpm` packages, you can define environment variables
in the `/etc/default/telegraf` file.
**Example**:
`/etc/default/telegraf`:
```
USER="alice"
INFLUX_URL="http://localhost:8086"
INFLUX_SKIP_DATABASE_CREATION="true"
INFLUX_PASSWORD="monkey123"
```
`/etc/telegraf.conf`:
```toml
[global_tags]
user = "${USER}"
[[inputs.mem]]
[[outputs.influxdb]]
urls = ["${INFLUX_URL}"]
skip_database_creation = ${INFLUX_SKIP_DATABASE_CREATION}
password = "${INFLUX_PASSWORD}"
```
The above files will produce the following effective configuration file to be
parsed:
```toml
[global_tags]
user = "alice"
[[outputs.influxdb]]
urls = "http://localhost:8086"
skip_database_creation = true
password = "monkey123"
```
### Intervals
Intervals are durations of time and can be specified for supporting settings by
combining an integer value and time unit as a string value. Valid time units are
`ns`, `us` (or `µs`), `ms`, `s`, `m`, `h`.
```toml
[agent]
interval = "10s"
```
### Global Tags
Global tags can be specified in the `[global_tags]` table in key="value"
format. All metrics that are gathered will be tagged with the tags specified.
```toml
[global_tags]
dc = "us-east-1"
```
### Agent
The agent table configures Telegraf and the defaults used across all plugins.
- **interval**: Default data collection [interval][] for all inputs.
- **round_interval**: Rounds collection interval to [interval][]
ie, if interval="10s" then always collect on :00, :10, :20, etc.
- **metric_batch_size**:
Telegraf will send metrics to outputs in batches of at most
metric_batch_size metrics.
This controls the size of writes that Telegraf sends to output plugins.
- **metric_buffer_limit**:
Maximum number of unwritten metrics per output. Increasing this value
allows for longer periods of output downtime without dropping metrics at the
cost of higher maximum memory usage.
- **collection_jitter**:
Collection jitter is used to jitter the collection by a random [interval][].
Each plugin will sleep for a random time within jitter before collecting.
This can be used to avoid many plugins querying things like sysfs at the
same time, which can have a measurable effect on the system.
- **flush_interval**:
Default flushing [interval][] for all outputs. Maximum flush_interval will be
flush_interval + flush_jitter.
- **flush_jitter**:
Default flush jitter for all outputs. This jitters the flush [interval][]
by a random amount. This is primarily to avoid large write spikes for users
running a large number of telegraf instances. ie, a jitter of 5s and interval
10s means flushes will happen every 10-15s.
- **precision**:
Collected metrics are rounded to the precision specified as an [interval][].
Precision will NOT be used for service inputs. It is up to each individual
service input to set the timestamp at the appropriate precision.
- **debug**:
Log at debug level.
- **quiet**:
Log only error level messages.
- **logtarget**:
Log target controls the destination for logs and can be one of "file",
"stderr" or, on Windows, "eventlog". When set to "file", the output file is
determined by the "logfile" setting.
- **logfile**:
Name of the file to be logged to when using the "file" logtarget. If set to
the empty string then logs are written to stderr.
- **logfile_rotation_interval**:
The logfile will be rotated after the time interval specified. When set to
0 no time based rotation is performed.
- **logfile_rotation_max_size**:
The logfile will be rotated when it becomes larger than the specified size.
When set to 0 no size based rotation is performed.
- **logfile_rotation_max_archives**:
Maximum number of rotated archives to keep, any older logs are deleted. If
set to -1, no archives are removed.
- **hostname**:
Override default hostname, if empty use os.Hostname()
- **omit_hostname**:
If set to true, do no set the "host" tag in the telegraf agent.
### Plugins
Telegraf plugins are divided into 4 types: [inputs][], [outputs][],
[processors][], and [aggregators][].
Unlike the `global_tags` and `agent` tables, any plugin can be defined
multiple times and each instance will run independantly. This allows you to
have plugins defined with differing configurations as needed within a single
Telegraf process.
Each plugin has a unique set of configuration options, reference the
sample configuration for details. Additionally, several options are available
on any plugin depending on its type.
### Input Plugins
Input plugins gather and create metrics. They support both polling and event
driven operation.
Parameters that can be used with any input plugin:
- **alias**: Name an instance of a plugin.
- **interval**: How often to gather this metric. Normal plugins use a single
global interval, but if one particular input should be run less or more
often, you can configure that here.
- **name_override**: Override the base name of the measurement. (Default is
the name of the input).
- **name_prefix**: Specifies a prefix to attach to the measurement name.
- **name_suffix**: Specifies a suffix to attach to the measurement name.
- **tags**: A map of tags to apply to a specific input's measurements.
The [metric filtering][] parameters can be used to limit what metrics are
emitted from the input plugin.
#### Examples
Use the name_suffix parameter to emit measurements with the name `cpu_total`:
```toml
[[inputs.cpu]]
name_suffix = "_total"
percpu = false
totalcpu = true
```
Use the name_override parameter to emit measurements with the name `foobar`:
```toml
[[inputs.cpu]]
name_override = "foobar"
percpu = false
totalcpu = true
```
Emit measurements with two additional tags: `tag1=foo` and `tag2=bar`
> **NOTE**: With TOML, order matters. Parameters belong to the last defined
> table header, place `[inputs.cpu.tags]` table at the _end_ of the plugin
> definition.
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[inputs.cpu.tags]
tag1 = "foo"
tag2 = "bar"
```
Utilize `name_override`, `name_prefix`, or `name_suffix` config options to
avoid measurement collisions when defining multiple plugins:
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[[inputs.cpu]]
percpu = true
totalcpu = false
name_override = "percpu_usage"
fielddrop = ["cpu_time*"]
```
### Output Plugins
Output plugins write metrics to a location. Outputs commonly write to
databases, network services, and messaging systems.
Parameters that can be used with any output plugin:
- **alias**: Name an instance of a plugin.
- **flush_interval**: The maximum time between flushes. Use this setting to
override the agent `flush_interval` on a per plugin basis.
- **flush_jitter**: The amount of time to jitter the flush interval. Use this
setting to override the agent `flush_jitter` on a per plugin basis.
- **metric_batch_size**: The maximum number of metrics to send at once. Use
this setting to override the agent `metric_batch_size` on a per plugin basis.
- **metric_buffer_limit**: The maximum number of unsent metrics to buffer.
Use this setting to override the agent `metric_buffer_limit` on a per plugin
basis.
The [metric filtering][] parameters can be used to limit what metrics are
emitted from the output plugin.
#### Examples
Override flush parameters for a single output:
```toml
[agent]
flush_interval = "10s"
flush_jitter = "5s"
metric_batch_size = 1000
[[outputs.influxdb]]
urls = [ "http://example.org:8086" ]
database = "telegraf"
[[outputs.file]]
files = [ "stdout" ]
flush_interval = "1s"
flush_jitter = "1s"
metric_batch_size = 10
```
### Processor Plugins
Processor plugins perform processing tasks on metrics and are commonly used to
rename or apply transformations to metrics. Processors are applied after the
input plugins and before any aggregator plugins.
Parameters that can be used with any processor plugin:
- **alias**: Name an instance of a plugin.
- **order**: The order in which the processor(s) are executed. If this is not
specified then processor execution order will be random.
The [metric filtering][] parameters can be used to limit what metrics are
handled by the processor. Excluded metrics are passed downstream to the next
processor.
#### Examples
If the order processors are applied matters you must set order on all involved
processors:
```toml
[[processors.rename]]
order = 1
[[processors.rename.replace]]
tag = "path"
dest = "resource"
[[processors.strings]]
order = 2
[[processors.strings.trim_prefix]]
tag = "resource"
prefix = "/api/"
```
### Aggregator Plugins
Aggregator plugins produce new metrics after examining metrics over a time
period, as the name suggests they are commonly used to produce new aggregates
such as mean/max/min metrics. Aggregators operate on metrics after any
processors have been applied.
Parameters that can be used with any aggregator plugin:
- **alias**: Name an instance of a plugin.
- **period**: The period on which to flush & clear each aggregator. All
metrics that are sent with timestamps outside of this period will be ignored
by the aggregator.
- **delay**: The delay before each aggregator is flushed. This is to control
how long for aggregators to wait before receiving metrics from input
plugins, in the case that aggregators are flushing and inputs are gathering
on the same interval.
- **grace**: The duration when the metrics will still be aggregated
by the plugin, even though they're outside of the aggregation period. This
is needed in a situation when the agent is expected to receive late metrics
and it's acceptable to roll them up into next aggregation period.
- **drop_original**: If true, the original metric will be dropped by the
aggregator and will not get sent to the output plugins.
- **name_override**: Override the base name of the measurement. (Default is
the name of the input).
- **name_prefix**: Specifies a prefix to attach to the measurement name.
- **name_suffix**: Specifies a suffix to attach to the measurement name.
- **tags**: A map of tags to apply to a specific input's measurements.
The [metric filtering][] parameters can be used to limit what metrics are
handled by the aggregator. Excluded metrics are passed downstream to the next
aggregator.
#### Examples
Collect and emit the min/max of the system load1 metric every 30s, dropping
the originals.
```toml
[[inputs.system]]
fieldpass = ["load1"] # collects system load1 metric.
[[aggregators.minmax]]
period = "30s" # send & clear the aggregate every 30s.
drop_original = true # drop the original metrics.
[[outputs.file]]
files = ["stdout"]
```
Collect and emit the min/max of the swap metrics every 30s, dropping the
originals. The aggregator will not be applied to the system load metrics due
to the `namepass` parameter.
```toml
[[inputs.swap]]
[[inputs.system]]
fieldpass = ["load1"] # collects system load1 metric.
[[aggregators.minmax]]
period = "30s" # send & clear the aggregate every 30s.
drop_original = true # drop the original metrics.
namepass = ["swap"] # only "pass" swap metrics through the aggregator.
[[outputs.file]]
files = ["stdout"]
```
<a id="measurement-filtering"></a>
### Metric Filtering
Metric filtering can be configured per plugin on any input, output, processor,
and aggregator plugin. Filters fall under two categories: Selectors and
Modifiers.
#### Selectors
Selector filters include or exclude entire metrics. When a metric is excluded
from a Input or an Output plugin, the metric is dropped. If a metric is
excluded from a Processor or Aggregator plugin, it is skips the plugin and is
sent onwards to the next stage of processing.
- **namepass**:
An array of glob pattern strings. Only metrics whose measurement name matches
a pattern in this list are emitted.
- **namedrop**:
The inverse of `namepass`. If a match is found the metric is discarded. This
is tested on metrics after they have passed the `namepass` test.
- **tagpass**:
A table mapping tag keys to arrays of glob pattern strings. Only metrics
that contain a tag key in the table and a tag value matching one of its
patterns is emitted.
- **tagdrop**:
The inverse of `tagpass`. If a match is found the metric is discarded. This
is tested on metrics after they have passed the `tagpass` test.
#### Modifiers
Modifier filters remove tags and fields from a metric. If all fields are
removed the metric is removed.
- **fieldpass**:
An array of glob pattern strings. Only fields whose field key matches a
pattern in this list are emitted.
- **fielddrop**:
The inverse of `fieldpass`. Fields with a field key matching one of the
patterns will be discarded from the metric. This is tested on metrics after
they have passed the `fieldpass` test.
- **taginclude**:
An array of glob pattern strings. Only tags with a tag key matching one of
the patterns are emitted. In contrast to `tagpass`, which will pass an entire
metric based on its tag, `taginclude` removes all non matching tags from the
metric. Any tag can be filtered including global tags and the agent `host`
tag.
- **tagexclude**:
The inverse of `taginclude`. Tags with a tag key matching one of the patterns
will be discarded from the metric. Any tag can be filtered including global
tags and the agent `host` tag.
##### Filtering Examples
Using tagpass and tagdrop:
```toml
[[inputs.cpu]]
percpu = true
totalcpu = false
fielddrop = ["cpu_time"]
# Don't collect CPU data for cpu6 & cpu7
[inputs.cpu.tagdrop]
cpu = [ "cpu6", "cpu7" ]
[[inputs.disk]]
[inputs.disk.tagpass]
# tagpass conditions are OR, not AND.
# If the (filesystem is ext4 or xfs) OR (the path is /opt or /home)
# then the metric passes
fstype = [ "ext4", "xfs" ]
# Globs can also be used on the tag values
path = [ "/opt", "/home*" ]
[[inputs.win_perf_counters]]
[[inputs.win_perf_counters.object]]
ObjectName = "Network Interface"
Instances = ["*"]
Counters = [
"Bytes Received/sec",
"Bytes Sent/sec"
]
Measurement = "win_net"
# Don't send metrics where the Windows interface name (instance) begins with isatap or Local
[inputs.win_perf_counters.tagdrop]
instance = ["isatap*", "Local*"]
```
Using fieldpass and fielddrop:
```toml
# Drop all metrics for guest & steal CPU usage
[[inputs.cpu]]
percpu = false
totalcpu = true
fielddrop = ["usage_guest", "usage_steal"]
# Only store inode related metrics for disks
[[inputs.disk]]
fieldpass = ["inodes*"]
```
Using namepass and namedrop:
```toml
# Drop all metrics about containers for kubelet
[[inputs.prometheus]]
urls = ["http://kube-node-1:4194/metrics"]
namedrop = ["container_*"]
# Only store rest client related metrics for kubelet
[[inputs.prometheus]]
urls = ["http://kube-node-1:4194/metrics"]
namepass = ["rest_client_*"]
```
Using taginclude and tagexclude:
```toml
# Only include the "cpu" tag in the measurements for the cpu plugin.
[[inputs.cpu]]
percpu = true
totalcpu = true
taginclude = ["cpu"]
# Exclude the "fstype" tag from the measurements for the disk plugin.
[[inputs.disk]]
tagexclude = ["fstype"]
```
Metrics can be routed to different outputs using the metric name and tags:
```toml
[[outputs.influxdb]]
urls = [ "http://localhost:8086" ]
database = "telegraf"
# Drop all measurements that start with "aerospike"
namedrop = ["aerospike*"]
[[outputs.influxdb]]
urls = [ "http://localhost:8086" ]
database = "telegraf-aerospike-data"
# Only accept aerospike data:
namepass = ["aerospike*"]
[[outputs.influxdb]]
urls = [ "http://localhost:8086" ]
database = "telegraf-cpu0-data"
# Only store measurements where the tag "cpu" matches the value "cpu0"
[outputs.influxdb.tagpass]
cpu = ["cpu0"]
```
Routing metrics to different outputs based on the input. Metrics are tagged
with `influxdb_database` in the input, which is then used to select the
output. The tag is removed in the outputs before writing.
```toml
[[outputs.influxdb]]
urls = ["http://influxdb.example.com"]
database = "db_default"
[outputs.influxdb.tagdrop]
influxdb_database = ["*"]
[[outputs.influxdb]]
urls = ["http://influxdb.example.com"]
database = "db_other"
tagexclude = ["influxdb_database"]
[outputs.influxdb.tagpass]
influxdb_database = ["other"]
[[inputs.disk]]
[inputs.disk.tags]
influxdb_database = "other"
```
### Transport Layer Security (TLS)
Reference the detailed [TLS][] documentation.
[TOML]: https://github.com/toml-lang/toml#toml
[global tags]: #global-tags
[interval]: #intervals
[agent]: #agent
[plugins]: #plugins
[inputs]: #input-plugins
[outputs]: #output-plugins
[processors]: #processor-plugins
[aggregators]: #aggregator-plugins
[metric filtering]: #metric-filtering
[telegraf.conf]: /etc/telegraf.conf
[TLS]: /docs/TLS.md