# Telegraf - A native agent for InfluxDB [![Circle CI](https://circleci.com/gh/influxdb/telegraf.svg?style=svg)](https://circleci.com/gh/influxdb/telegraf)

Telegraf is an agent written in Go for collecting metrics from the system it's
running on or from other services and writing them into InfluxDB.

Design goals are to have a minimal memory footprint with a plugin system so
that developers in the community can easily add support for collecting metrics
from well-known services (like Hadoop, Postgres, or Redis) and third-party
APIs (like Mailchimp, AWS CloudWatch, or Google Analytics).

We'll eagerly accept pull requests for new plugins and will manage the set of
plugins that Telegraf supports. See the Plugins section below for instructions
on writing new plugins.

## Quickstart

* Build from source or download telegraf:

### Linux packages for Debian/Ubuntu and RHEL/CentOS:

```
http://get.influxdb.org/telegraf/telegraf_0.1.4_amd64.deb
http://get.influxdb.org/telegraf/telegraf-0.1.4-1.x86_64.rpm
```

### OSX via Homebrew:
```
brew update
brew install telegraf
```
### How to use it:
* Run `telegraf -sample-config > telegraf.toml` to create an initial configuration
* Edit the configuration to match your needs
* Run `telegraf -config telegraf.toml -test` to output one full measurement sample to STDOUT
* Run `telegraf -config telegraf.toml` to gather and send metrics to InfluxDB
## Telegraf Options

Telegraf has a few options you can configure under the `agent` section of the
config. If you don't see an `agent` section, run
`telegraf -sample-config > telegraf.toml` to create a valid initial
configuration:

* **hostname**: The hostname is passed as a tag. By default this will be
the value returned by `hostname` on the machine running Telegraf.
You can override that value here.
* **interval**: How often to gather metrics. Uses a simple number +
unit parser, e.g. "10s" for 10 seconds or "5m" for 5 minutes.
* **debug**: Set to true to gather and send metrics to STDOUT as well as
InfluxDB.
## Supported Plugins
Telegraf currently has support for collecting metrics from:

* System (memory, CPU, network, etc.)
* Docker
* MySQL
* Prometheus (client libraries and exporters)
* PostgreSQL
* Redis
* Elasticsearch
* RethinkDB
* Kafka
* MongoDB
* Disque
* Lustre2
* Memcached

We'll be adding support for many more over the coming months. Read on if you
want to add support for another service or third-party API.

## Plugin Options
Three configuration options can be set per plugin:

* **pass**: An array of strings used to filter the metrics generated by the
current plugin. Each string in the array is tested as a prefix against metric
names, and if it matches, the metric is emitted.
* **drop**: The inverse of pass: if a metric matches, it is not emitted.
* **interval**: How often to gather this metric. Normal plugins use a single
global interval, but if one particular plugin should be run less or more often,
you can configure that here.

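
The prefix matching behind `pass` and `drop` can be sketched roughly as follows. The `filter` type and its `shouldEmit` method are hypothetical names, and letting `pass` take precedence when both lists are set is an assumption, not something this README specifies.

```go
package main

import (
	"fmt"
	"strings"
)

// filter holds the per-plugin pass and drop prefix lists.
// Hypothetical type: it only illustrates the rules described above.
type filter struct {
	Pass []string
	Drop []string
}

// hasAnyPrefix reports whether name starts with at least one of the prefixes.
func hasAnyPrefix(name string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

// shouldEmit decides whether a metric with the given name is emitted.
// With a pass list, only matching metrics are emitted. Otherwise, with a
// drop list, matching metrics are suppressed. With neither, everything is
// emitted. Assumption: pass wins when both lists are configured.
func (f filter) shouldEmit(name string) bool {
	if len(f.Pass) > 0 {
		return hasAnyPrefix(name, f.Pass)
	}
	if len(f.Drop) > 0 {
		return !hasAnyPrefix(name, f.Drop)
	}
	return true
}

func main() {
	f := filter{Pass: []string{"cpu", "mem"}}
	for _, name := range []string{"cpu_user", "mem_free", "disk_used"} {
		fmt.Printf("%s emitted: %v\n", name, f.shouldEmit(name))
	}
}
```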
## Plugins
This section is for developers who want to create new collection plugins.
Telegraf is entirely plugin driven. This interface allows operators to
pick and choose what is gathered, and makes it easy for developers
to create new ways of generating metrics.

Plugin authorship is kept as simple as possible to encourage people to develop
and submit new plugins.

## Guidelines
* A plugin must conform to the `plugins.Plugin` interface.
* Telegraf promises to run each plugin's Gather function serially. This means
developers don't have to worry about thread safety within these functions.
* Each generated metric automatically has the name of the plugin that generated
it prepended. This is to keep plugins honest.
* Plugins should call `plugins.Add` in their `init` function to register themselves.
See below for a quick example.
* To be available within Telegraf itself, plugins must add themselves to the
`github.com/influxdb/telegraf/plugins/all/all.go` file.
* The `SampleConfig` function should return valid TOML that describes how the
plugin can be configured. This is included in the output of `telegraf -sample-config`.
* The `Description` function should say in one line what this plugin does.

### Plugin interface
```go
type Plugin interface {
	SampleConfig() string
	Description() string
	Gather(Accumulator) error
}

type Accumulator interface {
	Add(measurement string, value interface{}, tags map[string]string)
	AddValuesWithTime(measurement string,
		values map[string]interface{},
		tags map[string]string,
		timestamp time.Time)
}
```
### Accumulator
The way that a plugin emits metrics is by interacting with the Accumulator.

The `Add` function takes 3 arguments:

* **measurement**: A string description of the metric. For instance `bytes_read` or `faults`.
* **value**: A value for the metric. This accepts 5 different types of value:
  * **int**: The most common type. All int types are accepted, but favor using `int64`.
  Useful for counters, etc.
  * **float**: Favor `float64`, useful for gauges, percentages, etc.
  * **bool**: `true` or `false`, useful to indicate the presence of a state. `light_on`, etc.
  * **string**: Typically used to indicate a message, or some kind of freeform information.
  * **time.Time**: Useful for indicating when a state last occurred, for instance `light_on_since`.
* **tags**: This is a map of strings to strings to describe the where or who
about the metric. For instance, the `net` plugin adds a tag named `"interface"`
set to the name of the network interface, like `"eth0"`.

The `AddValuesWithTime` function allows multiple values for a point to be passed.
Each value accepts the same types as **value** above. The **timestamp** argument
allows a point to be registered as having occurred at an arbitrary time.

Let's say you've written a plugin that emits metrics about processes on the current host.

```go
type Process struct {
	CPUTime     float64
	MemoryBytes int64
	PID         int
}

// Gather emits one cpu and one memory metric per process, tagged with its PID.
// system.Processes() stands in for however the plugin lists processes.
func Gather(acc plugins.Accumulator) error {
	for _, process := range system.Processes() {
		tags := map[string]string{
			"pid": fmt.Sprintf("%d", process.PID),
		}

		acc.Add("cpu", process.CPUTime, tags)
		acc.Add("memory", process.MemoryBytes, tags)
	}

	return nil
}
```
### Example
```go
package simple

// simple.go

import "github.com/influxdb/telegraf/plugins"

type Simple struct {
	Ok bool
}

func (s *Simple) Description() string {
	return "a demo plugin"
}

func (s *Simple) SampleConfig() string {
	return "ok = true # indicate if everything is fine"
}

func (s *Simple) Gather(acc plugins.Accumulator) error {
	if s.Ok {
		acc.Add("state", "pretty good", nil)
	} else {
		acc.Add("state", "not great", nil)
	}

	return nil
}

func init() {
	plugins.Add("simple", func() plugins.Plugin { return &Simple{} })
}
```
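
The `plugins` package itself is not reproduced in this README. One plausible shape for the registry behind `plugins.Add`, assuming it is a map from plugin name to constructor, is sketched below. `registry`, `Creator`, and `sampleConfig` are illustrative names, `Plugin` is cut down to two methods, and the output layout is invented for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Plugin is a cut-down stand-in for plugins.Plugin; Gather is omitted
// because the registry does not need it.
type Plugin interface {
	SampleConfig() string
	Description() string
}

// Creator builds a fresh plugin instance (hypothetical type name).
type Creator func() Plugin

// registry maps plugin names to their creators (hypothetical variable).
var registry = map[string]Creator{}

// Add registers a creator under the given name. It is meant to be called
// from a plugin's init function, as the guidelines describe.
func Add(name string, creator Creator) {
	registry[name] = creator
}

// Simple repeats the demo plugin from the example above.
type Simple struct {
	Ok bool
}

func (s *Simple) Description() string {
	return "a demo plugin"
}

func (s *Simple) SampleConfig() string {
	return "ok = true # indicate if everything is fine"
}

// init runs when the package is loaded, so importing the package is
// enough to make the plugin known to the registry.
func init() {
	Add("simple", func() Plugin { return &Simple{} })
}

// sampleConfig renders one section per registered plugin, in name order.
// The layout is made up for this sketch.
func sampleConfig() string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		p := registry[name]()
		fmt.Fprintf(&b, "# %s\n[%s]\n%s\n", p.Description(), name, p.SampleConfig())
	}
	return b.String()
}

func main() {
	fmt.Print(sampleConfig())
}
```

Under this assumption, the role of `all.go` would be to import every plugin package for its side effects, so that each `init` function runs and registers its plugin.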
## Testing

### Execute short tests:

Execute `make test-short`.

### Execute long tests:

As Telegraf collects metrics from several third-party services, it becomes a
difficult task to mock each service, as some of them have complicated protocols
which would take some time to replicate.

To overcome this situation we've decided to use Docker containers to provide a
fast and reproducible environment to test those services which require it.
For other situations
(e.g. https://github.com/influxdb/telegraf/blob/master/plugins/redis/redis_test.go)
a simple mock will suffice.

To execute Telegraf tests follow these simple steps:

- Install Docker Compose following [these](https://docs.docker.com/compose/install/)
  instructions
    - Mac users should be able to simply do `brew install boot2docker`
      and `brew install docker-compose`
- Execute `make test`

### Unit test troubleshooting:

Try cleaning up your test environment by executing `make test-cleanup` and
re-running the tests.