Compare commits

...

61 Commits

Author SHA1 Message Date
Cameron Sparr
215f1b57d0 influxdb output time.Duration fixup 2016-09-28 11:48:46 +01:00
Vinh Quốc Nguyễn
beba50c93b Fix crash when allow pending messgae wasn't set (#1785)
The default is 0 so we hit a division by 0 error and crash. This checks
ensure we will not crash and `log` and continue to let telegraf run

Also we set default allow pending message number to 10000
2016-09-23 11:39:20 +01:00
Cameron Sparr
254fa641d1 Add configurable timeout to influxdb input
closes #1773
2016-09-16 16:20:43 +01:00
Cameron Sparr
16f617dbea Fix unmarshal of influxdb metrics will null tags
closes #1738
2016-09-16 15:58:00 +01:00
Cameron Sparr
532223a9cb Prometheus output: do not remake metrics map each write
closes #1775
2016-09-16 15:54:33 +01:00
Patrick Hemmer
7fac74919c Alternate SNMP plugin (#1389)
* Add a new and improved snmp plugin

* update gosnmp for duplicate packet fix

https://github.com/soniah/gosnmp/issues/68
https://github.com/soniah/gosnmp/pull/69
2016-08-22 16:37:53 +01:00
Robert Kánia
b022b5567d Added missing column, refs #1646 (#1647) 2016-08-22 15:35:39 +01:00
Cameron Sparr
dbf6380e4b update PR template with changelog note 2016-08-17 18:24:06 +01:00
Cameron Sparr
a0e42f8a61 Sanitize graphite characters in field names
also sanitize the names at a higher scope for better clarity

closes #1637
2016-08-17 16:56:31 +01:00
Cameron Sparr
94e673fe85 Revert "add pgbouncer plugin"
This reverts commit fec9760f72.
2016-08-17 16:50:11 +01:00
Cameron Sparr
7600757f16 ntpq: don't index ntp fields that dont exist
closes #1634
2016-08-16 15:16:42 +01:00
Cameron Sparr
4ce8dd5f9a Rename snmp plugin to snmp_legacy 2016-08-11 16:11:35 +01:00
politician
26315bfbea Defines GOOS and GOARCH for windows builds (#1621)
* defines GOOS and GOARCH for windows builds

* default to amd64 on windows

* windows: use latest versions of missing packages
2016-08-11 15:35:00 +01:00
David Bayendor
a282fb8524 Update README.md (#1622)
* Update README.md

Clean up minor typos and syntax.

* Update README.md

Fix typo in 'default'
2016-08-11 09:14:56 +01:00
Jack Zampolin
dee98612e2 Modernize zookeeper readme (#1615)
* Modernize zookeeper readme

* Add configuration
2016-08-10 22:58:47 +01:00
Ross McDonald
69e4e862a3 Fix typo of 'quorom' to 'quorum' when specifying write consistency. (#1618) 2016-08-10 17:51:21 +01:00
Cameron Sparr
c0e895c3a7 etc/telegraf.conf update 2016-08-10 15:16:01 +01:00
jsvisa
fec9760f72 add pgbouncer plugin
add pgbouncer docker for testing

add pgbouncer testcase

update changlog

closes #1400
2016-08-10 15:14:15 +01:00
Rene Zbinden
1989a5855d remove cgo dependeny with forking sensors command
closes #1414
closes #649
2016-08-09 08:38:05 +01:00
Cameron Sparr
abcd19493e If win stat buffer is empty, do not try to index
closes #1425
2016-08-09 08:29:37 +01:00
tuier
e457b7a8df Source improvement for librato output (#1416)
* Source improvement for librato output

Build the source from the list of tag instead of a configuration specified
single tag

Graphite Serializer:
* make buildTags public
* make sure not to use empty tags

Librato output:
* Improve Error handling for librato API base on error or debug flag
* Send Metric per Batch (max 300)
* use Graphite BuildTag function to generate source

The change is made that it should be retro compatible

Metric sample:
server=127.0.0.1 port=80 state=leader env=test
measurement.metric_name value
service_n.metric_x

Metric before with source tags set as "server":
source=127.0.0.1
test.80.127_0_0_1.leader.measurement.metric_name
test.80.127_0_0_1.leader.service_n.metric_x

Metric now:
source=test.80.127.0.0.1.leader
measurement.metric_name
service_n.metric_x

As you can see the source in the "new" version is much more precise
That way when filter (only from source) you can filter by env or any other tags

* Using template to specify which tagsusing for source, default concat all
tags

* revert change in graphite serializer

* better documentation, change default for template

* fmt

* test passing with new host as default tags

* use host tag in api integration test

* Limit 80 char per line, change resolution to be a int in the sample

* fmt

* remove resolution, doc for template

* fmt
2016-08-09 08:29:15 +01:00
Mariusz Brzeski
3853d0d065 Fix problem with metrics when ping return Destination net unreachable ( windows ) (#1561)
* Fix problem with metrics when ping return Destination net unreachable
Add test case TestUnreachablePingGather
Add percent_reply_loss
Fix some other tests

* Add errors measurment

* fir problem with ping reply "TTL expired in transit" ( use regex for more specific condition - TTL in line but it's a not valid replay )
add test case for "TTL expired in transit" - TestTTLExpiredPingGather
2016-08-09 08:27:30 +01:00
Patrick Hemmer
53e31cf1b5 Fix postgres extensible text (#1601)
* convert postgresql_extensible byte slice values to strings

* code cleanup in postgresql_extensible
2016-08-09 08:25:59 +01:00
Cameron Sparr
c99c22534b influxdb output: config doc update 2016-08-09 07:50:35 +01:00
Cameron Sparr
8e22526756 Adding c:\program files\telegraf\telegraf.conf
this will now be the default config file location on windows, basically
it is the windows equivalent of /etc/telegraf/telegraf.conf

also updating the changelog

closes #1543
2016-08-08 23:17:27 +01:00
Dennis Bellinger
7b6713b094 Telegraf support for built-in windows service.
Updated windows dependencies

Updated the windows dependencies so that the versions matched the
dependencies for Mac OS and Linux. Additionally added some that were
complained about being missing at compile time.

Incorporated kardianos/service for management

Incorporated the library github.com/kardianos/service to manage the
service on the various platforms (including Windows). This required an
alternate main function.

The original main function was renamed to reloadLoop (as that is what
the main loop in it does) (it also got a couple of parameters). The
service management library calls it as the main body of the program.

Merged service.go into telegraf.go

Due to compilation issues on Windows, moved the code from service.go
into telegraf.go and removed service.go entirely.

Updated dependencies and fixed Windows service

Updated the dependencies so that it builds properly on Windows,
additionally, fixed the registered command for starting it as
a service (needed to add the config file option). This currently
standardizes it as a C:\telegraf\telegraf.conf on Windows.

Added dependency for github.com/kardianos/service

Removed the common dependencies from _windows file

Removed all the common dependencies from the Godeps_windows file and
modified Makefile to load Godeps and then Godeps_windows when building
for Windows. This should reduce problems caused by the Godeps_windows
file being forgotten when updating dependencies.

Updated CHANGELOG.md with changes

Ran `go fmt ./...` to format code

Removed service library on all but Windows

The service library [kardianos/service](github.com/kardianos/service)
has been disabled on all platforms but windows, as there is already
existing infrastructure for other platforms.

Removed the dependency line for itself

It appears that gdm accidentally added the project itself to the
dependency list. This caused the dependency restoration to select an
earlier version of the project during build.

This only affected windows.
This only affected builds after 020b2c70

Updated documentation for Windows Service

Removed the documentation about using NSSM and added documentation on
installing telegraf directly as a Windows Service.

Added license info for kardianos/service

Added the license information for github.com/kardianos/service which is
licensed under the ZLib license, although that name is never mentioned
the license text matches word for word.

Changed the Windows Config file default location

Updated the default location of the configuration file on Windows from
C:\telegraf\telegraf.conf to C:\Program Files\Telegraf\telegraf.conf.
With this change includes updating the directions, including directing
that the executable be put into that same directory. Additionally, as
noted in the instructions, the location of the config file for the
service may be changed by specifying the location with the `-config`
flag at install time.

Fixed bug - Wrong data type: svcConfig

svcConfig service.Config => svcConfig *service.Config
(It needed to be a pointer)
2016-08-08 23:10:39 +01:00
Jack Zampolin
b0ef506a88 Add Kafka output readme (#1609) 2016-08-08 23:10:07 +01:00
Jack Zampolin
22c293de62 Add request for sample queries (#1608) 2016-08-08 23:06:03 +01:00
Cameron Sparr
d3bb1e7010 Rename internal_models package to models 2016-08-08 14:41:40 +01:00
Cameron Sparr
49988b15a3 Default config typo fix 2016-08-06 07:40:28 +01:00
Cameron Sparr
f0357b7a12 CHANGELOG formatting update
put all 1.0 beta releases into a single 1.0 release manifest

also add #1586 change
2016-08-05 14:51:19 +01:00
Cameron Sparr
9d3ad6309e Remove IF NOT EXISTS from influxdb output 2016-08-05 13:55:02 +01:00
Cameron Sparr
b55e9e78e3 gopsutil, fix /proc/pid/io naming issue
closes #1584
2016-08-05 09:53:14 +01:00
Cameron Sparr
4bc6fdb09e Removing INFLUXDB_HTTP_LOG from logparser usage/docs
this log format is likely soon going to be removed from a future
influxdb release, so we should not be recommending that users base any
of their log parsing infra on this.
2016-08-04 16:42:59 +01:00
Cameron Sparr
2b43b385de Begin implementing generic timestamp logparser capability 2016-08-04 16:08:55 +01:00
Cameron Sparr
13865f9e04 Disable darwin builds (#1571)
telegraf can't be cross-compiled for darwin, it has C dependencies and
thus many of the system plugins won't work.
2016-08-04 14:27:33 +01:00
Jack Zampolin
497353e586 add call to action for plugin contribuitors to write tickscripts (#1580) 2016-08-04 14:27:06 +01:00
Cameron Sparr
2d86dfba8b Removing deprecated flags
they are:
  -configdirectory
  -outputfilter
  -filter
2016-08-03 13:08:06 +01:00
Cameron Sparr
30dbfd9af8 Fix racy tail from beginning test 2016-07-28 14:08:12 +01:00
Cameron Sparr
c991b579d2 tcp/udp listeners, remove locks & improve test coverage 2016-07-28 13:42:34 +01:00
Srini Chebrolu
841729c0f9 RPM post remove script update for proper handle on all Linux distributions (#1381) 2016-07-28 08:34:57 +01:00
Victor Garcia
412f5b5acb Fixing changelog, MongoDB stats per db feature not release in 1.0beta3 (#1548) 2016-07-26 19:15:40 +01:00
Mariusz Brzeski
0b3958d3cd Ping windows (#1532)
* Ping for windows

* En ping output

* Code format

* Code review

* Default timeout

* Fix problem with std error when no data received ( exit status = 1 )
2016-07-25 13:17:41 +01:00
Patrick Hemmer
e68f251df7 add AddError method to accumulator (#1536) 2016-07-25 13:09:49 +01:00
Jason Gardner
986735234b Fix output config typo. (#1527) 2016-07-22 16:05:53 +01:00
Patrick Hemmer
4363eebc1b update gopsutil for FreeBSD disk time metrics (#1534)
Results in adding the io_time metric to FreeBSD, and adjusts the read_time and write_time metrics to be in milliseconds like linux.
2016-07-22 09:23:45 +01:00
Patrick Hemmer
1be6ea5696 remove unused accumulator.prefix (#1535) 2016-07-22 09:22:52 +01:00
Cameron Sparr
8acda0da8f Update etc/telegraf.conf 2016-07-21 17:53:41 +01:00
Łukasz Harasimowicz
ee240a5599 Added metrics for Mesos slaves and tasks running on them.
closes #1356
2016-07-21 17:13:00 +01:00
Mendelson Gusmão
29ea433763 Implement support for fetching hddtemp data (#1411) 2016-07-21 17:00:54 +01:00
Pierre Fersing
0462af164e Added option "total/perdevice" to Docker input (#1525)
Like cpu plugin, add two option "total" and "perdevice" to send network
and diskio metrics either per device and/or the sum of all devices.
2016-07-21 16:50:12 +01:00
Cameron Sparr
1c24665b29 Prometheus client & win_perf_counters char changes
1. in prometheus client, do not check for invalid characters anymore,
because we are already replacing all invalid characters with regex
anyways.
2. in win_perf_counters, sanitize field name _and_ measurement name.
Also add '%' to the list of sanitized characters, because this character
is invalid for most output plugins, and can also easily cause string
formatting issues throughout the stack.
3. All '%' will now be translated to 'Percent'

closes #1430
2016-07-21 16:24:19 +01:00
Torsten Rehn
0af0fa7c2e jolokia: handle multiple multi-dimensional attributes (#1524)
fixes #1481
2016-07-20 14:47:04 +01:00
Cameron Sparr
191608041f Strip container_version from container_image tag
closes #1413
2016-07-19 17:57:40 +01:00
Pierre Fersing
42d9d5d237 Fix Redis url, an extra "tcp://" was added (#1521) 2016-07-19 15:24:10 +01:00
Cameron Sparr
d54b169d67 nstat: fix nstat setting path for snmp6
closes #1477
2016-07-19 14:51:36 +01:00
Cameron Sparr
82166a36d0 Fix err race condition and partial failure issues
closes #1439
closes #1440
closes #1441
closes #1442
closes #1443
closes #1444
closes #1445
2016-07-19 14:45:55 +01:00
Victor Garcia
cbf5a55c7d MongoDB input plugin: Adding per DB stats (#1466) 2016-07-19 12:47:12 +01:00
Cameron Sparr
5f14ad9fa1 clean up and finish aerospike refactor & readme 2016-07-19 11:36:41 +01:00
Timothée GERMAIN
0be69b8a44 Make the user able to specify full path for HAproxy stats
closes #1499
closes #1019

Do no try to guess HAproxy stats url, just add ";csv" at the end of the
url if not present.

Signed-off-by: tgermain <timothee.germain@corp.ovh.com>
2016-07-19 11:35:15 +01:00
Matt Jones
375710488d Add support for self-signed certs to RabbitMQ input plugin (#1503)
* add initial support to allow self-signed certs

When using self-signed the metrics collection will fail, this will allow
the user to specify in the input configuration file if they want to skip
certificate verification. This is functionally identical to `curl -k`

At some point this functionality should be moved to the agent as it is
already implemented identically in several different input plugins.

* Add initial comment strings to remove noise

These should be properly fleshed out at some point to ensure
code completeness

* refactor to use generic helper function

* fix import statement against fork

* update changelog
2016-07-19 10:24:06 +01:00
106 changed files with 7471 additions and 3037 deletions

View File

@@ -1,5 +1,5 @@
### Required for all PRs:
- [ ] CHANGELOG.md updated
- [ ] CHANGELOG.md updated (we recommend not updating this until the PR has been approved by a maintainer)
- [ ] Sign [CLA](https://influxdata.com/community/cla/) (if not already signed)
- [ ] README.md updated (if adding a new plugin)

View File

@@ -1,9 +1,59 @@
## v1.0 [unreleased]
## v1.0 beta 3 [2016-07-18]
## v1.1 [unreleased]
### Release Notes
### Features
- [#1694](https://github.com/influxdata/telegraf/pull/1694): Adding Gauge and Counter metric types.
- [#1606](https://github.com/influxdata/telegraf/pull/1606): Remove carraige returns from exec plugin output on Windows
- [#1674](https://github.com/influxdata/telegraf/issues/1674): elasticsearch input: configurable timeout.
- [#1607](https://github.com/influxdata/telegraf/pull/1607): Massage metric names in Instrumental output plugin
- [#1572](https://github.com/influxdata/telegraf/pull/1572): mesos improvements.
- [#1513](https://github.com/influxdata/telegraf/issues/1513): Add Ceph Cluster Performance Statistics
- [#1650](https://github.com/influxdata/telegraf/issues/1650): Ability to configure response_timeout in httpjson input.
- [#1685](https://github.com/influxdata/telegraf/issues/1685): Add additional redis metrics.
- [#1539](https://github.com/influxdata/telegraf/pull/1539): Added capability to send metrics through Http API for OpenTSDB.
- [#1471](https://github.com/influxdata/telegraf/pull/1471): iptables input plugin.
- [#1542](https://github.com/influxdata/telegraf/pull/1542): Add filestack webhook plugin.
- [#1599](https://github.com/influxdata/telegraf/pull/1599): Add server hostname for each docker measurements.
- [#1697](https://github.com/influxdata/telegraf/pull/1697): Add NATS output plugin.
- [#1407](https://github.com/influxdata/telegraf/pull/1407): HTTP service listener input plugin.
- [#1699](https://github.com/influxdata/telegraf/pull/1699): Add database blacklist option for Postgresql
### Bugfixes
- [#1628](https://github.com/influxdata/telegraf/issues/1628): Fix mongodb input panic on version 2.2.
- [#1738](https://github.com/influxdata/telegraf/issues/1738): Fix unmarshal of influxdb metrics with null tags
- [#1733](https://github.com/influxdata/telegraf/issues/1733): Fix statsd scientific notation parsing
- [#1716](https://github.com/influxdata/telegraf/issues/1716): Sensors plugin strconv.ParseFloat: parsing "": invalid syntax
- [#1530](https://github.com/influxdata/telegraf/issues/1530): Fix prometheus_client reload panic
- [#1764](https://github.com/influxdata/telegraf/issues/1764): Fix kafka consumer panic when nil error is returned down errs channel.
## v1.0.1 [unreleased]
### Bugfixes
- [#1775](https://github.com/influxdata/telegraf/issues/1775): Prometheus output: Fix bug with multi-batch writes.
- [#1738](https://github.com/influxdata/telegraf/issues/1738): Fix unmarshal of influxdb metrics with null tags.
- [#1773](https://github.com/influxdata/telegraf/issues/1773): Add configurable timeout to influxdb input plugin.
## v1.0 [2016-09-08]
### Release Notes
**Breaking Change** The SNMP plugin is being deprecated in it's current form.
There is a [new SNMP plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp)
which fixes many of the issues and confusions
of it's predecessor. For users wanting to continue to use the deprecated SNMP
plugin, you will need to change your config file from `[[inputs.snmp]]` to
`[[inputs.snmp_legacy]]`. The configuration of the new SNMP plugin is _not_
backwards-compatible.
- Telegraf now supports being installed as an official windows service,
which can be installed via
`> C:\Program Files\Telegraf\telegraf.exe --service install`
**Breaking Change**: Aerospike main server node measurements have been renamed
aerospike_node. Aerospike namespace measurements have been renamed to
aerospike_namespace. They will also now be tagged with the node_name
@@ -34,8 +84,16 @@ should now look like:
path = "/"
```
- `flush_jitter` behavior has been changed. The random jitter will now be
evaluated at every flush interval, rather than once at startup. This makes it
consistent with the behavior of `collection_jitter`.
### Features
- [#1413](https://github.com/influxdata/telegraf/issues/1413): Separate container_version from container_image tag.
- [#1525](https://github.com/influxdata/telegraf/pull/1525): Support setting per-device and total metrics for Docker network and blockio.
- [#1466](https://github.com/influxdata/telegraf/pull/1466): MongoDB input plugin: adding per DB stats from db.stats()
- [#1503](https://github.com/influxdata/telegraf/pull/1503): Add tls support for certs to RabbitMQ input plugin
- [#1289](https://github.com/influxdata/telegraf/pull/1289): webhooks input plugin. Thanks @francois2metz and @cduez!
- [#1247](https://github.com/influxdata/telegraf/pull/1247): rollbar webhook plugin.
- [#1408](https://github.com/influxdata/telegraf/pull/1408): mandrill webhook plugin.
@@ -48,9 +106,41 @@ should now look like:
- [#1500](https://github.com/influxdata/telegraf/pull/1500): Aerospike plugin refactored to use official client lib.
- [#1434](https://github.com/influxdata/telegraf/pull/1434): Add measurement name arg to logparser plugin.
- [#1479](https://github.com/influxdata/telegraf/pull/1479): logparser: change resp_code from a field to a tag.
- [#1411](https://github.com/influxdata/telegraf/pull/1411): Implement support for fetching hddtemp data
- [#1340](https://github.com/influxdata/telegraf/issues/1340): statsd: do not log every dropped metric.
- [#1368](https://github.com/influxdata/telegraf/pull/1368): Add precision rounding to all metrics on collection.
- [#1390](https://github.com/influxdata/telegraf/pull/1390): Add support for Tengine
- [#1320](https://github.com/influxdata/telegraf/pull/1320): Logparser input plugin for parsing grok-style log patterns.
- [#1397](https://github.com/influxdata/telegraf/issues/1397): ElasticSearch: now supports connecting to ElasticSearch via SSL
- [#1262](https://github.com/influxdata/telegraf/pull/1261): Add graylog input pluging.
- [#1294](https://github.com/influxdata/telegraf/pull/1294): consul input plugin. Thanks @harnash
- [#1164](https://github.com/influxdata/telegraf/pull/1164): conntrack input plugin. Thanks @robinpercy!
- [#1165](https://github.com/influxdata/telegraf/pull/1165): vmstat input plugin. Thanks @jshim-xm!
- [#1208](https://github.com/influxdata/telegraf/pull/1208): Standardized AWS credentials evaluation & wildcard CloudWatch dimensions. Thanks @johnrengelman!
- [#1264](https://github.com/influxdata/telegraf/pull/1264): Add SSL config options to http_response plugin.
- [#1272](https://github.com/influxdata/telegraf/pull/1272): graphite parser: add ability to specify multiple tag keys, for consistency with influxdb parser.
- [#1265](https://github.com/influxdata/telegraf/pull/1265): Make dns lookups for chrony configurable. Thanks @zbindenren!
- [#1275](https://github.com/influxdata/telegraf/pull/1275): Allow wildcard filtering of varnish stats.
- [#1142](https://github.com/influxdata/telegraf/pull/1142): Support for glob patterns in exec plugin commands configuration.
- [#1278](https://github.com/influxdata/telegraf/pull/1278): RabbitMQ input: made url parameter optional by using DefaultURL (http://localhost:15672) if not specified
- [#1197](https://github.com/influxdata/telegraf/pull/1197): Limit AWS GetMetricStatistics requests to 10 per second.
- [#1278](https://github.com/influxdata/telegraf/pull/1278) & [#1288](https://github.com/influxdata/telegraf/pull/1288) & [#1295](https://github.com/influxdata/telegraf/pull/1295): RabbitMQ/Apache/InfluxDB inputs: made url(s) parameter optional by using reasonable input defaults if not specified
- [#1296](https://github.com/influxdata/telegraf/issues/1296): Refactor of flush_jitter argument.
- [#1213](https://github.com/influxdata/telegraf/issues/1213): Add inactive & active memory to mem plugin.
- [#1543](https://github.com/influxdata/telegraf/pull/1543): Official Windows service.
- [#1414](https://github.com/influxdata/telegraf/pull/1414): Forking sensors command to remove C package dependency.
- [#1389](https://github.com/influxdata/telegraf/pull/1389): Add a new SNMP plugin.
### Bugfixes
- [#1619](https://github.com/influxdata/telegraf/issues/1619): Fix `make windows` build target
- [#1519](https://github.com/influxdata/telegraf/pull/1519): Fix error race conditions and partial failures.
- [#1477](https://github.com/influxdata/telegraf/issues/1477): nstat: fix inaccurate config panic.
- [#1481](https://github.com/influxdata/telegraf/issues/1481): jolokia: fix handling multiple multi-dimensional attributes.
- [#1430](https://github.com/influxdata/telegraf/issues/1430): Fix prometheus character sanitizing. Sanitize more win_perf_counters characters.
- [#1534](https://github.com/influxdata/telegraf/pull/1534): Add diskio io_time to FreeBSD & report timing metrics as ms (as linux does).
- [#1379](https://github.com/influxdata/telegraf/issues/1379): Fix covering Amazon Linux for post remove flow.
- [#1584](https://github.com/influxdata/telegraf/issues/1584): procstat missing fields: read/write bytes & count
- [#1472](https://github.com/influxdata/telegraf/pull/1472): diskio input plugin: set 'skip_serial_number = true' by default to avoid high cardinality.
- [#1426](https://github.com/influxdata/telegraf/pull/1426): nil metrics panic fix.
- [#1384](https://github.com/influxdata/telegraf/pull/1384): Fix datarace in apache input plugin.
@@ -67,19 +157,8 @@ should now look like:
- [#1463](https://github.com/influxdata/telegraf/issues/1463): Shared WaitGroup in Exec plugin
- [#1436](https://github.com/influxdata/telegraf/issues/1436): logparser: honor modifiers in "pattern" config.
- [#1418](https://github.com/influxdata/telegraf/issues/1418): logparser: error and exit on file permissions/missing errors.
## v1.0 beta 2 [2016-06-21]
### Features
- [#1340](https://github.com/influxdata/telegraf/issues/1340): statsd: do not log every dropped metric.
- [#1368](https://github.com/influxdata/telegraf/pull/1368): Add precision rounding to all metrics on collection.
- [#1390](https://github.com/influxdata/telegraf/pull/1390): Add support for Tengine
- [#1320](https://github.com/influxdata/telegraf/pull/1320): Logparser input plugin for parsing grok-style log patterns.
- [#1397](https://github.com/influxdata/telegraf/issues/1397): ElasticSearch: now supports connecting to ElasticSearch via SSL
### Bugfixes
- [#1499](https://github.com/influxdata/telegraf/pull/1499): Make the user able to specify full path for HAproxy stats
- [#1521](https://github.com/influxdata/telegraf/pull/1521): Fix Redis url, an extra "tcp://" was added.
- [#1330](https://github.com/influxdata/telegraf/issues/1330): Fix exec plugin panic when using single binary.
- [#1336](https://github.com/influxdata/telegraf/issues/1336): Fixed incorrect prometheus metrics source selection.
- [#1112](https://github.com/influxdata/telegraf/issues/1112): Set default Zookeeper chroot to empty string.
@@ -87,50 +166,6 @@ should now look like:
- [#1374](https://github.com/influxdata/telegraf/pull/1374): Change "default" retention policy to "".
- [#1377](https://github.com/influxdata/telegraf/issues/1377): Graphite output mangling '%' character.
- [#1396](https://github.com/influxdata/telegraf/pull/1396): Prometheus input plugin now supports x509 certs authentication
## v1.0 beta 1 [2016-06-07]
### Release Notes
- `flush_jitter` behavior has been changed. The random jitter will now be
evaluated at every flush interval, rather than once at startup. This makes it
consistent with the behavior of `collection_jitter`.
- All AWS plugins now utilize a standard mechanism for evaluating credentials.
This allows all AWS plugins to support environment variables, shared credential
files & profiles, and role assumptions. See the specific plugin README for
details.
- The AWS CloudWatch input plugin can now declare a wildcard value for a metric
dimension. This causes the plugin to read all metrics that contain the specified
dimension key regardless of value. This is used to export collections of metrics
without having to know the dimension values ahead of time.
- The AWS CloudWatch input plugin can now be configured with the `cache_ttl`
attribute. This configures the TTL of the internal metric cache. This is useful
in conjunction with wildcard dimension values as it will control the amount of
time before a new metric is included by the plugin.
### Features
- [#1262](https://github.com/influxdata/telegraf/pull/1261): Add graylog input pluging.
- [#1294](https://github.com/influxdata/telegraf/pull/1294): consul input plugin. Thanks @harnash
- [#1164](https://github.com/influxdata/telegraf/pull/1164): conntrack input plugin. Thanks @robinpercy!
- [#1165](https://github.com/influxdata/telegraf/pull/1165): vmstat input plugin. Thanks @jshim-xm!
- [#1208](https://github.com/influxdata/telegraf/pull/1208): Standardized AWS credentials evaluation & wildcard CloudWatch dimensions. Thanks @johnrengelman!
- [#1264](https://github.com/influxdata/telegraf/pull/1264): Add SSL config options to http_response plugin.
- [#1272](https://github.com/influxdata/telegraf/pull/1272): graphite parser: add ability to specify multiple tag keys, for consistency with influxdb parser.
- [#1265](https://github.com/influxdata/telegraf/pull/1265): Make dns lookups for chrony configurable. Thanks @zbindenren!
- [#1275](https://github.com/influxdata/telegraf/pull/1275): Allow wildcard filtering of varnish stats.
- [#1142](https://github.com/influxdata/telegraf/pull/1142): Support for glob patterns in exec plugin commands configuration.
- [#1278](https://github.com/influxdata/telegraf/pull/1278): RabbitMQ input: made url parameter optional by using DefaultURL (http://localhost:15672) if not specified
- [#1197](https://github.com/influxdata/telegraf/pull/1197): Limit AWS GetMetricStatistics requests to 10 per second.
- [#1278](https://github.com/influxdata/telegraf/pull/1278) & [#1288](https://github.com/influxdata/telegraf/pull/1288) & [#1295](https://github.com/influxdata/telegraf/pull/1295): RabbitMQ/Apache/InfluxDB inputs: made url(s) parameter optional by using reasonable input defaults if not specified
- [#1296](https://github.com/influxdata/telegraf/issues/1296): Refactor of flush_jitter argument.
- [#1213](https://github.com/influxdata/telegraf/issues/1213): Add inactive & active memory to mem plugin.
### Bugfixes
- [#1252](https://github.com/influxdata/telegraf/pull/1252) & [#1279](https://github.com/influxdata/telegraf/pull/1279): Fix systemd service. Thanks @zbindenren & @PierreF!
- [#1221](https://github.com/influxdata/telegraf/pull/1221): Fix influxdb n_shards counter.
- [#1258](https://github.com/influxdata/telegraf/pull/1258): Fix potential kernel plugin integer parse error.
@@ -140,6 +175,11 @@ time before a new metric is included by the plugin.
- [#1316](https://github.com/influxdata/telegraf/pull/1316): Removed leaked "database" tag on redis metrics. Thanks @PierreF!
- [#1323](https://github.com/influxdata/telegraf/issues/1323): Processes plugin: fix potential error with /proc/net/stat directory.
- [#1322](https://github.com/influxdata/telegraf/issues/1322): Fix rare RHEL 5.2 panic in gopsutil diskio gathering function.
- [#1586](https://github.com/influxdata/telegraf/pull/1586): Remove IF NOT EXISTS from influxdb output database creation.
- [#1600](https://github.com/influxdata/telegraf/issues/1600): Fix quoting with text values in postgresql_extensible plugin.
- [#1425](https://github.com/influxdata/telegraf/issues/1425): Fix win_perf_counter "index out of range" panic.
- [#1634](https://github.com/influxdata/telegraf/issues/1634): Fix ntpq panic when field is missing.
- [#1637](https://github.com/influxdata/telegraf/issues/1637): Sanitize graphite output field names.
## v0.13.1 [2016-05-24]

View File

@@ -11,6 +11,8 @@ Output plugins READMEs are less structured,
but any information you can provide on how the data will look is appreciated.
See the [OpenTSDB output](https://github.com/influxdata/telegraf/tree/master/plugins/outputs/opentsdb)
for a good example.
1. **Optional:** Help users of your plugin by including example queries for populating dashboards. Include these sample queries in the `README.md` for the plugin.
1. **Optional:** Write a [tickscript](https://docs.influxdata.com/kapacitor/v1.0/tick/syntax/) for your plugin and add it to [Kapacitor](https://github.com/influxdata/kapacitor/tree/master/examples/telegraf). Or mention @jackzampolin in a PR comment with some common queries that you would want to alert on and he will write one for you.
## GoDoc

6
Godeps
View File

@@ -29,6 +29,8 @@ github.com/hpcloud/tail b2940955ab8b26e19d43a43c4da0475dd81bdb56
github.com/influxdata/config b79f6829346b8d6e78ba73544b1e1038f1f1c9da
github.com/influxdata/influxdb e094138084855d444195b252314dfee9eae34cab
github.com/influxdata/toml af4df43894b16e3fd2b788d01bd27ad0776ef2d0
github.com/kardianos/osext 29ae4ffbc9a6fe9fb2bc5029050ce6996ea1d3bc
github.com/kardianos/service 5e335590050d6d00f3aa270217d288dda1c94d0a
github.com/klauspost/crc32 19b0b332c9e4516a6370a0456e6182c3b5036720
github.com/lib/pq e182dc4027e2ded4b19396d638610f2653295f36
github.com/matttproud/golang_protobuf_extensions d0c3fe89de86839aecf2e0579c40ba3bb336a453
@@ -44,8 +46,8 @@ github.com/prometheus/client_model fa8ad6fec33561be4280a8f0514318c79d7f6cb6
github.com/prometheus/common e8eabff8812b05acf522b45fdcd725a785188e37
github.com/prometheus/procfs 406e5b7bfd8201a36e2bb5f7bdae0b03380c2ce8
github.com/samuel/go-zookeeper 218e9c81c0dd8b3b18172b2bbfad92cc7d6db55f
github.com/shirou/gopsutil 586bb697f3ec9f8ec08ffefe18f521a64534037c
github.com/soniah/gosnmp b1b4f885b12c5dcbd021c5cee1c904110de6db7d
github.com/shirou/gopsutil 4d0c402af66c78735c5ccf820dc2ca7de5e4ff08
github.com/soniah/gosnmp eb32571c2410868d85849ad67d1e51d01273eb84
github.com/sparrc/aerospike-client-go d4bb42d2c2d39dae68e054116f4538af189e05d5
github.com/streadway/amqp b4f3ceab0337f013208d31348b578d83c0064744
github.com/stretchr/testify 1f4a1643a57e798696635ea4c126e9127adb7d3c

View File

@@ -1,59 +1,12 @@
github.com/Microsoft/go-winio 9f57cbbcbcb41dea496528872a4f0e37a4f7ae98
github.com/Shopify/sarama 8aadb476e66ca998f2f6bb3c993e9a2daa3666b9
github.com/Sirupsen/logrus 219c8cb75c258c552e999735be6df753ffc7afdc
github.com/Microsoft/go-winio ce2922f643c8fd76b46cadc7f404a06282678b34
github.com/StackExchange/wmi f3e2bae1e0cb5aef83e319133eabfee30013a4a5
github.com/amir/raidman 53c1b967405155bfc8758557863bf2e14f814687
github.com/aws/aws-sdk-go 13a12060f716145019378a10e2806c174356b857
github.com/beorn7/perks 3ac7bf7a47d159a033b107610db8a1b6575507a4
github.com/cenkalti/backoff 4dc77674aceaabba2c7e3da25d4c823edfb73f99
github.com/couchbase/go-couchbase cb664315a324d87d19c879d9cc67fda6be8c2ac1
github.com/couchbase/gomemcached a5ea6356f648fec6ab89add00edd09151455b4b2
github.com/couchbase/goutils 5823a0cbaaa9008406021dc5daf80125ea30bba6
github.com/dancannon/gorethink e7cac92ea2bc52638791a021f212145acfedb1fc
github.com/davecgh/go-spew 5215b55f46b2b919f50a1df0eaa5886afe4e3b3d
github.com/docker/engine-api 8924d6900370b4c7e7984be5adc61f50a80d7537
github.com/docker/go-connections f549a9393d05688dff0992ef3efd8bbe6c628aeb
github.com/docker/go-units 5d2041e26a699eaca682e2ea41c8f891e1060444
github.com/eapache/go-resiliency b86b1ec0dd4209a588dc1285cdd471e73525c0b3
github.com/eapache/queue ded5959c0d4e360646dc9e9908cff48666781367
github.com/eclipse/paho.mqtt.golang 0f7a459f04f13a41b7ed752d47944528d4bf9a86
github.com/go-ole/go-ole 50055884d646dd9434f16bbb5c9801749b9bafe4
github.com/go-sql-driver/mysql 1fca743146605a172a266e1654e01e5cd5669bee
github.com/golang/protobuf 552c7b9542c194800fd493123b3798ef0a832032
github.com/golang/snappy 427fb6fc07997f43afa32f35e850833760e489a7
github.com/gonuts/go-shellquote e842a11b24c6abfb3dd27af69a17f482e4b483c2
github.com/gorilla/context 1ea25387ff6f684839d82767c1733ff4d4d15d0a
github.com/gorilla/mux c9e326e2bdec29039a3761c07bece13133863e1e
github.com/hailocab/go-hostpool e80d13ce29ede4452c43dea11e79b9bc8a15b478
github.com/influxdata/config b79f6829346b8d6e78ba73544b1e1038f1f1c9da
github.com/influxdata/influxdb e3fef5593c21644f2b43af55d6e17e70910b0e48
github.com/influxdata/toml af4df43894b16e3fd2b788d01bd27ad0776ef2d0
github.com/klauspost/crc32 19b0b332c9e4516a6370a0456e6182c3b5036720
github.com/lib/pq e182dc4027e2ded4b19396d638610f2653295f36
github.com/lxn/win 9a7734ea4db26bc593d52f6a8a957afdad39c5c1
github.com/matttproud/golang_protobuf_extensions d0c3fe89de86839aecf2e0579c40ba3bb336a453
github.com/miekg/dns cce6c130cdb92c752850880fd285bea1d64439dd
github.com/mreiferson/go-snappystream 028eae7ab5c4c9e2d1cb4c4ca1e53259bbe7e504
github.com/naoina/go-stringutil 6b638e95a32d0c1131db0e7fe83775cbea4a0d0b
github.com/nats-io/nats b13fc9d12b0b123ebc374e6b808c6228ae4234a3
github.com/nats-io/nuid 4f84f5f3b2786224e336af2e13dba0a0a80b76fa
github.com/nsqio/go-nsq 0b80d6f05e15ca1930e0c5e1d540ed627e299980
github.com/prometheus/client_golang 18acf9993a863f4c4b40612e19cdd243e7c86831
github.com/prometheus/client_model fa8ad6fec33561be4280a8f0514318c79d7f6cb6
github.com/prometheus/common e8eabff8812b05acf522b45fdcd725a785188e37
github.com/prometheus/procfs 406e5b7bfd8201a36e2bb5f7bdae0b03380c2ce8
github.com/samuel/go-zookeeper 218e9c81c0dd8b3b18172b2bbfad92cc7d6db55f
github.com/shirou/gopsutil 1f32ce1bb380845be7f5d174ac641a2c592c0c42
github.com/shirou/w32 ada3ba68f000aa1b58580e45c9d308fe0b7fc5c5
github.com/soniah/gosnmp b1b4f885b12c5dcbd021c5cee1c904110de6db7d
github.com/streadway/amqp b4f3ceab0337f013208d31348b578d83c0064744
github.com/stretchr/testify 1f4a1643a57e798696635ea4c126e9127adb7d3c
github.com/wvanbergen/kafka 46f9a1cf3f670edec492029fadded9c2d9e18866
github.com/wvanbergen/kazoo-go 0f768712ae6f76454f987c3356177e138df258f8
github.com/zensqlmonitor/go-mssqldb ffe5510c6fa5e15e6d983210ab501c815b56b363
golang.org/x/net 6acef71eb69611914f7a30939ea9f6e194c78172
golang.org/x/text a71fd10341b064c10f4a81ceac72bcf70f26ea34
gopkg.in/dancannon/gorethink.v1 7d1af5be49cb5ecc7b177bf387d232050299d6ef
gopkg.in/fatih/pool.v2 cba550ebf9bce999a02e963296d4bc7a486cb715
gopkg.in/mgo.v2 d90005c5262a3463800497ea5a89aed5fe22c886
gopkg.in/yaml.v2 a83829b6f1293c91addabc89d0571c246397bbf4
github.com/go-ole/go-ole be49f7c07711fcb603cff39e1de7c67926dc0ba7
github.com/lxn/win 950a0e81e7678e63d8e6cd32412bdecb325ccd88
github.com/shirou/w32 3c9377fc6748f222729a8270fe2775d149a249ad
golang.org/x/sys a646d33e2ee3172a661fc09bca23bb4889a41bc8
github.com/go-ini/ini 9144852efba7c4daf409943ee90767da62d55438
github.com/jmespath/go-jmespath bd40a432e4c76585ef6b72d3fd96fb9b6dc7b68d
github.com/pmezard/go-difflib/difflib 792786c7400a136282c1664665ae0a8db921c6c2
github.com/stretchr/objx 1a9d0bb9f541897e62256577b352fdbc1fb4fd94
gopkg.in/fsnotify.v1 a8a77c9133d2d6fd8334f3260d06f60e8d80a5fb
gopkg.in/tomb.v1 dd632973f1e7218eb1089048e0798ec9ae7dceb8

View File

@@ -16,7 +16,7 @@ build:
go install -ldflags "-X main.version=$(VERSION)" ./...
build-windows:
go build -o telegraf.exe -ldflags \
GOOS=windows GOARCH=amd64 go build -o telegraf.exe -ldflags \
"-X main.version=$(VERSION)" \
./cmd/telegraf/telegraf.go
@@ -37,6 +37,7 @@ prepare:
# Use the windows godeps file to prepare dependencies
prepare-windows:
go get github.com/sparrc/gdm
gdm restore
gdm restore -f Godeps_windows
# Run all docker containers necessary for unit tests

View File

@@ -156,6 +156,7 @@ Currently implemented sources:
* [exec](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec) (generic executable plugin, support JSON, influx, graphite and nagios)
* [filestat](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/filestat)
* [haproxy](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/haproxy)
* [hddtemp](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/hddtemp)
* [http_response](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/http_response)
* [httpjson](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/httpjson) (generic JSON-emitting http service plugin)
* [influxdb](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/influxdb)
@@ -187,8 +188,9 @@ Currently implemented sources:
* [redis](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/redis)
* [rethinkdb](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/rethinkdb)
* [riak](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/riak)
* [sensors ](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/sensors) (only available if built from source)
* [sensors](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/sensors)
* [snmp](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp)
* [snmp_legacy](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp_legacy)
* [sql server](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/sqlserver) (microsoft)
* [twemproxy](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/twemproxy)
* [varnish](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/varnish)

View File

@@ -16,6 +16,8 @@ type Accumulator interface {
tags map[string]string,
t ...time.Time)
AddError(err error)
Debug() bool
SetDebug(enabled bool)

View File

@@ -4,6 +4,7 @@ import (
"fmt"
"log"
"math"
"sync/atomic"
"time"
"github.com/influxdata/telegraf"
@@ -11,7 +12,7 @@ import (
)
func NewAccumulator(
inputConfig *internal_models.InputConfig,
inputConfig *models.InputConfig,
metrics chan telegraf.Metric,
) *accumulator {
acc := accumulator{}
@@ -30,11 +31,11 @@ type accumulator struct {
// print every point added to the accumulator
trace bool
inputConfig *internal_models.InputConfig
prefix string
inputConfig *models.InputConfig
precision time.Duration
errCount uint64
}
func (ac *accumulator) Add(
@@ -146,10 +147,6 @@ func (ac *accumulator) AddFields(
}
timestamp = timestamp.Round(ac.precision)
if ac.prefix != "" {
measurement = ac.prefix + measurement
}
m, err := telegraf.NewMetric(measurement, tags, result, timestamp)
if err != nil {
log.Printf("Error adding point [%s]: %s\n", measurement, err.Error())
@@ -161,6 +158,17 @@ func (ac *accumulator) AddFields(
ac.metrics <- m
}
// AddError passes a runtime error to the accumulator.
// The error will be tagged with the plugin name and written to the log.
func (ac *accumulator) AddError(err error) {
if err == nil {
return
}
atomic.AddUint64(&ac.errCount, 1)
//TODO suppress/throttle consecutive duplicate errors?
log.Printf("ERROR in input [%s]: %s", ac.inputConfig.Name, err)
}
func (ac *accumulator) Debug() bool {
return ac.debug
}

View File

@@ -1,8 +1,11 @@
package agent
import (
"bytes"
"fmt"
"log"
"math"
"os"
"testing"
"time"
@@ -10,6 +13,7 @@ import (
"github.com/influxdata/telegraf/internal/models"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestAdd(t *testing.T) {
@@ -17,7 +21,7 @@ func TestAdd(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.Add("acctest", float64(101), map[string]string{})
a.Add("acctest", float64(101), map[string]string{"acc": "test"})
@@ -43,7 +47,7 @@ func TestAddNoPrecisionWithInterval(t *testing.T) {
now := time.Date(2006, time.February, 10, 12, 0, 0, 82912748, time.UTC)
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.SetPrecision(0, time.Second)
a.Add("acctest", float64(101), map[string]string{})
@@ -70,7 +74,7 @@ func TestAddNoIntervalWithPrecision(t *testing.T) {
now := time.Date(2006, time.February, 10, 12, 0, 0, 82912748, time.UTC)
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.SetPrecision(time.Second, time.Millisecond)
a.Add("acctest", float64(101), map[string]string{})
@@ -97,7 +101,7 @@ func TestAddDisablePrecision(t *testing.T) {
now := time.Date(2006, time.February, 10, 12, 0, 0, 82912748, time.UTC)
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.SetPrecision(time.Second, time.Millisecond)
a.DisablePrecision()
@@ -125,7 +129,7 @@ func TestDifferentPrecisions(t *testing.T) {
now := time.Date(2006, time.February, 10, 12, 0, 0, 82912748, time.UTC)
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.SetPrecision(0, time.Second)
a.Add("acctest", float64(101), map[string]string{"acc": "test"}, now)
@@ -166,7 +170,7 @@ func TestAddDefaultTags(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.Add("acctest", float64(101), map[string]string{})
a.Add("acctest", float64(101), map[string]string{"acc": "test"})
@@ -192,7 +196,7 @@ func TestAddFields(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
fields := map[string]interface{}{
"usage": float64(99),
@@ -225,7 +229,7 @@ func TestAddInfFields(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
fields := map[string]interface{}{
"usage": inf,
@@ -253,7 +257,7 @@ func TestAddNaNFields(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
fields := map[string]interface{}{
"usage": nan,
@@ -277,7 +281,7 @@ func TestAddUint64Fields(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
fields := map[string]interface{}{
"usage": uint64(99),
@@ -306,7 +310,7 @@ func TestAddUint64Overflow(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
fields := map[string]interface{}{
"usage": uint64(9223372036854775808),
@@ -336,7 +340,7 @@ func TestAddInts(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.Add("acctest", int(101), map[string]string{})
a.Add("acctest", int32(101), map[string]string{"acc": "test"})
@@ -363,7 +367,7 @@ func TestAddFloats(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.Add("acctest", float32(101), map[string]string{"acc": "test"})
a.Add("acctest", float64(101), map[string]string{"acc": "test"}, now)
@@ -385,7 +389,7 @@ func TestAddStrings(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.Add("acctest", "test", map[string]string{"acc": "test"})
a.Add("acctest", "foo", map[string]string{"acc": "test"}, now)
@@ -407,7 +411,7 @@ func TestAddBools(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.Add("acctest", true, map[string]string{"acc": "test"})
a.Add("acctest", false, map[string]string{"acc": "test"}, now)
@@ -429,11 +433,11 @@ func TestAccFilterTags(t *testing.T) {
now := time.Now()
a.metrics = make(chan telegraf.Metric, 10)
defer close(a.metrics)
filter := internal_models.Filter{
filter := models.Filter{
TagExclude: []string{"acc"},
}
assert.NoError(t, filter.CompileFilter())
a.inputConfig = &internal_models.InputConfig{}
a.inputConfig = &models.InputConfig{}
a.inputConfig.Filter = filter
a.Add("acctest", float64(101), map[string]string{})
@@ -454,3 +458,27 @@ func TestAccFilterTags(t *testing.T) {
fmt.Sprintf("acctest value=101 %d", now.UnixNano()),
actual)
}
func TestAccAddError(t *testing.T) {
errBuf := bytes.NewBuffer(nil)
log.SetOutput(errBuf)
defer log.SetOutput(os.Stderr)
a := accumulator{}
a.inputConfig = &models.InputConfig{}
a.inputConfig.Name = "mock_plugin"
a.AddError(fmt.Errorf("foo"))
a.AddError(fmt.Errorf("bar"))
a.AddError(fmt.Errorf("baz"))
errs := bytes.Split(errBuf.Bytes(), []byte{'\n'})
assert.EqualValues(t, 3, a.errCount)
require.Len(t, errs, 4) // 4 because of trailing newline
assert.Contains(t, string(errs[0]), "mock_plugin")
assert.Contains(t, string(errs[0]), "foo")
assert.Contains(t, string(errs[1]), "mock_plugin")
assert.Contains(t, string(errs[1]), "bar")
assert.Contains(t, string(errs[2]), "mock_plugin")
assert.Contains(t, string(errs[2]), "baz")
}

View File

@@ -88,7 +88,7 @@ func (a *Agent) Close() error {
return err
}
func panicRecover(input *internal_models.RunningInput) {
func panicRecover(input *models.RunningInput) {
if err := recover(); err != nil {
trace := make([]byte, 2048)
runtime.Stack(trace, true)
@@ -104,7 +104,7 @@ func panicRecover(input *internal_models.RunningInput) {
// reporting interval.
func (a *Agent) gatherer(
shutdown chan struct{},
input *internal_models.RunningInput,
input *models.RunningInput,
interval time.Duration,
metricC chan telegraf.Metric,
) error {
@@ -152,7 +152,7 @@ func (a *Agent) gatherer(
// over.
func gatherWithTimeout(
shutdown chan struct{},
input *internal_models.RunningInput,
input *models.RunningInput,
acc *accumulator,
timeout time.Duration,
) {
@@ -215,6 +215,9 @@ func (a *Agent) Test() error {
if err := input.Input.Gather(acc); err != nil {
return err
}
if acc.errCount > 0 {
return fmt.Errorf("Errors encountered during processing")
}
// Special instructions for some inputs. cpu, for example, needs to be
// run twice in order to return cpu usage percentages.
@@ -237,7 +240,7 @@ func (a *Agent) flush() {
wg.Add(len(a.Config.Outputs))
for _, o := range a.Config.Outputs {
go func(output *internal_models.RunningOutput) {
go func(output *models.RunningOutput) {
defer wg.Done()
err := output.Write()
if err != nil {
@@ -348,7 +351,7 @@ func (a *Agent) Run(shutdown chan struct{}) error {
if input.Config.Interval != 0 {
interval = input.Config.Interval
}
go func(in *internal_models.RunningInput, interv time.Duration) {
go func(in *models.RunningInput, interv time.Duration) {
defer wg.Done()
if err := a.gatherer(shutdown, in, interv, metricC); err != nil {
log.Printf(err.Error())

View File

@@ -6,6 +6,7 @@ import (
"log"
"os"
"os/signal"
"runtime"
"strings"
"syscall"
@@ -15,6 +16,7 @@ import (
_ "github.com/influxdata/telegraf/plugins/inputs/all"
"github.com/influxdata/telegraf/plugins/outputs"
_ "github.com/influxdata/telegraf/plugins/outputs/all"
"github.com/kardianos/service"
)
var fDebug = flag.Bool("debug", false,
@@ -39,12 +41,8 @@ var fOutputList = flag.Bool("output-list", false,
"print available output plugins.")
var fUsage = flag.String("usage", "",
"print usage for a plugin, ie, 'telegraf -usage mysql'")
var fInputFiltersLegacy = flag.String("filter", "",
"filter the inputs to enable, separator is :")
var fOutputFiltersLegacy = flag.String("outputfilter", "",
"filter the outputs to enable, separator is :")
var fConfigDirectoryLegacy = flag.String("configdirectory", "",
"directory containing additional *.conf files")
var fService = flag.String("service", "",
"operate on the service")
// Telegraf version, populated linker.
// ie, -ldflags "-X main.version=`git describe --always --tags`"
@@ -74,6 +72,7 @@ The flags are:
-debug print metrics as they're generated to stdout
-quiet run in quiet mode
-version print the version to stdout
-service Control the service, ie, 'telegraf -service install (windows only)'
In addition to the -config flag, telegraf will also load the config file from
an environment variable or default location. Precedence is:
@@ -100,7 +99,22 @@ Examples:
telegraf -config telegraf.conf -input-filter cpu:mem -output-filter influxdb
`
func main() {
var logger service.Logger
var stop chan struct{}
var srvc service.Service
var svcConfig *service.Config
type program struct{}
func reloadLoop(stop chan struct{}, s service.Service) {
defer func() {
if service.Interactive() {
os.Exit(0)
}
return
}()
reload := make(chan bool, 1)
reload <- true
for <-reload {
@@ -110,24 +124,11 @@ func main() {
args := flag.Args()
var inputFilters []string
if *fInputFiltersLegacy != "" {
fmt.Printf("WARNING '--filter' flag is deprecated, please use" +
" '--input-filter'")
inputFilter := strings.TrimSpace(*fInputFiltersLegacy)
inputFilters = strings.Split(":"+inputFilter+":", ":")
}
if *fInputFilters != "" {
inputFilter := strings.TrimSpace(*fInputFilters)
inputFilters = strings.Split(":"+inputFilter+":", ":")
}
var outputFilters []string
if *fOutputFiltersLegacy != "" {
fmt.Printf("WARNING '--outputfilter' flag is deprecated, please use" +
" '--output-filter'")
outputFilter := strings.TrimSpace(*fOutputFiltersLegacy)
outputFilters = strings.Split(":"+outputFilter+":", ":")
}
if *fOutputFilters != "" {
outputFilter := strings.TrimSpace(*fOutputFilters)
outputFilters = strings.Split(":"+outputFilter+":", ":")
@@ -145,40 +146,43 @@ func main() {
}
}
if *fOutputList {
// switch for flags which just do something and exit immediately
switch {
case *fOutputList:
fmt.Println("Available Output Plugins:")
for k, _ := range outputs.Outputs {
fmt.Printf(" %s\n", k)
}
return
}
if *fInputList {
case *fInputList:
fmt.Println("Available Input Plugins:")
for k, _ := range inputs.Inputs {
fmt.Printf(" %s\n", k)
}
return
}
if *fVersion {
case *fVersion:
v := fmt.Sprintf("Telegraf - version %s", version)
fmt.Println(v)
return
}
if *fSampleConfig {
case *fSampleConfig:
config.PrintSampleConfig(inputFilters, outputFilters)
return
}
if *fUsage != "" {
case *fUsage != "":
if err := config.PrintInputConfig(*fUsage); err != nil {
if err2 := config.PrintOutputConfig(*fUsage); err2 != nil {
log.Fatalf("%s and %s", err, err2)
}
}
return
case *fService != "" && runtime.GOOS == "windows":
if *fConfig != "" {
(*svcConfig).Arguments = []string{"-config", *fConfig}
}
err := service.Control(s, *fService)
if err != nil {
log.Fatal(err)
}
return
}
// If no other options are specified, load the config file and run.
@@ -191,15 +195,6 @@ func main() {
os.Exit(1)
}
if *fConfigDirectoryLegacy != "" {
fmt.Printf("WARNING '--configdirectory' flag is deprecated, please use" +
" '--config-directory'")
err = c.LoadDirectory(*fConfigDirectoryLegacy)
if err != nil {
log.Fatal(err)
}
}
if *fConfigDirectory != "" {
err = c.LoadDirectory(*fConfigDirectory)
if err != nil {
@@ -243,14 +238,18 @@ func main() {
signals := make(chan os.Signal)
signal.Notify(signals, os.Interrupt, syscall.SIGHUP)
go func() {
sig := <-signals
if sig == os.Interrupt {
close(shutdown)
}
if sig == syscall.SIGHUP {
log.Printf("Reloading Telegraf config\n")
<-reload
reload <- true
select {
case sig := <-signals:
if sig == os.Interrupt {
close(shutdown)
}
if sig == syscall.SIGHUP {
log.Printf("Reloading Telegraf config\n")
<-reload
reload <- true
close(shutdown)
}
case <-stop:
close(shutdown)
}
}()
@@ -279,3 +278,46 @@ func usageExit(rc int) {
fmt.Println(usage)
os.Exit(rc)
}
func (p *program) Start(s service.Service) error {
srvc = s
go p.run()
return nil
}
func (p *program) run() {
stop = make(chan struct{})
reloadLoop(stop, srvc)
}
func (p *program) Stop(s service.Service) error {
close(stop)
return nil
}
func main() {
if runtime.GOOS == "windows" {
svcConfig = &service.Config{
Name: "telegraf",
DisplayName: "Telegraf Data Collector Service",
Description: "Collects data using a series of plugins and publishes it to" +
"another series of plugins.",
Arguments: []string{"-config", "C:\\Program Files\\Telegraf\\telegraf.conf"},
}
prg := &program{}
s, err := service.New(prg, svcConfig)
if err != nil {
log.Fatal(err)
}
logger, err = s.Logger(nil)
if err != nil {
log.Fatal(err)
}
err = s.Run()
if err != nil {
logger.Error(err)
}
} else {
stop = make(chan struct{})
reloadLoop(stop, nil)
}
}

View File

@@ -16,6 +16,7 @@
- github.com/hashicorp/go-msgpack [BSD LICENSE](https://github.com/hashicorp/go-msgpack/blob/master/LICENSE)
- github.com/hashicorp/raft [MPL LICENSE](https://github.com/hashicorp/raft/blob/master/LICENSE)
- github.com/hashicorp/raft-boltdb [MPL LICENSE](https://github.com/hashicorp/raft-boltdb/blob/master/LICENSE)
- github.com/kardianos/service [ZLIB LICENSE](https://github.com/kardianos/service/blob/master/LICENSE) (License not named but matches word for word with ZLib)
- github.com/lib/pq [MIT LICENSE](https://github.com/lib/pq/blob/master/LICENSE.md)
- github.com/matttproud/golang_protobuf_extensions [APACHE LICENSE](https://github.com/matttproud/golang_protobuf_extensions/blob/master/LICENSE)
- github.com/naoina/go-stringutil [MIT LICENSE](https://github.com/naoina/go-stringutil/blob/master/LICENSE)

View File

@@ -1,36 +1,40 @@
# Running Telegraf as a Windows Service
If you have tried to install Go binaries as Windows Services with the **sc.exe**
tool you may have seen that the service errors and stops running after a while.
Telegraf natively supports running as a Windows Service. Outlined below is are
the general steps to set it up.
**NSSM** (the Non-Sucking Service Manager) is a tool that helps you in a
[number of scenarios](http://nssm.cc/scenarios) including running Go binaries
that were not specifically designed to run only in Windows platforms.
1. Obtain the telegraf windows distribution
2. Create the directory `C:\Program Files\Telegraf` (if you install in a different
location simply specify the `-config` parameter with the desired location)
3. Place the telegraf.exe and the config file into `C:\Program Files\Telegraf`
4. To install the service into the Windows Service Manager, run (as an
administrator):
## NSSM Installation via Chocolatey
```
> C:\Program Files\Telegraf\telegraf.exe --service install
```
You can install [Chocolatey](https://chocolatey.org/) and [NSSM](http://nssm.cc/)
with these commands
5. Edit the configuration file to meet your needs
6. To check that it works, run:
```powershell
iex ((new-object net.webclient).DownloadString('https://chocolatey.org/install.ps1'))
choco install -y nssm
```
```
> C:\Program Files\Telegraf\telegraf.exe --config C:\Program Files\Telegraf\telegraf.conf --test
```
## Installing Telegraf as a Windows Service with NSSM
7. To start collecting data, run:
You can download the latest Telegraf Windows binaries (still Experimental at
the moment) from [the Telegraf Github repo](https://github.com/influxdata/telegraf).
```
> net start telegraf
```
Then you can create a C:\telegraf folder, unzip the binary there and modify the
**telegraf.conf** sample to allocate the metrics you want to send to **InfluxDB**.
## Other supported operations
Once you have NSSM installed in your system, the process is quite straightforward.
You only need to type this command in your Windows shell
Telegraf can manage its own service through the --service flag:
```powershell
nssm install Telegraf c:\telegraf\telegraf.exe -config c:\telegraf\telegraf.config
```
| Command | Effect |
|------------------------------------|-------------------------------|
| `telegraf.exe --service install` | Install telegraf as a service |
| `telegraf.exe --service uninstall` | Remove the telegraf service |
| `telegraf.exe --service start` | Start the telegraf service |
| `telegraf.exe --service stop` | Stop the telegraf service |
And now your service will be installed in Windows and you will be able to start and
stop it gracefully

View File

@@ -55,7 +55,7 @@
## By default, precision will be set to the same timestamp order as the
## collection interval, with the maximum being 1s.
## Precision will NOT be used for service inputs, such as logparser and statsd.
## Valid values are "Nns", "Nus" (or "Nµs"), "Nms", "Ns".
## Valid values are "ns", "us" (or "µs"), "ms", "s".
precision = ""
## Run telegraf in debug mode
debug = false
@@ -83,7 +83,7 @@
## Retention policy to write to. Empty string writes to the default rp.
retention_policy = ""
## Write consistency (clusters only), can be: "any", "one", "quorom", "all"
## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
write_consistency = "any"
## Write timeout (for the InfluxDB client), formatted as a string.
@@ -197,7 +197,7 @@
# # Configuration for Graphite server to send metrics to
# [[outputs.graphite]]
# ## TCP endpoint for your graphite instance.
# ## If multiple endpoints are configured, the output will be load balanced.
# ## If multiple endpoints are configured, output will be load balanced.
# ## Only one of the endpoints will be written to with each iteration.
# servers = ["localhost:2003"]
# ## Prefix metrics name
@@ -321,14 +321,13 @@
# api_token = "my-secret-token" # required.
# ## Debug
# # debug = false
# ## Tag Field to populate source attribute (optional)
# ## This is typically the _hostname_ from which the metric was obtained.
# source_tag = "host"
# ## Connection timeout.
# # timeout = "5s"
# ## Output Name Template (same as graphite buckets)
# ## Output source Template (same as graphite buckets)
# ## see https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md#graphite
# template = "host.tags.measurement.field"
# ## This template is used in librato's source (not metric's name)
# template = "host"
#
# # Configuration for MQTT server to send metrics to
@@ -436,8 +435,8 @@
## disk partitions.
## Setting devices will restrict the stats to the specified devices.
# devices = ["sda", "sdb"]
## Uncomment the following line if you do not need disk serial numbers.
# skip_serial_number = true
## Uncomment the following line if you need disk serial numbers.
# skip_serial_number = false
# Get kernel statistics from /proc/stat
@@ -465,7 +464,7 @@
# no configuration
# # Read stats from an aerospike server
# # Read stats from aerospike server(s)
# [[inputs.aerospike]]
# ## Aerospike servers to connect to (with port)
# ## This plugin will query all namespaces the aerospike
@@ -666,6 +665,13 @@
# container_names = []
# ## Timeout for docker list, info, and stats commands
# timeout = "5s"
#
# ## Whether to report for each container per-device blkio (8:0, 8:1...) and
# ## network (eth0, eth1, ...) stats or not
# perdevice = true
# ## Whether to report for each container total blkio and network stats or not
# total = false
#
# # Read statistics from one or many dovecot servers
@@ -782,9 +788,11 @@
# [[inputs.haproxy]]
# ## An array of address to gather stats about. Specify an ip on hostname
# ## with optional port. ie localhost, 10.10.3.33:1936, etc.
#
# ## If no servers are specified, then default to 127.0.0.1:1936
# servers = ["http://myhaproxy.com:1936", "http://anotherhaproxy.com:1936"]
# ## Make sure you specify the complete path to the stats endpoint
# ## ie 10.10.3.33:1936/haproxy?stats
# #
# ## If no servers are specified, then default to 127.0.0.1:1936/haproxy?stats
# servers = ["http://myhaproxy.com:1936/haproxy?stats"]
# ## Or you can also use local socket
# ## servers = ["socket:/run/haproxy/admin.sock"]
@@ -970,21 +978,35 @@
# # Telegraf plugin for gathering metrics from N Mesos masters
# [[inputs.mesos]]
# # Timeout, in ms.
# ## Timeout, in ms.
# timeout = 100
# # A list of Mesos masters, default value is localhost:5050.
# ## A list of Mesos masters.
# masters = ["localhost:5050"]
# # Metrics groups to be collected, by default, all enabled.
# ## Master metrics groups to be collected, by default, all enabled.
# master_collections = [
# "resources",
# "master",
# "system",
# "slaves",
# "agents",
# "frameworks",
# "tasks",
# "messages",
# "evqueue",
# "registrar",
# ]
# ## A list of Mesos slaves, default is []
# # slaves = []
# ## Slave metrics groups to be collected, by default, all enabled.
# # slave_collections = [
# # "resources",
# # "agent",
# # "system",
# # "executors",
# # "tasks",
# # "messages",
# # ]
# ## Include mesos tasks statistics, default is false
# # slave_tasks = true
# # Read metrics from one or many MongoDB servers
@@ -995,6 +1017,7 @@
# ## mongodb://10.10.3.33:18832,
# ## 10.0.0.1:10000, etc.
# servers = ["127.0.0.1:27017"]
# gather_perdb_stats = false
# # Read metrics from one or many mysql servers
@@ -1101,9 +1124,9 @@
# ## file paths for proc files. If empty default paths will be used:
# ## /proc/net/netstat, /proc/net/snmp, /proc/net/snmp6
# ## These can also be overridden with env variables, see README.
# proc_net_netstat = ""
# proc_net_snmp = ""
# proc_net_snmp6 = ""
# proc_net_netstat = "/proc/net/netstat"
# proc_net_snmp = "/proc/net/snmp"
# proc_net_snmp6 = "/proc/net/snmp6"
# ## dump metrics with 0 values too
# dump_zeros = true
@@ -1127,6 +1150,23 @@
# command = "passenger-status -v --show=xml"
# # Read metrics from one or many pgbouncer servers
# [[inputs.pgbouncer]]
# ## specify address via a url matching:
# ## postgres://[pqgotest[:password]]@localhost:port[/dbname]\
# ## ?sslmode=[disable|verify-ca|verify-full]
# ## or a simple string:
# ## host=localhost user=pqotest port=6432 password=... sslmode=... dbname=pgbouncer
# ##
# ## All connection parameters are optional, except for dbname,
# ## you need to set it always as pgbouncer.
# address = "host=localhost user=postgres port=6432 sslmode=disable dbname=pgbouncer"
#
# ## A list of databases to pull metrics about. If not specified, metrics for all
# ## databases are gathered.
# # databases = ["app_production", "testing"]
# # Read metrics of phpfpm, via HTTP status page or socket
# [[inputs.phpfpm]]
# ## An array of addresses to gather stats about. Specify an ip or hostname
@@ -1305,6 +1345,13 @@
# # username = "guest"
# # password = "guest"
#
# ## Optional SSL Config
# # ssl_ca = "/etc/telegraf/ca.pem"
# # ssl_cert = "/etc/telegraf/cert.pem"
# # ssl_key = "/etc/telegraf/key.pem"
# ## Use SSL but skip chain & host verification
# # insecure_skip_verify = false
#
# ## A list of nodes to pull metrics about. If not specified, metrics for
# ## all nodes are gathered.
# # nodes = ["rabbit@node1", "rabbit@node2"]
@@ -1323,6 +1370,7 @@
# ## e.g.
# ## tcp://localhost:6379
# ## tcp://:password@192.168.99.100
# ## unix:///var/run/redis.sock
# ##
# ## If no servers are specified, then localhost is used as the host.
# ## If no port is specified, 6379 is used
@@ -1345,8 +1393,8 @@
# servers = ["http://localhost:8098"]
# # Reads oids value from one or many snmp agents
# [[inputs.snmp]]
# # DEPRECATED! PLEASE USE inputs.snmp INSTEAD.
# [[inputs.snmp_legacy]]
# ## Use 'oids.txt' file to translate oids to names
# ## To generate 'oids.txt' you need to run:
# ## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
@@ -1545,7 +1593,7 @@
# ## /var/log/**.log -> recursively find all .log files in /var/log
# ## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
# ## /var/log/apache.log -> only tail the apache log file
# files = ["/var/log/influxdb/influxdb.log"]
# files = ["/var/log/apache/access.log"]
# ## Read file from beginning.
# from_beginning = false
#
@@ -1558,7 +1606,9 @@
# ## Other common built-in patterns are:
# ## %{COMMON_LOG_FORMAT} (plain apache & nginx access logs)
# ## %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
# patterns = ["%{INFLUXDB_HTTPD_LOG}"]
# patterns = ["%{COMBINED_LOG_FORMAT}"]
# ## Name of the outputted measurement name.
# measurement = "apache_access_log"
# ## Full path(s) to custom pattern files.
# custom_pattern_files = []
# ## Custom patterns can also be defined here. Put one pattern per line.
@@ -1622,6 +1672,21 @@
# data_format = "influx"
# # Read NSQ topic for metrics.
# [[inputs.nsq_consumer]]
# ## An string representing the NSQD TCP Endpoint
# server = "localhost:4150"
# topic = "telegraf"
# channel = "consumer"
# max_in_flight = 100
#
# ## Data format to consume.
# ## Each data format has it's own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
# data_format = "influx"
# # Statsd Server
# [[inputs.statsd]]
# ## Address and port to host UDP listener on
@@ -1725,6 +1790,9 @@
# [inputs.webhooks.github]
# path = "/github"
#
# [inputs.webhooks.mandrill]
# path = "/mandrill"
#
# [inputs.webhooks.rollbar]
# path = "/rollbar"

View File

@@ -9,6 +9,7 @@ import (
"os"
"path/filepath"
"regexp"
"runtime"
"sort"
"strings"
"time"
@@ -47,8 +48,8 @@ type Config struct {
OutputFilters []string
Agent *AgentConfig
Inputs []*internal_models.RunningInput
Outputs []*internal_models.RunningOutput
Inputs []*models.RunningInput
Outputs []*models.RunningOutput
}
func NewConfig() *Config {
@@ -61,8 +62,8 @@ func NewConfig() *Config {
},
Tags: make(map[string]string),
Inputs: make([]*internal_models.RunningInput, 0),
Outputs: make([]*internal_models.RunningOutput, 0),
Inputs: make([]*models.RunningInput, 0),
Outputs: make([]*models.RunningOutput, 0),
InputFilters: make([]string, 0),
OutputFilters: make([]string, 0),
}
@@ -139,7 +140,7 @@ func (c *Config) InputNames() []string {
return name
}
// Outputs returns a list of strings of the configured inputs.
// Outputs returns a list of strings of the configured outputs.
func (c *Config) OutputNames() []string {
var name []string
for _, output := range c.Outputs {
@@ -219,7 +220,7 @@ var header = `# Telegraf Configuration
## By default, precision will be set to the same timestamp order as the
## collection interval, with the maximum being 1s.
## Precision will NOT be used for service inputs, such as logparser and statsd.
## Valid values are "Nns", "Nus" (or "Nµs"), "Nms", "Ns".
## Valid values are "ns", "us" (or "µs"), "ms", "s".
precision = ""
## Run telegraf in debug mode
debug = false
@@ -432,6 +433,9 @@ func getDefaultConfigPath() (string, error) {
envfile := os.Getenv("TELEGRAF_CONFIG_PATH")
homefile := os.ExpandEnv("${HOME}/.telegraf/telegraf.conf")
etcfile := "/etc/telegraf/telegraf.conf"
if runtime.GOOS == "windows" {
etcfile = `C:\Program Files\Telegraf\telegraf.conf`
}
for _, path := range []string{envfile, homefile, etcfile} {
if _, err := os.Stat(path); err == nil {
log.Printf("Using config file: %s", path)
@@ -598,7 +602,7 @@ func (c *Config) addOutput(name string, table *ast.Table) error {
return err
}
ro := internal_models.NewRunningOutput(name, output, outputConfig,
ro := models.NewRunningOutput(name, output, outputConfig,
c.Agent.MetricBatchSize, c.Agent.MetricBufferLimit)
c.Outputs = append(c.Outputs, ro)
return nil
@@ -639,7 +643,7 @@ func (c *Config) addInput(name string, table *ast.Table) error {
return err
}
rp := &internal_models.RunningInput{
rp := &models.RunningInput{
Name: name,
Input: input,
Config: pluginConfig,
@@ -650,10 +654,10 @@ func (c *Config) addInput(name string, table *ast.Table) error {
// buildFilter builds a Filter
// (tagpass/tagdrop/namepass/namedrop/fieldpass/fielddrop) to
// be inserted into the internal_models.OutputConfig/internal_models.InputConfig
// be inserted into the models.OutputConfig/models.InputConfig
// to be used for glob filtering on tags and measurements
func buildFilter(tbl *ast.Table) (internal_models.Filter, error) {
f := internal_models.Filter{}
func buildFilter(tbl *ast.Table) (models.Filter, error) {
f := models.Filter{}
if node, ok := tbl.Fields["namepass"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
@@ -717,7 +721,7 @@ func buildFilter(tbl *ast.Table) (internal_models.Filter, error) {
if subtbl, ok := node.(*ast.Table); ok {
for name, val := range subtbl.Fields {
if kv, ok := val.(*ast.KeyValue); ok {
tagfilter := &internal_models.TagFilter{Name: name}
tagfilter := &models.TagFilter{Name: name}
if ary, ok := kv.Value.(*ast.Array); ok {
for _, elem := range ary.Value {
if str, ok := elem.(*ast.String); ok {
@@ -736,7 +740,7 @@ func buildFilter(tbl *ast.Table) (internal_models.Filter, error) {
if subtbl, ok := node.(*ast.Table); ok {
for name, val := range subtbl.Fields {
if kv, ok := val.(*ast.KeyValue); ok {
tagfilter := &internal_models.TagFilter{Name: name}
tagfilter := &models.TagFilter{Name: name}
if ary, ok := kv.Value.(*ast.Array); ok {
for _, elem := range ary.Value {
if str, ok := elem.(*ast.String); ok {
@@ -793,9 +797,9 @@ func buildFilter(tbl *ast.Table) (internal_models.Filter, error) {
// buildInput parses input specific items from the ast.Table,
// builds the filter and returns a
// internal_models.InputConfig to be inserted into internal_models.RunningInput
func buildInput(name string, tbl *ast.Table) (*internal_models.InputConfig, error) {
cp := &internal_models.InputConfig{Name: name}
// models.InputConfig to be inserted into models.RunningInput
func buildInput(name string, tbl *ast.Table) (*models.InputConfig, error) {
cp := &models.InputConfig{Name: name}
if node, ok := tbl.Fields["interval"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
if str, ok := kv.Value.(*ast.String); ok {
@@ -969,14 +973,14 @@ func buildSerializer(name string, tbl *ast.Table) (serializers.Serializer, error
// buildOutput parses output specific items from the ast.Table,
// builds the filter and returns an
// internal_models.OutputConfig to be inserted into internal_models.RunningInput
// models.OutputConfig to be inserted into models.RunningInput
// Note: error exists in the return for future calls that might require error
func buildOutput(name string, tbl *ast.Table) (*internal_models.OutputConfig, error) {
func buildOutput(name string, tbl *ast.Table) (*models.OutputConfig, error) {
filter, err := buildFilter(tbl)
if err != nil {
return nil, err
}
oc := &internal_models.OutputConfig{
oc := &models.OutputConfig{
Name: name,
Filter: filter,
}

View File

@@ -26,19 +26,19 @@ func TestConfig_LoadSingleInputWithEnvVars(t *testing.T) {
memcached := inputs.Inputs["memcached"]().(*memcached.Memcached)
memcached.Servers = []string{"192.168.1.1"}
filter := internal_models.Filter{
filter := models.Filter{
NameDrop: []string{"metricname2"},
NamePass: []string{"metricname1"},
FieldDrop: []string{"other", "stuff"},
FieldPass: []string{"some", "strings"},
TagDrop: []internal_models.TagFilter{
internal_models.TagFilter{
TagDrop: []models.TagFilter{
models.TagFilter{
Name: "badtag",
Filter: []string{"othertag"},
},
},
TagPass: []internal_models.TagFilter{
internal_models.TagFilter{
TagPass: []models.TagFilter{
models.TagFilter{
Name: "goodtag",
Filter: []string{"mytag"},
},
@@ -46,7 +46,7 @@ func TestConfig_LoadSingleInputWithEnvVars(t *testing.T) {
IsActive: true,
}
assert.NoError(t, filter.CompileFilter())
mConfig := &internal_models.InputConfig{
mConfig := &models.InputConfig{
Name: "memcached",
Filter: filter,
Interval: 10 * time.Second,
@@ -66,19 +66,19 @@ func TestConfig_LoadSingleInput(t *testing.T) {
memcached := inputs.Inputs["memcached"]().(*memcached.Memcached)
memcached.Servers = []string{"localhost"}
filter := internal_models.Filter{
filter := models.Filter{
NameDrop: []string{"metricname2"},
NamePass: []string{"metricname1"},
FieldDrop: []string{"other", "stuff"},
FieldPass: []string{"some", "strings"},
TagDrop: []internal_models.TagFilter{
internal_models.TagFilter{
TagDrop: []models.TagFilter{
models.TagFilter{
Name: "badtag",
Filter: []string{"othertag"},
},
},
TagPass: []internal_models.TagFilter{
internal_models.TagFilter{
TagPass: []models.TagFilter{
models.TagFilter{
Name: "goodtag",
Filter: []string{"mytag"},
},
@@ -86,7 +86,7 @@ func TestConfig_LoadSingleInput(t *testing.T) {
IsActive: true,
}
assert.NoError(t, filter.CompileFilter())
mConfig := &internal_models.InputConfig{
mConfig := &models.InputConfig{
Name: "memcached",
Filter: filter,
Interval: 5 * time.Second,
@@ -113,19 +113,19 @@ func TestConfig_LoadDirectory(t *testing.T) {
memcached := inputs.Inputs["memcached"]().(*memcached.Memcached)
memcached.Servers = []string{"localhost"}
filter := internal_models.Filter{
filter := models.Filter{
NameDrop: []string{"metricname2"},
NamePass: []string{"metricname1"},
FieldDrop: []string{"other", "stuff"},
FieldPass: []string{"some", "strings"},
TagDrop: []internal_models.TagFilter{
internal_models.TagFilter{
TagDrop: []models.TagFilter{
models.TagFilter{
Name: "badtag",
Filter: []string{"othertag"},
},
},
TagPass: []internal_models.TagFilter{
internal_models.TagFilter{
TagPass: []models.TagFilter{
models.TagFilter{
Name: "goodtag",
Filter: []string{"mytag"},
},
@@ -133,7 +133,7 @@ func TestConfig_LoadDirectory(t *testing.T) {
IsActive: true,
}
assert.NoError(t, filter.CompileFilter())
mConfig := &internal_models.InputConfig{
mConfig := &models.InputConfig{
Name: "memcached",
Filter: filter,
Interval: 5 * time.Second,
@@ -150,7 +150,7 @@ func TestConfig_LoadDirectory(t *testing.T) {
assert.NoError(t, err)
ex.SetParser(p)
ex.Command = "/usr/bin/myothercollector --foo=bar"
eConfig := &internal_models.InputConfig{
eConfig := &models.InputConfig{
Name: "exec",
MeasurementSuffix: "_myothercollector",
}
@@ -169,7 +169,7 @@ func TestConfig_LoadDirectory(t *testing.T) {
pstat := inputs.Inputs["procstat"]().(*procstat.Procstat)
pstat.PidFile = "/var/run/grafana-server.pid"
pConfig := &internal_models.InputConfig{Name: "procstat"}
pConfig := &models.InputConfig{Name: "procstat"}
pConfig.Tags = make(map[string]string)
assert.Equal(t, pstat, c.Inputs[3].Input,

View File

@@ -1,4 +1,4 @@
package internal_models
package models
import (
"fmt"

View File

@@ -1,4 +1,4 @@
package internal_models
package models
import (
"testing"

View File

@@ -1,4 +1,4 @@
package internal_models
package models
import (
"time"

View File

@@ -1,4 +1,4 @@
package internal_models
package models
import (
"log"

View File

@@ -1,4 +1,4 @@
package internal_models
package models
import (
"fmt"

View File

@@ -27,6 +27,14 @@ The example plugin gathers metrics about example things
- tag2
- measurement2 has the following tags:
- tag3
### Sample Queries:
These are some useful queries (to generate dashboards or other) to run against data from this plugin:
```
SELECT max(field1), mean(field1), min(field1) FROM measurement1 WHERE tag1=bar AND time > now() - 1h GROUP BY tag
```
### Example Output:

File diff suppressed because one or more lines are too long

View File

@@ -72,18 +72,17 @@ func (a *Aerospike) gatherServer(hostport string, acc telegraf.Accumulator) erro
nodes := c.GetNodes()
for _, n := range nodes {
tags := map[string]string{
"node_name": n.GetName(),
"aerospike_host": hostport,
}
fields := make(map[string]interface{})
fields := map[string]interface{}{
"node_name": n.GetName(),
}
stats, err := as.RequestNodeStats(n)
if err != nil {
return err
}
for k, v := range stats {
if iv, err := strconv.ParseInt(v, 10, 64); err == nil {
fields[strings.Replace(k, "-", "_", -1)] = iv
}
fields[strings.Replace(k, "-", "_", -1)] = parseValue(v)
}
acc.AddFields("aerospike_node", fields, tags, time.Now())
@@ -94,9 +93,13 @@ func (a *Aerospike) gatherServer(hostport string, acc telegraf.Accumulator) erro
namespaces := strings.Split(info["namespaces"], ";")
for _, namespace := range namespaces {
nTags := copyTags(tags)
nTags := map[string]string{
"aerospike_host": hostport,
}
nTags["namespace"] = namespace
nFields := make(map[string]interface{})
nFields := map[string]interface{}{
"node_name": n.GetName(),
}
info, err := as.RequestNodeInfo(n, "namespace/"+namespace)
if err != nil {
continue
@@ -107,9 +110,7 @@ func (a *Aerospike) gatherServer(hostport string, acc telegraf.Accumulator) erro
if len(parts) < 2 {
continue
}
if iv, err := strconv.ParseInt(parts[1], 10, 64); err == nil {
nFields[strings.Replace(parts[0], "-", "_", -1)] = iv
}
nFields[strings.Replace(parts[0], "-", "_", -1)] = parseValue(parts[1])
}
acc.AddFields("aerospike_namespace", nFields, nTags, time.Now())
}
@@ -117,6 +118,16 @@ func (a *Aerospike) gatherServer(hostport string, acc telegraf.Accumulator) erro
return nil
}
func parseValue(v string) interface{} {
if parsed, err := strconv.ParseInt(v, 10, 64); err == nil {
return parsed
} else if parsed, err := strconv.ParseBool(v); err == nil {
return parsed
} else {
return v
}
}
func copyTags(m map[string]string) map[string]string {
out := make(map[string]string)
for k, v := range m {

View File

@@ -22,6 +22,7 @@ import (
_ "github.com/influxdata/telegraf/plugins/inputs/filestat"
_ "github.com/influxdata/telegraf/plugins/inputs/graylog"
_ "github.com/influxdata/telegraf/plugins/inputs/haproxy"
_ "github.com/influxdata/telegraf/plugins/inputs/hddtemp"
_ "github.com/influxdata/telegraf/plugins/inputs/http_response"
_ "github.com/influxdata/telegraf/plugins/inputs/httpjson"
_ "github.com/influxdata/telegraf/plugins/inputs/influxdb"
@@ -60,6 +61,7 @@ import (
_ "github.com/influxdata/telegraf/plugins/inputs/riak"
_ "github.com/influxdata/telegraf/plugins/inputs/sensors"
_ "github.com/influxdata/telegraf/plugins/inputs/snmp"
_ "github.com/influxdata/telegraf/plugins/inputs/snmp_legacy"
_ "github.com/influxdata/telegraf/plugins/inputs/sqlserver"
_ "github.com/influxdata/telegraf/plugins/inputs/statsd"
_ "github.com/influxdata/telegraf/plugins/inputs/sysstat"

View File

@@ -1,18 +1,18 @@
# Ceph Storage Input Plugin
Collects performance metrics from the MON and OSD nodes in a Ceph storage cluster.
Collects performance metrics from the MON and OSD nodes in a Ceph storage cluster.
The plugin works by scanning the configured SocketDir for OSD and MON socket files. When it finds
a MON socket, it runs **ceph --admin-daemon $file perfcounters_dump**. For OSDs it runs **ceph --admin-daemon $file perf dump**
a MON socket, it runs **ceph --admin-daemon $file perfcounters_dump**. For OSDs it runs **ceph --admin-daemon $file perf dump**
The resulting JSON is parsed and grouped into collections, based on top-level key. Top-level keys are
used as collection tags, and all sub-keys are flattened. For example:
```
{
"paxos": {
{
"paxos": {
"refresh": 9363435,
"refresh_latency": {
"refresh_latency": {
"avgcount": 9363435,
"sum": 5378.794002000
}
@@ -50,7 +50,7 @@ Would be parsed into the following metrics, all of which would be tagged with co
### Measurements & Fields:
All fields are collected under the **ceph** measurement and stored as float64s. For a full list of fields, see the sample perf dumps in ceph_test.go.
All fields are collected under the **ceph** measurement and stored as float64s. For a full list of fields, see the sample perf dumps in ceph_test.go.
### Tags:
@@ -95,7 +95,7 @@ All measurements will have the following tags:
- throttle-objecter_ops
- throttle-osd_client_bytes
- throttle-osd_client_messages
### Example Output:

View File

@@ -3,12 +3,14 @@ package dns_query
import (
"errors"
"fmt"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/plugins/inputs"
"github.com/miekg/dns"
"net"
"strconv"
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
type DnsQuery struct {
@@ -55,12 +57,12 @@ func (d *DnsQuery) Description() string {
}
func (d *DnsQuery) Gather(acc telegraf.Accumulator) error {
d.setDefaultValues()
errChan := errchan.New(len(d.Domains) * len(d.Servers))
for _, domain := range d.Domains {
for _, server := range d.Servers {
dnsQueryTime, err := d.getDnsQueryTime(domain, server)
if err != nil {
return err
}
errChan.C <- err
tags := map[string]string{
"server": server,
"domain": domain,
@@ -72,7 +74,7 @@ func (d *DnsQuery) Gather(acc telegraf.Accumulator) error {
}
}
return nil
return errChan.Error()
}
func (d *DnsQuery) setDefaultValues() {

View File

@@ -25,6 +25,8 @@ type Docker struct {
Endpoint string
ContainerNames []string
Timeout internal.Duration
PerDevice bool `toml:"perdevice"`
Total bool `toml:"total"`
client DockerClient
}
@@ -58,6 +60,13 @@ var sampleConfig = `
container_names = []
## Timeout for docker list, info, and stats commands
timeout = "5s"
## Whether to report for each container per-device blkio (8:0, 8:1...) and
## network (eth0, eth1, ...) stats or not
perdevice = true
## Whether to report for each container total blkio and network stats or not
total = false
`
// Description returns input description
@@ -207,9 +216,18 @@ func (d *Docker) gatherContainer(
cname = strings.TrimPrefix(container.Names[0], "/")
}
// the image name sometimes has a version part.
// ie, rabbitmq:3-management
imageParts := strings.Split(container.Image, ":")
imageName := imageParts[0]
imageVersion := "unknown"
if len(imageParts) > 1 {
imageVersion = imageParts[1]
}
tags := map[string]string{
"container_name": cname,
"container_image": container.Image,
"container_name": cname,
"container_image": imageName,
"container_version": imageVersion,
}
if len(d.ContainerNames) > 0 {
if !sliceContains(cname, d.ContainerNames) {
@@ -237,7 +255,7 @@ func (d *Docker) gatherContainer(
tags[k] = label
}
gatherContainerStats(v, acc, tags, container.ID)
gatherContainerStats(v, acc, tags, container.ID, d.PerDevice, d.Total)
return nil
}
@@ -247,6 +265,8 @@ func gatherContainerStats(
acc telegraf.Accumulator,
tags map[string]string,
id string,
perDevice bool,
total bool,
) {
now := stat.Read
@@ -314,6 +334,7 @@ func gatherContainerStats(
acc.AddFields("docker_container_cpu", fields, percputags, now)
}
totalNetworkStatMap := make(map[string]interface{})
for network, netstats := range stat.Networks {
netfields := map[string]interface{}{
"rx_dropped": netstats.RxDropped,
@@ -327,12 +348,35 @@ func gatherContainerStats(
"container_id": id,
}
// Create a new network tag dictionary for the "network" tag
nettags := copyTags(tags)
nettags["network"] = network
acc.AddFields("docker_container_net", netfields, nettags, now)
if perDevice {
nettags := copyTags(tags)
nettags["network"] = network
acc.AddFields("docker_container_net", netfields, nettags, now)
}
if total {
for field, value := range netfields {
if field == "container_id" {
continue
}
_, ok := totalNetworkStatMap[field]
if ok {
totalNetworkStatMap[field] = totalNetworkStatMap[field].(uint64) + value.(uint64)
} else {
totalNetworkStatMap[field] = value
}
}
}
}
gatherBlockIOMetrics(stat, acc, tags, now, id)
// totalNetworkStatMap could be empty if container is running with --net=host.
if total && len(totalNetworkStatMap) != 0 {
nettags := copyTags(tags)
nettags["network"] = "total"
totalNetworkStatMap["container_id"] = id
acc.AddFields("docker_container_net", totalNetworkStatMap, nettags, now)
}
gatherBlockIOMetrics(stat, acc, tags, now, id, perDevice, total)
}
func calculateMemPercent(stat *types.StatsJSON) float64 {
@@ -361,6 +405,8 @@ func gatherBlockIOMetrics(
tags map[string]string,
now time.Time,
id string,
perDevice bool,
total bool,
) {
blkioStats := stat.BlkioStats
// Make a map of devices to their block io stats
@@ -422,11 +468,33 @@ func gatherBlockIOMetrics(
deviceStatMap[device]["sectors_recursive"] = metric.Value
}
totalStatMap := make(map[string]interface{})
for device, fields := range deviceStatMap {
iotags := copyTags(tags)
iotags["device"] = device
fields["container_id"] = id
acc.AddFields("docker_container_blkio", fields, iotags, now)
if perDevice {
iotags := copyTags(tags)
iotags["device"] = device
acc.AddFields("docker_container_blkio", fields, iotags, now)
}
if total {
for field, value := range fields {
if field == "container_id" {
continue
}
_, ok := totalStatMap[field]
if ok {
totalStatMap[field] = totalStatMap[field].(uint64) + value.(uint64)
} else {
totalStatMap[field] = value
}
}
}
}
if total {
totalStatMap["container_id"] = id
iotags := copyTags(tags)
iotags["device"] = "total"
acc.AddFields("docker_container_blkio", totalStatMap, iotags, now)
}
}
@@ -471,7 +539,8 @@ func parseSize(sizeStr string) (int64, error) {
func init() {
inputs.Add("docker", func() telegraf.Input {
return &Docker{
Timeout: internal.Duration{Duration: time.Second * 5},
PerDevice: true,
Timeout: internal.Duration{Duration: time.Second * 5},
}
})
}

View File

@@ -24,7 +24,7 @@ func TestDockerGatherContainerStats(t *testing.T) {
"container_name": "redis",
"container_image": "redis/image",
}
gatherContainerStats(stats, &acc, tags, "123456789")
gatherContainerStats(stats, &acc, tags, "123456789", true, true)
// test docker_container_net measurement
netfields := map[string]interface{}{
@@ -42,6 +42,21 @@ func TestDockerGatherContainerStats(t *testing.T) {
nettags["network"] = "eth0"
acc.AssertContainsTaggedFields(t, "docker_container_net", netfields, nettags)
netfields = map[string]interface{}{
"rx_dropped": uint64(6),
"rx_bytes": uint64(8),
"rx_errors": uint64(10),
"tx_packets": uint64(12),
"tx_dropped": uint64(6),
"rx_packets": uint64(8),
"tx_errors": uint64(10),
"tx_bytes": uint64(12),
"container_id": "123456789",
}
nettags = copyTags(tags)
nettags["network"] = "total"
acc.AssertContainsTaggedFields(t, "docker_container_net", netfields, nettags)
// test docker_blkio measurement
blkiotags := copyTags(tags)
blkiotags["device"] = "6:0"
@@ -52,6 +67,15 @@ func TestDockerGatherContainerStats(t *testing.T) {
}
acc.AssertContainsTaggedFields(t, "docker_container_blkio", blkiofields, blkiotags)
blkiotags = copyTags(tags)
blkiotags["device"] = "total"
blkiofields = map[string]interface{}{
"io_service_bytes_recursive_read": uint64(100),
"io_serviced_recursive_write": uint64(302),
"container_id": "123456789",
}
acc.AssertContainsTaggedFields(t, "docker_container_blkio", blkiofields, blkiotags)
// test docker_container_mem measurement
memfields := map[string]interface{}{
"max_usage": uint64(1001),
@@ -186,6 +210,17 @@ func testStats() *types.StatsJSON {
TxBytes: 4,
}
stats.Networks["eth1"] = types.NetworkStats{
RxDropped: 5,
RxBytes: 6,
RxErrors: 7,
TxPackets: 8,
TxDropped: 5,
RxPackets: 6,
TxErrors: 7,
TxBytes: 8,
}
sbr := types.BlkioStatEntry{
Major: 6,
Minor: 0,
@@ -198,11 +233,19 @@ func testStats() *types.StatsJSON {
Op: "write",
Value: 101,
}
sr2 := types.BlkioStatEntry{
Major: 6,
Minor: 1,
Op: "write",
Value: 201,
}
stats.BlkioStats.IoServiceBytesRecursive = append(
stats.BlkioStats.IoServiceBytesRecursive, sbr)
stats.BlkioStats.IoServicedRecursive = append(
stats.BlkioStats.IoServicedRecursive, sr)
stats.BlkioStats.IoServicedRecursive = append(
stats.BlkioStats.IoServicedRecursive, sr2)
return stats
}
@@ -378,9 +421,10 @@ func TestDockerGatherInfo(t *testing.T) {
"container_id": "b7dfbb9478a6ae55e237d4d74f8bbb753f0817192b5081334dc78476296e2173",
},
map[string]string{
"container_name": "etcd2",
"container_image": "quay.io/coreos/etcd:v2.2.2",
"cpu": "cpu3",
"container_name": "etcd2",
"container_image": "quay.io/coreos/etcd",
"cpu": "cpu3",
"container_version": "v2.2.2",
},
)
acc.AssertContainsTaggedFields(t,
@@ -423,8 +467,9 @@ func TestDockerGatherInfo(t *testing.T) {
"container_id": "b7dfbb9478a6ae55e237d4d74f8bbb753f0817192b5081334dc78476296e2173",
},
map[string]string{
"container_name": "etcd2",
"container_image": "quay.io/coreos/etcd:v2.2.2",
"container_name": "etcd2",
"container_image": "quay.io/coreos/etcd",
"container_version": "v2.2.2",
},
)

View File

@@ -12,6 +12,7 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
@@ -51,7 +52,6 @@ const defaultPort = "24242"
// Reads stats from all configured servers.
func (d *Dovecot) Gather(acc telegraf.Accumulator) error {
if !validQuery[d.Type] {
return fmt.Errorf("Error: %s is not a valid query type\n",
d.Type)
@@ -61,31 +61,27 @@ func (d *Dovecot) Gather(acc telegraf.Accumulator) error {
d.Servers = append(d.Servers, "127.0.0.1:24242")
}
var wg sync.WaitGroup
var outerr error
if len(d.Filters) <= 0 {
d.Filters = append(d.Filters, "")
}
for _, serv := range d.Servers {
var wg sync.WaitGroup
errChan := errchan.New(len(d.Servers) * len(d.Filters))
for _, server := range d.Servers {
for _, filter := range d.Filters {
wg.Add(1)
go func(serv string, filter string) {
go func(s string, f string) {
defer wg.Done()
outerr = d.gatherServer(serv, acc, d.Type, filter)
}(serv, filter)
errChan.C <- d.gatherServer(s, acc, d.Type, f)
}(server, filter)
}
}
wg.Wait()
return outerr
return errChan.Error()
}
func (d *Dovecot) gatherServer(addr string, acc telegraf.Accumulator, qtype string, filter string) error {
_, _, err := net.SplitHostPort(addr)
if err != nil {
return fmt.Errorf("Error: %s on url %s\n", err, addr)

View File

@@ -92,9 +92,11 @@ type haproxy struct {
var sampleConfig = `
## An array of address to gather stats about. Specify an ip on hostname
## with optional port. ie localhost, 10.10.3.33:1936, etc.
## If no servers are specified, then default to 127.0.0.1:1936
servers = ["http://myhaproxy.com:1936", "http://anotherhaproxy.com:1936"]
## Make sure you specify the complete path to the stats endpoint
## ie 10.10.3.33:1936/haproxy?stats
#
## If no servers are specified, then default to 127.0.0.1:1936/haproxy?stats
servers = ["http://myhaproxy.com:1936/haproxy?stats"]
## Or you can also use local socket
## servers = ["socket:/run/haproxy/admin.sock"]
`
@@ -111,7 +113,7 @@ func (r *haproxy) Description() string {
// Returns one of the errors encountered while gather stats (if any).
func (g *haproxy) Gather(acc telegraf.Accumulator) error {
if len(g.Servers) == 0 {
return g.gatherServer("http://127.0.0.1:1936", acc)
return g.gatherServer("http://127.0.0.1:1936/haproxy?stats", acc)
}
var wg sync.WaitGroup
@@ -167,12 +169,16 @@ func (g *haproxy) gatherServer(addr string, acc telegraf.Accumulator) error {
g.client = client
}
if !strings.HasSuffix(addr, ";csv") {
addr += "/;csv"
}
u, err := url.Parse(addr)
if err != nil {
return fmt.Errorf("Unable parse server address '%s': %s", addr, err)
}
req, err := http.NewRequest("GET", fmt.Sprintf("%s://%s%s/;csv", u.Scheme, u.Host, u.Path), nil)
req, err := http.NewRequest("GET", addr, nil)
if u.User != nil {
p, _ := u.User.Password()
req.SetBasicAuth(u.User.Username(), p)
@@ -184,7 +190,7 @@ func (g *haproxy) gatherServer(addr string, acc telegraf.Accumulator) error {
}
if res.StatusCode != 200 {
return fmt.Errorf("Unable to get valid stat result from '%s': %s", addr, err)
return fmt.Errorf("Unable to get valid stat result from '%s', http response code : %d", addr, res.StatusCode)
}
return importCsvResult(res.Body, acc, u.Host)

View File

@@ -243,7 +243,7 @@ func TestHaproxyDefaultGetFromLocalhost(t *testing.T) {
err := r.Gather(&acc)
require.Error(t, err)
assert.Contains(t, err.Error(), "127.0.0.1:1936/;csv")
assert.Contains(t, err.Error(), "127.0.0.1:1936/haproxy?stats/;csv")
}
const csvOutputSample = `

View File

@@ -0,0 +1,22 @@
# Hddtemp Input Plugin
This plugin reads data from hddtemp daemon
## Requirements
Hddtemp should be installed and its daemon running
## Configuration
```
[[inputs.hddtemp]]
## By default, telegraf gathers temps data from all disks detected by the
## hddtemp.
##
## Only collect temps from the selected disks.
##
## A * as the device name will return the temperature values of all disks.
##
# address = "127.0.0.1:7634"
# devices = ["sda", "*"]
```

View File

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2016 Mendelson Gusmão
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1,61 @@
package hddtemp
import (
"bytes"
"io"
"net"
"strconv"
"strings"
)
type disk struct {
DeviceName string
Model string
Temperature int32
Unit string
Status string
}
func Fetch(address string) ([]disk, error) {
var (
err error
conn net.Conn
buffer bytes.Buffer
disks []disk
)
if conn, err = net.Dial("tcp", address); err != nil {
return nil, err
}
if _, err = io.Copy(&buffer, conn); err != nil {
return nil, err
}
fields := strings.Split(buffer.String(), "|")
for index := 0; index < len(fields)/5; index++ {
status := ""
offset := index * 5
device := fields[offset+1]
device = device[strings.LastIndex(device, "/")+1:]
temperatureField := fields[offset+3]
temperature, err := strconv.ParseInt(temperatureField, 10, 32)
if err != nil {
temperature = 0
status = temperatureField
}
disks = append(disks, disk{
DeviceName: device,
Model: fields[offset+2],
Temperature: int32(temperature),
Unit: fields[offset+4],
Status: status,
})
}
return disks, nil
}

View File

@@ -0,0 +1,116 @@
package hddtemp
import (
"net"
"reflect"
"testing"
)
func TestFetch(t *testing.T) {
l := serve(t, []byte("|/dev/sda|foobar|36|C|"))
defer l.Close()
disks, err := Fetch(l.Addr().String())
if err != nil {
t.Error("expecting err to be nil")
}
expected := []disk{
{
DeviceName: "sda",
Model: "foobar",
Temperature: 36,
Unit: "C",
},
}
if !reflect.DeepEqual(expected, disks) {
t.Error("disks' slice is different from expected")
}
}
func TestFetchWrongAddress(t *testing.T) {
_, err := Fetch("127.0.0.1:1")
if err == nil {
t.Error("expecting err to be non-nil")
}
}
func TestFetchStatus(t *testing.T) {
l := serve(t, []byte("|/dev/sda|foobar|SLP|C|"))
defer l.Close()
disks, err := Fetch(l.Addr().String())
if err != nil {
t.Error("expecting err to be nil")
}
expected := []disk{
{
DeviceName: "sda",
Model: "foobar",
Temperature: 0,
Unit: "C",
Status: "SLP",
},
}
if !reflect.DeepEqual(expected, disks) {
t.Error("disks' slice is different from expected")
}
}
func TestFetchTwoDisks(t *testing.T) {
l := serve(t, []byte("|/dev/hda|ST380011A|46|C||/dev/hdd|ST340016A|SLP|*|"))
defer l.Close()
disks, err := Fetch(l.Addr().String())
if err != nil {
t.Error("expecting err to be nil")
}
expected := []disk{
{
DeviceName: "hda",
Model: "ST380011A",
Temperature: 46,
Unit: "C",
},
{
DeviceName: "hdd",
Model: "ST340016A",
Temperature: 0,
Unit: "*",
Status: "SLP",
},
}
if !reflect.DeepEqual(expected, disks) {
t.Error("disks' slice is different from expected")
}
}
func serve(t *testing.T, data []byte) net.Listener {
l, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
go func(t *testing.T) {
conn, err := l.Accept()
if err != nil {
t.Fatal(err)
}
conn.Write(data)
conn.Close()
}(t)
return l
}

View File

@@ -0,0 +1,74 @@
// +build linux
package hddtemp
import (
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/plugins/inputs"
gohddtemp "github.com/influxdata/telegraf/plugins/inputs/hddtemp/go-hddtemp"
)
const defaultAddress = "127.0.0.1:7634"
type HDDTemp struct {
Address string
Devices []string
}
func (_ *HDDTemp) Description() string {
return "Monitor disks' temperatures using hddtemp"
}
var hddtempSampleConfig = `
## By default, telegraf gathers temps data from all disks detected by the
## hddtemp.
##
## Only collect temps from the selected disks.
##
## A * as the device name will return the temperature values of all disks.
##
# address = "127.0.0.1:7634"
# devices = ["sda", "*"]
`
func (_ *HDDTemp) SampleConfig() string {
return hddtempSampleConfig
}
func (h *HDDTemp) Gather(acc telegraf.Accumulator) error {
disks, err := gohddtemp.Fetch(h.Address)
if err != nil {
return err
}
for _, disk := range disks {
for _, chosenDevice := range h.Devices {
if chosenDevice == "*" || chosenDevice == disk.DeviceName {
tags := map[string]string{
"device": disk.DeviceName,
"model": disk.Model,
"unit": disk.Unit,
"status": disk.Status,
}
fields := map[string]interface{}{
disk.DeviceName: disk.Temperature,
}
acc.AddFields("hddtemp", fields, tags)
}
}
}
return nil
}
func init() {
inputs.Add("hddtemp", func() telegraf.Input {
return &HDDTemp{
Address: defaultAddress,
Devices: []string{"*"},
}
})
}

View File

@@ -0,0 +1,3 @@
// +build !linux
package hddtemp

View File

@@ -10,11 +10,16 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
"github.com/influxdata/telegraf/plugins/inputs"
)
type InfluxDB struct {
URLs []string `toml:"urls"`
Timeout internal.Duration
client *http.Client
}
func (*InfluxDB) Description() string {
@@ -32,6 +37,9 @@ func (*InfluxDB) SampleConfig() string {
urls = [
"http://localhost:8086/debug/vars"
]
## http request & header timeout
timeout = "5s"
`
}
@@ -39,6 +47,16 @@ func (i *InfluxDB) Gather(acc telegraf.Accumulator) error {
if len(i.URLs) == 0 {
i.URLs = []string{"http://localhost:8086/debug/vars"}
}
if i.client == nil {
i.client = &http.Client{
Transport: &http.Transport{
ResponseHeaderTimeout: i.Timeout.Duration,
},
Timeout: i.Timeout.Duration,
}
}
errorChannel := make(chan error, len(i.URLs))
var wg sync.WaitGroup
@@ -104,15 +122,6 @@ type memstats struct {
GCCPUFraction float64 `json:"GCCPUFraction"`
}
var tr = &http.Transport{
ResponseHeaderTimeout: time.Duration(3 * time.Second),
}
var client = &http.Client{
Transport: tr,
Timeout: time.Duration(4 * time.Second),
}
// Gathers data from a particular URL
// Parameters:
// acc : The telegraf Accumulator to use
@@ -127,7 +136,7 @@ func (i *InfluxDB) gatherURL(
shardCounter := 0
now := time.Now()
resp, err := client.Get(url)
resp, err := i.client.Get(url)
if err != nil {
return err
}
@@ -210,9 +219,13 @@ func (i *InfluxDB) gatherURL(
continue
}
if p.Tags == nil {
p.Tags = make(map[string]string)
}
// If the object was a point, but was not fully initialized,
// ignore it and move on.
if p.Name == "" || p.Tags == nil || p.Values == nil || len(p.Values) == 0 {
if p.Name == "" || p.Values == nil || len(p.Values) == 0 {
continue
}
@@ -244,6 +257,8 @@ func (i *InfluxDB) gatherURL(
func init() {
inputs.Add("influxdb", func() telegraf.Input {
return &InfluxDB{}
return &InfluxDB{
Timeout: internal.Duration{Duration: time.Second * 5},
}
})
}

View File

@@ -116,6 +116,31 @@ func TestInfluxDB(t *testing.T) {
}, map[string]string{})
}
func TestInfluxDB2(t *testing.T) {
fakeInfluxServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/endpoint" {
_, _ = w.Write([]byte(influxReturn2))
} else {
w.WriteHeader(http.StatusNotFound)
}
}))
defer fakeInfluxServer.Close()
plugin := &influxdb.InfluxDB{
URLs: []string{fakeInfluxServer.URL + "/endpoint"},
}
var acc testutil.Accumulator
require.NoError(t, plugin.Gather(&acc))
require.Len(t, acc.Metrics, 34)
acc.AssertContainsTaggedFields(t, "influxdb",
map[string]interface{}{
"n_shards": 1,
}, map[string]string{})
}
func TestErrorHandling(t *testing.T) {
badServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/endpoint" {
@@ -241,3 +266,49 @@ const influxReturn = `
"tsm1_wal:/Users/csparr/.influxdb/wal/udp/default/1": {"name": "tsm1_wal", "tags": {"database": "udp", "path": "/Users/csparr/.influxdb/wal/udp/default/1", "retentionPolicy": "default"}, "values": {"currentSegmentDiskBytes": 193728, "oldSegmentsDiskBytes": 1008330}},
"write": {"name": "write", "tags": {}, "values": {"pointReq": 3613, "pointReqLocal": 3613, "req": 110, "subWriteOk": 110, "writeOk": 110}}
}`
// InfluxDB 1.0+ with tags: null instead of tags: {}.
const influxReturn2 = `
{
"cluster": {"name": "cluster", "tags": null, "values": {}},
"cmdline": ["influxd"],
"cq": {"name": "cq", "tags": null, "values": {}},
"database:_internal": {"name": "database", "tags": {"database": "_internal"}, "values": {"numMeasurements": 8, "numSeries": 12}},
"database:udp": {"name": "database", "tags": {"database": "udp"}, "values": {"numMeasurements": 14, "numSeries": 38}},
"hh:/Users/csparr/.influxdb/hh": {"name": "hh", "tags": {"path": "/Users/csparr/.influxdb/hh"}, "values": {}},
"httpd::8086": {"name": "httpd", "tags": {"bind": ":8086"}, "values": {"req": 7, "reqActive": 1, "reqDurationNs": 4488799}},
"measurement:cpu_idle.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "cpu_idle"}, "values": {"numSeries": 1}},
"measurement:cpu_usage.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "cpu_usage"}, "values": {"numSeries": 1}},
"measurement:database._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "database"}, "values": {"numSeries": 2}},
"measurement:database.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "database"}, "values": {"numSeries": 2}},
"measurement:httpd.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "httpd"}, "values": {"numSeries": 1}},
"measurement:measurement.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "measurement"}, "values": {"numSeries": 22}},
"measurement:mem.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "mem"}, "values": {"numSeries": 1}},
"measurement:net.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "net"}, "values": {"numSeries": 1}},
"measurement:runtime._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "runtime"}, "values": {"numSeries": 1}},
"measurement:runtime.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "runtime"}, "values": {"numSeries": 1}},
"measurement:shard._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "shard"}, "values": {"numSeries": 2}},
"measurement:shard.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "shard"}, "values": {"numSeries": 1}},
"measurement:subscriber._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "subscriber"}, "values": {"numSeries": 1}},
"measurement:subscriber.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "subscriber"}, "values": {"numSeries": 1}},
"measurement:swap_used.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "swap_used"}, "values": {"numSeries": 1}},
"measurement:tsm1_cache._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "tsm1_cache"}, "values": {"numSeries": 2}},
"measurement:tsm1_cache.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "tsm1_cache"}, "values": {"numSeries": 2}},
"measurement:tsm1_wal._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "tsm1_wal"}, "values": {"numSeries": 2}},
"measurement:tsm1_wal.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "tsm1_wal"}, "values": {"numSeries": 2}},
"measurement:udp._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "udp"}, "values": {"numSeries": 1}},
"measurement:write._internal": {"name": "measurement", "tags": {"database": "_internal", "measurement": "write"}, "values": {"numSeries": 1}},
"measurement:write.udp": {"name": "measurement", "tags": {"database": "udp", "measurement": "write"}, "values": {"numSeries": 1}},
"memstats": {"Alloc":17034016,"TotalAlloc":201739016,"Sys":38537464,"Lookups":77,"Mallocs":570251,"Frees":381008,"HeapAlloc":17034016,"HeapSys":33849344,"HeapIdle":15802368,"HeapInuse":18046976,"HeapReleased":3473408,"HeapObjects":189243,"StackInuse":753664,"StackSys":753664,"MSpanInuse":97440,"MSpanSys":114688,"MCacheInuse":4800,"MCacheSys":16384,"BuckHashSys":1461583,"GCSys":1112064,"OtherSys":1229737,"NextGC":20843042,"LastGC":1460434886475114239,"PauseTotalNs":5132914,"PauseNs":[195052,117751,139370,156933,263089,165249,713747,103904,122015,294408,213753,170864,175845,114221,121563,122409,113098,162219,229257,126726,250774,254235,117206,293588,144279,124306,127053,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"PauseEnd":[1460433856394860455,1460433856398162739,1460433856405888337,1460433856411784017,1460433856417924684,1460433856428385687,1460433856443782908,1460433856456522851,1460433857392743223,1460433866484394564,1460433866494076235,1460433896472438632,1460433957839825106,1460433976473440328,1460434016473413006,1460434096471892794,1460434126470792929,1460434246480428250,1460434366554468369,1460434396471249528,1460434456471205885,1460434476479487292,1460434536471435965,1460434616469784776,1460434736482078216,1460434856544251733,1460434886475114239,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NumGC":27,"GCCPUFraction":4.287178819113636e-05,"EnableGC":true,"DebugGC":false,"BySize":[{"Size":0,"Mallocs":0,"Frees":0},{"Size":8,"Mallocs":1031,"Frees":955},{"Size":16,"Mallocs":308485,"Frees":142064},{"Size":32,"Mallocs":64937,"Frees":54321},{"Size":48,"Mallocs":33012,"Frees":29754},{"Size":64,"Mallocs":20299,"Frees":18173},{"Size":80,"Mallocs":8186,"Frees":7597},{"Size":96,"Mallocs":9806,"Frees":8982},{"Size":112,"Mallocs":5671,"Frees":4850},{"Size":128,"Mallocs":2972,"Frees":2684},{"Size":144,"Mallocs":4106,"Frees":3719},{"Size":160,"Mallocs":1324,"Frees":911},{"Size":176,"Mallocs":2574,"Frees":2391},{"Size":192,"Mallocs":4053,"Frees":3863},{"Size":208,"Mallocs":442,"Frees":307},{"Size":224,"Mallocs":336,"Frees":172},{"Size":240,"Mallocs":143,"Frees":125},{"Size":256,"Mallocs":542,"Frees":497},{"Size":288,"Mallocs":15971,"Frees":14761},{"Size":320,"Mallocs":245,"Frees":30},{"Size":352,"Mallocs":1299,"Frees":1065},{"Size":384,"Mallocs":138,"Frees":2},{"Size":416,"Mallocs":54,"Frees":47},{"Size":448,"Mallocs":75,"Frees":29},{"Size":480,"Mallocs":6,"Frees":4},{"Size":512,"Mallocs":452,"Frees":422},{"Size":576,"Mallocs":486,"Frees":395},{"Size":640,"Mallocs":81,"Frees":67},{"Size":704,"Mallocs":421,"Frees":397},{"Size":768,"Mallocs":469,"Frees":468},{"Size":896,"Mallocs":1049,"Frees":1010},{"Size":1024,"Mallocs":1078,"Frees":960},{"Size":1152,"Mallocs":750,"Frees":498},{"Size":1280,"Mallocs":84,"Frees":72},{"Size":1408,"Mallocs":218,"Frees":187},{"Size":1536,"Mallocs":73,"Frees":48},{"Size":1664,"Mallocs":43,"Frees":30},{"Size":2048,"Mallocs":153,"Frees":57},{"Size":2304,"Mallocs":41,"Frees":30},{"Size":2560,"Mallocs":18,"Frees":15},{"Size":2816,"Mallocs":164,"Frees":157},{"Size":3072,"Mallocs":0,"Frees":0},{"Size":3328,"Mallocs":13,"Frees":6},{"Size":4096,"Mallocs":101,"Frees":82},{"Size":4608,"Mallocs":32,"Frees":26},{"Size":5376,"Mallocs":165,"Frees":151},{"Size":6144,"Mallocs":15,"Frees":9},{"Size":6400,"Mallocs":1,"Frees":1},{"Size":6656,"Mallocs":1,"Frees":0},{"Size":6912,"Mallocs":0,"Frees":0},{"Size":8192,"Mallocs":13,"Frees":13},{"Size":8448,"Mallocs":0,"Frees":0},{"Size":8704,"Mallocs":1,"Frees":1},{"Size":9472,"Mallocs":6,"Frees":4},{"Size":10496,"Mallocs":0,"Frees":0},{"Size":12288,"Mallocs":41,"Frees":35},{"Size":13568,"Mallocs":0,"Frees":0},{"Size":14080,"Mallocs":0,"Frees":0},{"Size":16384,"Mallocs":4,"Frees":4},{"Size":16640,"Mallocs":0,"Frees":0},{"Size":17664,"Mallocs":0,"Frees":0}]},
"queryExecutor": {"name": "queryExecutor", "tags": null, "values": {}},
"shard:/Users/csparr/.influxdb/data/_internal/monitor/2:2": {"name": "shard", "tags": {"database": "_internal", "engine": "tsm1", "id": "2", "path": "/Users/csparr/.influxdb/data/_internal/monitor/2", "retentionPolicy": "monitor"}, "values": {}},
"shard:/Users/csparr/.influxdb/data/udp/default/1:1": {"name": "shard", "tags": {"database": "udp", "engine": "tsm1", "id": "1", "path": "/Users/csparr/.influxdb/data/udp/default/1", "retentionPolicy": "default"}, "values": {"fieldsCreate": 61, "seriesCreate": 33, "writePointsOk": 3613, "writeReq": 110}},
"subscriber": {"name": "subscriber", "tags": null, "values": {"pointsWritten": 3613}},
"tsm1_cache:/Users/csparr/.influxdb/data/_internal/monitor/2": {"name": "tsm1_cache", "tags": {"database": "_internal", "path": "/Users/csparr/.influxdb/data/_internal/monitor/2", "retentionPolicy": "monitor"}, "values": {"WALCompactionTimeMs": 0, "cacheAgeMs": 1103932, "cachedBytes": 0, "diskBytes": 0, "memBytes": 40480, "snapshotCount": 0}},
"tsm1_cache:/Users/csparr/.influxdb/data/udp/default/1": {"name": "tsm1_cache", "tags": {"database": "udp", "path": "/Users/csparr/.influxdb/data/udp/default/1", "retentionPolicy": "default"}, "values": {"WALCompactionTimeMs": 0, "cacheAgeMs": 1103029, "cachedBytes": 0, "diskBytes": 0, "memBytes": 2359472, "snapshotCount": 0}},
"tsm1_filestore:/Users/csparr/.influxdb/data/_internal/monitor/2": {"name": "tsm1_filestore", "tags": {"database": "_internal", "path": "/Users/csparr/.influxdb/data/_internal/monitor/2", "retentionPolicy": "monitor"}, "values": {}},
"tsm1_filestore:/Users/csparr/.influxdb/data/udp/default/1": {"name": "tsm1_filestore", "tags": {"database": "udp", "path": "/Users/csparr/.influxdb/data/udp/default/1", "retentionPolicy": "default"}, "values": {}},
"tsm1_wal:/Users/csparr/.influxdb/wal/_internal/monitor/2": {"name": "tsm1_wal", "tags": {"database": "_internal", "path": "/Users/csparr/.influxdb/wal/_internal/monitor/2", "retentionPolicy": "monitor"}, "values": {"currentSegmentDiskBytes": 0, "oldSegmentsDiskBytes": 69532}},
"tsm1_wal:/Users/csparr/.influxdb/wal/udp/default/1": {"name": "tsm1_wal", "tags": {"database": "udp", "path": "/Users/csparr/.influxdb/wal/udp/default/1", "retentionPolicy": "default"}, "values": {"currentSegmentDiskBytes": 193728, "oldSegmentsDiskBytes": 1008330}},
"write": {"name": "write", "tags": null, "values": {"pointReq": 3613, "pointReqLocal": 3613, "req": 110, "subWriteOk": 110, "writeOk": 110}}
}`

View File

@@ -249,7 +249,14 @@ func (j *Jolokia) Gather(acc telegraf.Accumulator) error {
switch t := values.(type) {
case map[string]interface{}:
for k, v := range t {
fields[measurement+"_"+k] = v
switch t2 := v.(type) {
case map[string]interface{}:
for k2, v2 := range t2 {
fields[measurement+"_"+k+"_"+k2] = v2
}
case interface{}:
fields[measurement+"_"+k] = t2
}
}
case interface{}:
fields[measurement] = t

View File

@@ -14,17 +14,22 @@ regex patterns.
## /var/log/**.log -> recursively find all .log files in /var/log
## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
## /var/log/apache.log -> only tail the apache log file
files = ["/var/log/influxdb/influxdb.log"]
files = ["/var/log/apache/access.log"]
## Read file from beginning.
from_beginning = false
## Parse logstash-style "grok" patterns:
## Telegraf builtin parsing patterns: https://goo.gl/dkay10
## Telegraf built-in parsing patterns: https://goo.gl/dkay10
[inputs.logparser.grok]
## This is a list of patterns to check the given log file(s) for.
## Note that adding patterns here increases processing time. The most
## efficient configuration is to have one file & pattern per logparser.
patterns = ["%{INFLUXDB_HTTPD_LOG}"]
## efficient configuration is to have one pattern per logparser.
## Other common built-in patterns are:
## %{COMMON_LOG_FORMAT} (plain apache & nginx access logs)
## %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
patterns = ["%{COMBINED_LOG_FORMAT}"]
## Name of the outputted measurement name.
measurement = "apache_access_log"
## Full path(s) to custom pattern files.
custom_pattern_files = []
## Custom patterns can also be defined here. Put one pattern per line.
@@ -32,8 +37,6 @@ regex patterns.
'''
```
> **Note:** The InfluxDB log pattern in the default configuration only works for Influx versions 1.0.0-beta1 or higher.
## Grok Parser
The grok parser uses a slightly modified version of logstash "grok" patterns,
@@ -69,6 +72,7 @@ Timestamp modifiers can be used to convert captures to the timestamp of the
- tag (converts the field into a tag)
- drop (drops the field completely)
- Timestamp modifiers:
- ts (This will auto-learn the timestamp format)
- ts-ansic ("Mon Jan _2 15:04:05 2006")
- ts-unix ("Mon Jan _2 15:04:05 MST 2006")
- ts-ruby ("Mon Jan 02 15:04:05 -0700 2006")

View File

@@ -15,7 +15,7 @@ import (
"github.com/influxdata/telegraf"
)
var timeFormats = map[string]string{
var timeLayouts = map[string]string{
"ts-ansic": "Mon Jan _2 15:04:05 2006",
"ts-unix": "Mon Jan _2 15:04:05 MST 2006",
"ts-ruby": "Mon Jan 02 15:04:05 -0700 2006",
@@ -27,27 +27,33 @@ var timeFormats = map[string]string{
"ts-rfc3339": "2006-01-02T15:04:05Z07:00",
"ts-rfc3339nano": "2006-01-02T15:04:05.999999999Z07:00",
"ts-httpd": "02/Jan/2006:15:04:05 -0700",
"ts-epoch": "EPOCH",
"ts-epochnano": "EPOCH_NANO",
// These three are not exactly "layouts", but they are special cases that
// will get handled in the ParseLine function.
"ts-epoch": "EPOCH",
"ts-epochnano": "EPOCH_NANO",
"ts": "GENERIC_TIMESTAMP", // try parsing all known timestamp layouts.
}
const (
INT = "int"
TAG = "tag"
FLOAT = "float"
STRING = "string"
DURATION = "duration"
DROP = "drop"
INT = "int"
TAG = "tag"
FLOAT = "float"
STRING = "string"
DURATION = "duration"
DROP = "drop"
EPOCH = "EPOCH"
EPOCH_NANO = "EPOCH_NANO"
GENERIC_TIMESTAMP = "GENERIC_TIMESTAMP"
)
var (
// matches named captures that contain a type.
// matches named captures that contain a modifier.
// ie,
// %{NUMBER:bytes:int}
// %{IPORHOST:clientip:tag}
// %{HTTPDATE:ts1:ts-http}
// %{HTTPDATE:ts2:ts-"02 Jan 06 15:04"}
typedRe = regexp.MustCompile(`%{\w+:(\w+):(ts-".+"|t?s?-?\w+)}`)
modifierRe = regexp.MustCompile(`%{\w+:(\w+):(ts-".+"|t?s?-?\w+)}`)
// matches a plain pattern name. ie, %{NUMBER}
patternOnlyRe = regexp.MustCompile(`%{(\w+)}`)
)
@@ -87,6 +93,12 @@ type Parser struct {
// "RESPONSE_CODE": "%{NUMBER:rc:tag}"
// }
patterns map[string]string
// foundTsLayouts is a slice of timestamp patterns that have been found
// in the log lines. This slice gets updated if the user uses the generic
// 'ts' modifier for timestamps. This slice is checked first for matches,
// so that previously-matched layouts get priority over all other timestamp
// layouts.
foundTsLayouts []string
g *grok.Grok
tsModder *tsModder
@@ -140,6 +152,7 @@ func (p *Parser) Compile() error {
func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
var err error
// values are the parsed fields from the log line
var values map[string]string
// the matching pattern string
var patternName string
@@ -165,6 +178,7 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
continue
}
// t is the modifier of the field
var t string
// check if pattern has some modifiers
if types, ok := p.typeMap[patternName]; ok {
@@ -210,20 +224,50 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
tags[k] = v
case STRING:
fields[k] = strings.Trim(v, `"`)
case "EPOCH":
case EPOCH:
iv, err := strconv.ParseInt(v, 10, 64)
if err != nil {
log.Printf("ERROR parsing %s to int: %s", v, err)
} else {
timestamp = time.Unix(iv, 0)
}
case "EPOCH_NANO":
case EPOCH_NANO:
iv, err := strconv.ParseInt(v, 10, 64)
if err != nil {
log.Printf("ERROR parsing %s to int: %s", v, err)
} else {
timestamp = time.Unix(0, iv)
}
case GENERIC_TIMESTAMP:
var foundTs bool
// first try timestamp layouts that we've already found
for _, layout := range p.foundTsLayouts {
ts, err := time.Parse(layout, v)
if err == nil {
timestamp = ts
foundTs = true
break
}
}
// if we haven't found a timestamp layout yet, try all timestamp
// layouts.
if !foundTs {
for _, layout := range timeLayouts {
ts, err := time.Parse(layout, v)
if err == nil {
timestamp = ts
foundTs = true
p.foundTsLayouts = append(p.foundTsLayouts, layout)
break
}
}
}
// if we still haven't found a timestamp layout, log it and we will
// just use time.Now()
if !foundTs {
log.Printf("ERROR parsing timestamp [%s], could not find any "+
"suitable time layouts.", v)
}
case DROP:
// goodbye!
default:
@@ -267,7 +311,7 @@ func (p *Parser) compileCustomPatterns() error {
// check if pattern contains modifiers. Parse them out if it does.
for name, pattern := range p.patterns {
if typedRe.MatchString(pattern) {
if modifierRe.MatchString(pattern) {
// this pattern has modifiers, so parse out the modifiers
pattern, err = p.parseTypedCaptures(name, pattern)
if err != nil {
@@ -280,13 +324,13 @@ func (p *Parser) compileCustomPatterns() error {
return p.g.AddPatternsFromMap(p.patterns)
}
// parseTypedCaptures parses the capture types, and then deletes the type from
// the line so that it is a valid "grok" pattern again.
// parseTypedCaptures parses the capture modifiers, and then deletes the
// modifier from the line so that it is a valid "grok" pattern again.
// ie,
// %{NUMBER:bytes:int} => %{NUMBER:bytes} (stores %{NUMBER}->bytes->int)
// %{IPORHOST:clientip:tag} => %{IPORHOST:clientip} (stores %{IPORHOST}->clientip->tag)
func (p *Parser) parseTypedCaptures(name, pattern string) (string, error) {
matches := typedRe.FindAllStringSubmatch(pattern, -1)
matches := modifierRe.FindAllStringSubmatch(pattern, -1)
// grab the name of the capture pattern
patternName := "%{" + name + "}"
@@ -298,16 +342,18 @@ func (p *Parser) parseTypedCaptures(name, pattern string) (string, error) {
hasTimestamp := false
for _, match := range matches {
// regex capture 1 is the name of the capture
// regex capture 2 is the type of the capture
if strings.HasPrefix(match[2], "ts-") {
// regex capture 2 is the modifier of the capture
if strings.HasPrefix(match[2], "ts") {
if hasTimestamp {
return pattern, fmt.Errorf("logparser pattern compile error: "+
"Each pattern is allowed only one named "+
"timestamp data type. pattern: %s", pattern)
}
if f, ok := timeFormats[match[2]]; ok {
p.tsMap[patternName][match[1]] = f
if layout, ok := timeLayouts[match[2]]; ok {
// built-in time format
p.tsMap[patternName][match[1]] = layout
} else {
// custom time format
p.tsMap[patternName][match[1]] = strings.TrimSuffix(strings.TrimPrefix(match[2], `ts-"`), `"`)
}
hasTimestamp = true

View File

@@ -38,32 +38,6 @@ func Benchmark_ParseLine_CombinedLogFormat(b *testing.B) {
benchM = m
}
func Benchmark_ParseLine_InfluxLog(b *testing.B) {
p := &Parser{
Patterns: []string{"%{INFLUXDB_HTTPD_LOG}"},
}
p.Compile()
var m telegraf.Metric
for n := 0; n < b.N; n++ {
m, _ = p.ParseLine(`[httpd] 192.168.1.1 - - [14/Jun/2016:11:33:29 +0100] "POST /write?consistency=any&db=telegraf&precision=ns&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" 6f61bc44-321b-11e6-8050-000000000000 2513`)
}
benchM = m
}
func Benchmark_ParseLine_InfluxLog_NoMatch(b *testing.B) {
p := &Parser{
Patterns: []string{"%{INFLUXDB_HTTPD_LOG}"},
}
p.Compile()
var m telegraf.Metric
for n := 0; n < b.N; n++ {
m, _ = p.ParseLine(`[retention] 2016/06/14 14:38:24 retention policy shard deletion check commencing`)
}
benchM = m
}
func Benchmark_ParseLine_CustomPattern(b *testing.B) {
p := &Parser{
Patterns: []string{"%{TEST_LOG_A}", "%{TEST_LOG_B}"},
@@ -108,9 +82,9 @@ func TestMeasurementName(t *testing.T) {
assert.Equal(t, "my_web_log", m.Name())
}
func TestBuiltinInfluxdbHttpd(t *testing.T) {
func TestCustomInfluxdbHttpd(t *testing.T) {
p := &Parser{
Patterns: []string{"%{INFLUXDB_HTTPD_LOG}"},
Patterns: []string{`\[httpd\] %{COMBINED_LOG_FORMAT} %{UUID:uuid:drop} %{NUMBER:response_time_us:int}`},
}
assert.NoError(t, p.Compile())
@@ -333,6 +307,55 @@ func TestParseEpochErrors(t *testing.T) {
assert.NoError(t, err)
}
func TestParseGenericTimestamp(t *testing.T) {
p := &Parser{
Patterns: []string{`\[%{HTTPDATE:ts:ts}\] response_time=%{POSINT:response_time:int} mymetric=%{NUMBER:metric:float}`},
}
assert.NoError(t, p.Compile())
metricA, err := p.ParseLine(`[09/Jun/2016:03:37:03 +0000] response_time=20821 mymetric=10890.645`)
require.NotNil(t, metricA)
assert.NoError(t, err)
assert.Equal(t,
map[string]interface{}{
"response_time": int64(20821),
"metric": float64(10890.645),
},
metricA.Fields())
assert.Equal(t, map[string]string{}, metricA.Tags())
assert.Equal(t, time.Unix(1465443423, 0).UTC(), metricA.Time().UTC())
metricB, err := p.ParseLine(`[09/Jun/2016:03:37:04 +0000] response_time=20821 mymetric=10890.645`)
require.NotNil(t, metricB)
assert.NoError(t, err)
assert.Equal(t,
map[string]interface{}{
"response_time": int64(20821),
"metric": float64(10890.645),
},
metricB.Fields())
assert.Equal(t, map[string]string{}, metricB.Tags())
assert.Equal(t, time.Unix(1465443424, 0).UTC(), metricB.Time().UTC())
}
func TestParseGenericTimestampNotFound(t *testing.T) {
p := &Parser{
Patterns: []string{`\[%{NOTSPACE:ts:ts}\] response_time=%{POSINT:response_time:int} mymetric=%{NUMBER:metric:float}`},
}
assert.NoError(t, p.Compile())
metricA, err := p.ParseLine(`[foobar] response_time=20821 mymetric=10890.645`)
require.NotNil(t, metricA)
assert.NoError(t, err)
assert.Equal(t,
map[string]interface{}{
"response_time": int64(20821),
"metric": float64(10890.645),
},
metricA.Fields())
assert.Equal(t, map[string]string{}, metricA.Tags())
}
func TestCompileFileAndParse(t *testing.T) {
p := &Parser{
Patterns: []string{"%{TEST_LOG_A}", "%{TEST_LOG_B}"},

View File

@@ -55,15 +55,13 @@ EXAMPLE_LOG \[%{HTTPDATE:ts:ts-httpd}\] %{NUMBER:myfloat:float} %{RESPONSE_CODE}
# Wider-ranging username matching vs. logstash built-in %{USER}
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
# Wider-ranging client IP matching
CLIENT (?:%{IPORHOST}|%{HOSTPORT}|::1)
##
## COMMON LOG PATTERNS
##
# InfluxDB log patterns
CLIENT (?:%{IPORHOST}|%{HOSTPORT}|::1)
INFLUXDB_HTTPD_LOG \[httpd\] %{COMBINED_LOG_FORMAT} %{UUID:uuid:drop} %{NUMBER:response_time_us:int}
# apache & nginx logs, this is also known as the "common log format"
# see https://en.wikipedia.org/wiki/Common_Log_Format
COMMON_LOG_FORMAT %{CLIENT:client_ip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:ts:ts-httpd}\] "(?:%{WORD:verb:tag} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version:float})?|%{DATA})" %{NUMBER:resp_code:tag} (?:%{NUMBER:resp_bytes:int}|-)

View File

@@ -51,15 +51,13 @@ EXAMPLE_LOG \[%{HTTPDATE:ts:ts-httpd}\] %{NUMBER:myfloat:float} %{RESPONSE_CODE}
# Wider-ranging username matching vs. logstash built-in %{USER}
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
# Wider-ranging client IP matching
CLIENT (?:%{IPORHOST}|%{HOSTPORT}|::1)
##
## COMMON LOG PATTERNS
##
# InfluxDB log patterns
CLIENT (?:%{IPORHOST}|%{HOSTPORT}|::1)
INFLUXDB_HTTPD_LOG \[httpd\] %{COMBINED_LOG_FORMAT} %{UUID:uuid:drop} %{NUMBER:response_time_us:int}
# apache & nginx logs, this is also known as the "common log format"
# see https://en.wikipedia.org/wiki/Common_Log_Format
COMMON_LOG_FORMAT %{CLIENT:client_ip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:ts:ts-httpd}\] "(?:%{WORD:verb:tag} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version:float})?|%{DATA})" %{NUMBER:resp_code:tag} (?:%{NUMBER:resp_bytes:int}|-)

View File

@@ -45,7 +45,7 @@ const sampleConfig = `
## /var/log/**.log -> recursively find all .log files in /var/log
## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
## /var/log/apache.log -> only tail the apache log file
files = ["/var/log/influxdb/influxdb.log"]
files = ["/var/log/apache/access.log"]
## Read file from beginning.
from_beginning = false
@@ -58,9 +58,9 @@ const sampleConfig = `
## Other common built-in patterns are:
## %{COMMON_LOG_FORMAT} (plain apache & nginx access logs)
## %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
patterns = ["%{INFLUXDB_HTTPD_LOG}"]
patterns = ["%{COMBINED_LOG_FORMAT}"]
## Name of the outputted measurement name.
measurement = "influxdb_log"
measurement = "apache_access_log"
## Full path(s) to custom pattern files.
custom_pattern_files = []
## Custom patterns can also be defined here. Put one pattern per line.

View File

@@ -9,6 +9,7 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
@@ -73,19 +74,16 @@ func (m *Memcached) Gather(acc telegraf.Accumulator) error {
return m.gatherServer(":11211", false, acc)
}
errChan := errchan.New(len(m.Servers) + len(m.UnixSockets))
for _, serverAddress := range m.Servers {
if err := m.gatherServer(serverAddress, false, acc); err != nil {
return err
}
errChan.C <- m.gatherServer(serverAddress, false, acc)
}
for _, unixAddress := range m.UnixSockets {
if err := m.gatherServer(unixAddress, true, acc); err != nil {
return err
}
errChan.C <- m.gatherServer(unixAddress, true, acc)
}
return nil
return errChan.Error()
}
func (m *Memcached) gatherServer(

View File

@@ -1,6 +1,6 @@
# Mesos Input Plugin
This input plugin gathers metrics from Mesos (*currently only Mesos masters*).
This input plugin gathers metrics from Mesos.
For more information, please check the [Mesos Observability Metrics](http://mesos.apache.org/documentation/latest/monitoring/) page.
### Configuration:
@@ -8,14 +8,41 @@ For more information, please check the [Mesos Observability Metrics](http://meso
```toml
# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
# Timeout, in ms.
## Timeout, in ms.
timeout = 100
# A list of Mesos masters, default value is localhost:5050.
## A list of Mesos masters.
masters = ["localhost:5050"]
# Metrics groups to be collected, by default, all enabled.
master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]
## Master metrics groups to be collected, by default, all enabled.
master_collections = [
"resources",
"master",
"system",
"agents",
"frameworks",
"tasks",
"messages",
"evqueue",
"registrar",
]
## A list of Mesos slaves, default is []
# slaves = []
## Slave metrics groups to be collected, by default, all enabled.
# slave_collections = [
# "resources",
# "agent",
# "system",
# "executors",
# "tasks",
# "messages",
# ]
## Include mesos tasks statistics, default is false
# slave_tasks = true
```
By default this plugin is not configured to gather metrics from mesos. Since a mesos cluster can be deployed in numerous ways it does not provide any default
values. User needs to specify master/slave nodes this plugin will gather metrics from. Additionally, enabling `slave_tasks` will allow
gathering metrics from tasks running on specified slaves (this option is disabled by default).
### Measurements & Fields:
Mesos master metric groups
@@ -33,6 +60,12 @@ Mesos master metric groups
- master/disk_revocable_percent
- master/disk_revocable_total
- master/disk_revocable_used
- master/gpus_percent
- master/gpus_used
- master/gpus_total
- master/gpus_revocable_percent
- master/gpus_revocable_total
- master/gpus_revocable_used
- master/mem_percent
- master/mem_used
- master/mem_total
@@ -136,17 +169,111 @@ Mesos master metric groups
- registrar/state_store_ms/p999
- registrar/state_store_ms/p9999
Mesos slave metric groups
- resources
- slave/cpus_percent
- slave/cpus_used
- slave/cpus_total
- slave/cpus_revocable_percent
- slave/cpus_revocable_total
- slave/cpus_revocable_used
- slave/disk_percent
- slave/disk_used
- slave/disk_total
- slave/disk_revocable_percent
- slave/disk_revocable_total
- slave/disk_revocable_used
- slave/gpus_percent
- slave/gpus_used
- slave/gpus_total,
- slave/gpus_revocable_percent
- slave/gpus_revocable_total
- slave/gpus_revocable_used
- slave/mem_percent
- slave/mem_used
- slave/mem_total
- slave/mem_revocable_percent
- slave/mem_revocable_total
- slave/mem_revocable_used
- agent
- slave/registered
- slave/uptime_secs
- system
- system/cpus_total
- system/load_15min
- system/load_5min
- system/load_1min
- system/mem_free_bytes
- system/mem_total_bytes
- executors
- containerizer/mesos/container_destroy_errors
- slave/container_launch_errors
- slave/executors_preempted
- slave/frameworks_active
- slave/executor_directory_max_allowed_age_secs
- slave/executors_registering
- slave/executors_running
- slave/executors_terminated
- slave/executors_terminating
- slave/recovery_errors
- tasks
- slave/tasks_failed
- slave/tasks_finished
- slave/tasks_killed
- slave/tasks_lost
- slave/tasks_running
- slave/tasks_staging
- slave/tasks_starting
- messages
- slave/invalid_framework_messages
- slave/invalid_status_updates
- slave/valid_framework_messages
- slave/valid_status_updates
Mesos tasks metric groups
- executor_id
- executor_name
- framework_id
- source
- statistics (all metrics below will have `statistics_` prefix included in their names
- cpus_limit
- cpus_system_time_secs
- cpus_user_time_secs
- mem_anon_bytes
- mem_cache_bytes
- mem_critical_pressure_counter
- mem_file_bytes
- mem_limit_bytes
- mem_low_pressure_counter
- mem_mapped_file_bytes
- mem_medium_pressure_counter
- mem_rss_bytes
- mem_swap_bytes
- mem_total_bytes
- mem_total_memsw_bytes
- mem_unevictable_bytes
- timestamp
### Tags:
- All measurements have the following tags:
- All master/slave measurements have the following tags:
- server
- role (master/slave)
- Tasks measurements have the following tags:
- server
### Example Output:
```
$ telegraf -config ~/mesos.conf -input-filter mesos -test
* Plugin: mesos, Collection 1
mesos,server=172.17.8.101 allocator/event_queue_dispatches=0,master/cpus_percent=0,
mesos,host=172.17.8.102,server=172.17.8.101 allocator/event_queue_dispatches=0,master/cpus_percent=0,
master/cpus_revocable_percent=0,master/cpus_revocable_total=0,
master/cpus_revocable_used=0,master/cpus_total=2,
master/cpus_used=0,master/disk_percent=0,master/disk_revocable_percent=0,
@@ -163,3 +290,16 @@ master/mem_revocable_used=0,master/mem_total=1002,
master/mem_used=0,master/messages_authenticate=0,
master/messages_deactivate_framework=0 ...
```
Meoso tasks metrics (if enabled):
```
mesos-tasks,host=172.17.8.102,server=172.17.8.101,task_id=hello-world.e4b5b497-2ccd-11e6-a659-0242fb222ce2
statistics_cpus_limit=0.2,statistics_cpus_system_time_secs=142.49,statistics_cpus_user_time_secs=388.14,
statistics_mem_anon_bytes=359129088,statistics_mem_cache_bytes=3964928,
statistics_mem_critical_pressure_counter=0,statistics_mem_file_bytes=3964928,
statistics_mem_limit_bytes=767557632,statistics_mem_low_pressure_counter=0,
statistics_mem_mapped_file_bytes=114688,statistics_mem_medium_pressure_counter=0,
statistics_mem_rss_bytes=359129088,statistics_mem_swap_bytes=0,statistics_mem_total_bytes=363094016,
statistics_mem_total_memsw_bytes=363094016,statistics_mem_unevictable_bytes=0,
statistics_timestamp=1465486052.70525 1465486053052811792...
```

View File

@@ -17,33 +17,57 @@ import (
jsonparser "github.com/influxdata/telegraf/plugins/parsers/json"
)
type Role string
const (
MASTER Role = "master"
SLAVE = "slave"
)
type Mesos struct {
Timeout int
Masters []string
MasterCols []string `toml:"master_collections"`
Slaves []string
SlaveCols []string `toml:"slave_collections"`
SlaveTasks bool
}
var defaultMetrics = []string{
"resources", "master", "system", "slaves", "frameworks",
"tasks", "messages", "evqueue", "messages", "registrar",
var allMetrics = map[Role][]string{
MASTER: []string{"resources", "master", "system", "agents", "frameworks", "tasks", "messages", "evqueue", "registrar"},
SLAVE: []string{"resources", "agent", "system", "executors", "tasks", "messages"},
}
var sampleConfig = `
# Timeout, in ms.
## Timeout, in ms.
timeout = 100
# A list of Mesos masters, default value is localhost:5050.
## A list of Mesos masters.
masters = ["localhost:5050"]
# Metrics groups to be collected, by default, all enabled.
## Master metrics groups to be collected, by default, all enabled.
master_collections = [
"resources",
"master",
"system",
"slaves",
"agents",
"frameworks",
"tasks",
"messages",
"evqueue",
"registrar",
]
## A list of Mesos slaves, default is []
# slaves = []
## Slave metrics groups to be collected, by default, all enabled.
# slave_collections = [
# "resources",
# "agent",
# "system",
# "executors",
# "tasks",
# "messages",
# ]
## Include mesos tasks statistics, default is false
# slave_tasks = true
`
// SampleConfig returns a sample configuration block
@@ -56,21 +80,54 @@ func (m *Mesos) Description() string {
return "Telegraf plugin for gathering metrics from N Mesos masters"
}
func (m *Mesos) SetDefaults() {
if len(m.MasterCols) == 0 {
m.MasterCols = allMetrics[MASTER]
}
if len(m.SlaveCols) == 0 {
m.SlaveCols = allMetrics[SLAVE]
}
if m.Timeout == 0 {
log.Println("[mesos] Missing timeout value, setting default value (100ms)")
m.Timeout = 100
}
}
// Gather() metrics from given list of Mesos Masters
func (m *Mesos) Gather(acc telegraf.Accumulator) error {
var wg sync.WaitGroup
var errorChannel chan error
if len(m.Masters) == 0 {
m.Masters = []string{"localhost:5050"}
}
m.SetDefaults()
errorChannel = make(chan error, len(m.Masters)*2)
errorChannel = make(chan error, len(m.Masters)+2*len(m.Slaves))
for _, v := range m.Masters {
wg.Add(1)
go func(c string) {
errorChannel <- m.gatherMetrics(c, acc)
errorChannel <- m.gatherMainMetrics(c, ":5050", MASTER, acc)
wg.Done()
return
}(v)
}
for _, v := range m.Slaves {
wg.Add(1)
go func(c string) {
errorChannel <- m.gatherMainMetrics(c, ":5051", MASTER, acc)
wg.Done()
return
}(v)
if !m.SlaveTasks {
continue
}
wg.Add(1)
go func(c string) {
errorChannel <- m.gatherSlaveTaskMetrics(c, ":5051", acc)
wg.Done()
return
}(v)
@@ -94,7 +151,7 @@ func (m *Mesos) Gather(acc telegraf.Accumulator) error {
}
// metricsDiff() returns set names for removal
func metricsDiff(w []string) []string {
func metricsDiff(role Role, w []string) []string {
b := []string{}
s := make(map[string]bool)
@@ -106,7 +163,7 @@ func metricsDiff(w []string) []string {
s[v] = true
}
for _, d := range defaultMetrics {
for _, d := range allMetrics[role] {
if _, ok := s[d]; !ok {
b = append(b, d)
}
@@ -116,156 +173,239 @@ func metricsDiff(w []string) []string {
}
// masterBlocks serves as kind of metrics registry groupping them in sets
func masterBlocks(g string) []string {
func getMetrics(role Role, group string) []string {
var m map[string][]string
m = make(map[string][]string)
m["resources"] = []string{
"master/cpus_percent",
"master/cpus_used",
"master/cpus_total",
"master/cpus_revocable_percent",
"master/cpus_revocable_total",
"master/cpus_revocable_used",
"master/disk_percent",
"master/disk_used",
"master/disk_total",
"master/disk_revocable_percent",
"master/disk_revocable_total",
"master/disk_revocable_used",
"master/mem_percent",
"master/mem_used",
"master/mem_total",
"master/mem_revocable_percent",
"master/mem_revocable_total",
"master/mem_revocable_used",
if role == MASTER {
m["resources"] = []string{
"master/cpus_percent",
"master/cpus_used",
"master/cpus_total",
"master/cpus_revocable_percent",
"master/cpus_revocable_total",
"master/cpus_revocable_used",
"master/disk_percent",
"master/disk_used",
"master/disk_total",
"master/disk_revocable_percent",
"master/disk_revocable_total",
"master/disk_revocable_used",
"master/gpus_percent",
"master/gpus_used",
"master/gpus_total",
"master/gpus_revocable_percent",
"master/gpus_revocable_total",
"master/gpus_revocable_used",
"master/mem_percent",
"master/mem_used",
"master/mem_total",
"master/mem_revocable_percent",
"master/mem_revocable_total",
"master/mem_revocable_used",
}
m["master"] = []string{
"master/elected",
"master/uptime_secs",
}
m["system"] = []string{
"system/cpus_total",
"system/load_15min",
"system/load_5min",
"system/load_1min",
"system/mem_free_bytes",
"system/mem_total_bytes",
}
m["agents"] = []string{
"master/slave_registrations",
"master/slave_removals",
"master/slave_reregistrations",
"master/slave_shutdowns_scheduled",
"master/slave_shutdowns_canceled",
"master/slave_shutdowns_completed",
"master/slaves_active",
"master/slaves_connected",
"master/slaves_disconnected",
"master/slaves_inactive",
}
m["frameworks"] = []string{
"master/frameworks_active",
"master/frameworks_connected",
"master/frameworks_disconnected",
"master/frameworks_inactive",
"master/outstanding_offers",
}
m["tasks"] = []string{
"master/tasks_error",
"master/tasks_failed",
"master/tasks_finished",
"master/tasks_killed",
"master/tasks_lost",
"master/tasks_running",
"master/tasks_staging",
"master/tasks_starting",
}
m["messages"] = []string{
"master/invalid_executor_to_framework_messages",
"master/invalid_framework_to_executor_messages",
"master/invalid_status_update_acknowledgements",
"master/invalid_status_updates",
"master/dropped_messages",
"master/messages_authenticate",
"master/messages_deactivate_framework",
"master/messages_decline_offers",
"master/messages_executor_to_framework",
"master/messages_exited_executor",
"master/messages_framework_to_executor",
"master/messages_kill_task",
"master/messages_launch_tasks",
"master/messages_reconcile_tasks",
"master/messages_register_framework",
"master/messages_register_slave",
"master/messages_reregister_framework",
"master/messages_reregister_slave",
"master/messages_resource_request",
"master/messages_revive_offers",
"master/messages_status_update",
"master/messages_status_update_acknowledgement",
"master/messages_unregister_framework",
"master/messages_unregister_slave",
"master/messages_update_slave",
"master/recovery_slave_removals",
"master/slave_removals/reason_registered",
"master/slave_removals/reason_unhealthy",
"master/slave_removals/reason_unregistered",
"master/valid_framework_to_executor_messages",
"master/valid_status_update_acknowledgements",
"master/valid_status_updates",
"master/task_lost/source_master/reason_invalid_offers",
"master/task_lost/source_master/reason_slave_removed",
"master/task_lost/source_slave/reason_executor_terminated",
"master/valid_executor_to_framework_messages",
}
m["evqueue"] = []string{
"master/event_queue_dispatches",
"master/event_queue_http_requests",
"master/event_queue_messages",
}
m["registrar"] = []string{
"registrar/state_fetch_ms",
"registrar/state_store_ms",
"registrar/state_store_ms/max",
"registrar/state_store_ms/min",
"registrar/state_store_ms/p50",
"registrar/state_store_ms/p90",
"registrar/state_store_ms/p95",
"registrar/state_store_ms/p99",
"registrar/state_store_ms/p999",
"registrar/state_store_ms/p9999",
}
} else if role == SLAVE {
m["resources"] = []string{
"slave/cpus_percent",
"slave/cpus_used",
"slave/cpus_total",
"slave/cpus_revocable_percent",
"slave/cpus_revocable_total",
"slave/cpus_revocable_used",
"slave/disk_percent",
"slave/disk_used",
"slave/disk_total",
"slave/disk_revocable_percent",
"slave/disk_revocable_total",
"slave/disk_revocable_used",
"slave/gpus_percent",
"slave/gpus_used",
"slave/gpus_total",
"slave/gpus_revocable_percent",
"slave/gpus_revocable_total",
"slave/gpus_revocable_used",
"slave/mem_percent",
"slave/mem_used",
"slave/mem_total",
"slave/mem_revocable_percent",
"slave/mem_revocable_total",
"slave/mem_revocable_used",
}
m["agent"] = []string{
"slave/registered",
"slave/uptime_secs",
}
m["system"] = []string{
"system/cpus_total",
"system/load_15min",
"system/load_5min",
"system/load_1min",
"system/mem_free_bytes",
"system/mem_total_bytes",
}
m["executors"] = []string{
"containerizer/mesos/container_destroy_errors",
"slave/container_launch_errors",
"slave/executors_preempted",
"slave/frameworks_active",
"slave/executor_directory_max_allowed_age_secs",
"slave/executors_registering",
"slave/executors_running",
"slave/executors_terminated",
"slave/executors_terminating",
"slave/recovery_errors",
}
m["tasks"] = []string{
"slave/tasks_failed",
"slave/tasks_finished",
"slave/tasks_killed",
"slave/tasks_lost",
"slave/tasks_running",
"slave/tasks_staging",
"slave/tasks_starting",
}
m["messages"] = []string{
"slave/invalid_framework_messages",
"slave/invalid_status_updates",
"slave/valid_framework_messages",
"slave/valid_status_updates",
}
}
m["master"] = []string{
"master/elected",
"master/uptime_secs",
}
m["system"] = []string{
"system/cpus_total",
"system/load_15min",
"system/load_5min",
"system/load_1min",
"system/mem_free_bytes",
"system/mem_total_bytes",
}
m["slaves"] = []string{
"master/slave_registrations",
"master/slave_removals",
"master/slave_reregistrations",
"master/slave_shutdowns_scheduled",
"master/slave_shutdowns_canceled",
"master/slave_shutdowns_completed",
"master/slaves_active",
"master/slaves_connected",
"master/slaves_disconnected",
"master/slaves_inactive",
}
m["frameworks"] = []string{
"master/frameworks_active",
"master/frameworks_connected",
"master/frameworks_disconnected",
"master/frameworks_inactive",
"master/outstanding_offers",
}
m["tasks"] = []string{
"master/tasks_error",
"master/tasks_failed",
"master/tasks_finished",
"master/tasks_killed",
"master/tasks_lost",
"master/tasks_running",
"master/tasks_staging",
"master/tasks_starting",
}
m["messages"] = []string{
"master/invalid_executor_to_framework_messages",
"master/invalid_framework_to_executor_messages",
"master/invalid_status_update_acknowledgements",
"master/invalid_status_updates",
"master/dropped_messages",
"master/messages_authenticate",
"master/messages_deactivate_framework",
"master/messages_decline_offers",
"master/messages_executor_to_framework",
"master/messages_exited_executor",
"master/messages_framework_to_executor",
"master/messages_kill_task",
"master/messages_launch_tasks",
"master/messages_reconcile_tasks",
"master/messages_register_framework",
"master/messages_register_slave",
"master/messages_reregister_framework",
"master/messages_reregister_slave",
"master/messages_resource_request",
"master/messages_revive_offers",
"master/messages_status_update",
"master/messages_status_update_acknowledgement",
"master/messages_unregister_framework",
"master/messages_unregister_slave",
"master/messages_update_slave",
"master/recovery_slave_removals",
"master/slave_removals/reason_registered",
"master/slave_removals/reason_unhealthy",
"master/slave_removals/reason_unregistered",
"master/valid_framework_to_executor_messages",
"master/valid_status_update_acknowledgements",
"master/valid_status_updates",
"master/task_lost/source_master/reason_invalid_offers",
"master/task_lost/source_master/reason_slave_removed",
"master/task_lost/source_slave/reason_executor_terminated",
"master/valid_executor_to_framework_messages",
}
m["evqueue"] = []string{
"master/event_queue_dispatches",
"master/event_queue_http_requests",
"master/event_queue_messages",
}
m["registrar"] = []string{
"registrar/state_fetch_ms",
"registrar/state_store_ms",
"registrar/state_store_ms/max",
"registrar/state_store_ms/min",
"registrar/state_store_ms/p50",
"registrar/state_store_ms/p90",
"registrar/state_store_ms/p95",
"registrar/state_store_ms/p99",
"registrar/state_store_ms/p999",
"registrar/state_store_ms/p9999",
}
ret, ok := m[g]
ret, ok := m[group]
if !ok {
log.Println("[mesos] Unkown metrics group: ", g)
log.Printf("[mesos] Unkown %s metrics group: %s\n", role, group)
return []string{}
}
return ret
}
// removeGroup(), remove unwanted sets
func (m *Mesos) removeGroup(j *map[string]interface{}) {
func (m *Mesos) filterMetrics(role Role, metrics *map[string]interface{}) {
var ok bool
var selectedMetrics []string
b := metricsDiff(m.MasterCols)
if role == MASTER {
selectedMetrics = m.MasterCols
} else if role == SLAVE {
selectedMetrics = m.SlaveCols
}
for _, k := range b {
for _, v := range masterBlocks(k) {
if _, ok = (*j)[v]; ok {
delete((*j), v)
for _, k := range metricsDiff(role, selectedMetrics) {
for _, v := range getMetrics(role, k) {
if _, ok = (*metrics)[v]; ok {
delete((*metrics), v)
}
}
}
@@ -280,23 +420,66 @@ var client = &http.Client{
Timeout: time.Duration(4 * time.Second),
}
// This should not belong to the object
func (m *Mesos) gatherMetrics(a string, acc telegraf.Accumulator) error {
var jsonOut map[string]interface{}
func (m *Mesos) gatherSlaveTaskMetrics(address string, defaultPort string, acc telegraf.Accumulator) error {
var metrics []map[string]interface{}
host, _, err := net.SplitHostPort(a)
host, _, err := net.SplitHostPort(address)
if err != nil {
host = a
a = a + ":5050"
host = address
address = address + defaultPort
}
tags := map[string]string{
"server": host,
}
if m.Timeout == 0 {
log.Println("[mesos] Missing timeout value, setting default value (100ms)")
m.Timeout = 100
ts := strconv.Itoa(m.Timeout) + "ms"
resp, err := client.Get("http://" + address + "/monitor/statistics?timeout=" + ts)
if err != nil {
return err
}
data, err := ioutil.ReadAll(resp.Body)
resp.Body.Close()
if err != nil {
return err
}
if err = json.Unmarshal([]byte(data), &metrics); err != nil {
return errors.New("Error decoding JSON response")
}
for _, task := range metrics {
tags["task_id"] = task["executor_id"].(string)
jf := jsonparser.JSONFlattener{}
err = jf.FlattenJSON("", task)
if err != nil {
return err
}
acc.AddFields("mesos-tasks", jf.Fields, tags)
}
return nil
}
// This should not belong to the object
func (m *Mesos) gatherMainMetrics(a string, defaultPort string, role Role, acc telegraf.Accumulator) error {
var jsonOut map[string]interface{}
host, _, err := net.SplitHostPort(a)
if err != nil {
host = a
a = a + defaultPort
}
tags := map[string]string{
"server": host,
"role": string(role),
}
ts := strconv.Itoa(m.Timeout) + "ms"
@@ -317,7 +500,7 @@ func (m *Mesos) gatherMetrics(a string, acc telegraf.Accumulator) error {
return errors.New("Error decoding JSON response")
}
m.removeGroup(&jsonOut)
m.filterMetrics(role, &jsonOut)
jf := jsonparser.JSONFlattener{}

View File

@@ -2,70 +2,275 @@ package mesos
import (
"encoding/json"
"fmt"
"math/rand"
"net/http"
"net/http/httptest"
"os"
"testing"
jsonparser "github.com/influxdata/telegraf/plugins/parsers/json"
"github.com/influxdata/telegraf/testutil"
)
var mesosMetrics map[string]interface{}
var ts *httptest.Server
var masterMetrics map[string]interface{}
var masterTestServer *httptest.Server
var slaveMetrics map[string]interface{}
var slaveTaskMetrics map[string]interface{}
var slaveTestServer *httptest.Server
func randUUID() string {
b := make([]byte, 16)
rand.Read(b)
return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:])
}
func generateMetrics() {
mesosMetrics = make(map[string]interface{})
masterMetrics = make(map[string]interface{})
metricNames := []string{"master/cpus_percent", "master/cpus_used", "master/cpus_total",
"master/cpus_revocable_percent", "master/cpus_revocable_total", "master/cpus_revocable_used",
"master/disk_percent", "master/disk_used", "master/disk_total", "master/disk_revocable_percent",
"master/disk_revocable_total", "master/disk_revocable_used", "master/mem_percent",
"master/mem_used", "master/mem_total", "master/mem_revocable_percent", "master/mem_revocable_total",
"master/mem_revocable_used", "master/elected", "master/uptime_secs", "system/cpus_total",
"system/load_15min", "system/load_5min", "system/load_1min", "system/mem_free_bytes",
"system/mem_total_bytes", "master/slave_registrations", "master/slave_removals",
"master/slave_reregistrations", "master/slave_shutdowns_scheduled", "master/slave_shutdowns_canceled",
"master/slave_shutdowns_completed", "master/slaves_active", "master/slaves_connected",
"master/slaves_disconnected", "master/slaves_inactive", "master/frameworks_active",
"master/frameworks_connected", "master/frameworks_disconnected", "master/frameworks_inactive",
"master/outstanding_offers", "master/tasks_error", "master/tasks_failed", "master/tasks_finished",
"master/tasks_killed", "master/tasks_lost", "master/tasks_running", "master/tasks_staging",
"master/tasks_starting", "master/invalid_executor_to_framework_messages", "master/invalid_framework_to_executor_messages",
"master/invalid_status_update_acknowledgements", "master/invalid_status_updates",
"master/dropped_messages", "master/messages_authenticate", "master/messages_deactivate_framework",
"master/messages_decline_offers", "master/messages_executor_to_framework", "master/messages_exited_executor",
"master/messages_framework_to_executor", "master/messages_kill_task", "master/messages_launch_tasks",
"master/messages_reconcile_tasks", "master/messages_register_framework", "master/messages_register_slave",
"master/messages_reregister_framework", "master/messages_reregister_slave", "master/messages_resource_request",
"master/messages_revive_offers", "master/messages_status_update", "master/messages_status_update_acknowledgement",
"master/messages_unregister_framework", "master/messages_unregister_slave", "master/messages_update_slave",
"master/recovery_slave_removals", "master/slave_removals/reason_registered", "master/slave_removals/reason_unhealthy",
"master/slave_removals/reason_unregistered", "master/valid_framework_to_executor_messages", "master/valid_status_update_acknowledgements",
"master/valid_status_updates", "master/task_lost/source_master/reason_invalid_offers",
"master/task_lost/source_master/reason_slave_removed", "master/task_lost/source_slave/reason_executor_terminated",
"master/valid_executor_to_framework_messages", "master/event_queue_dispatches",
"master/event_queue_http_requests", "master/event_queue_messages", "registrar/state_fetch_ms",
"registrar/state_store_ms", "registrar/state_store_ms/max", "registrar/state_store_ms/min",
"registrar/state_store_ms/p50", "registrar/state_store_ms/p90", "registrar/state_store_ms/p95",
"registrar/state_store_ms/p99", "registrar/state_store_ms/p999", "registrar/state_store_ms/p9999"}
metricNames := []string{
// resources
"master/cpus_percent",
"master/cpus_used",
"master/cpus_total",
"master/cpus_revocable_percent",
"master/cpus_revocable_total",
"master/cpus_revocable_used",
"master/disk_percent",
"master/disk_used",
"master/disk_total",
"master/disk_revocable_percent",
"master/disk_revocable_total",
"master/disk_revocable_used",
"master/gpus_percent",
"master/gpus_used",
"master/gpus_total",
"master/gpus_revocable_percent",
"master/gpus_revocable_total",
"master/gpus_revocable_used",
"master/mem_percent",
"master/mem_used",
"master/mem_total",
"master/mem_revocable_percent",
"master/mem_revocable_total",
"master/mem_revocable_used",
// master
"master/elected",
"master/uptime_secs",
// system
"system/cpus_total",
"system/load_15min",
"system/load_5min",
"system/load_1min",
"system/mem_free_bytes",
"system/mem_total_bytes",
// agents
"master/slave_registrations",
"master/slave_removals",
"master/slave_reregistrations",
"master/slave_shutdowns_scheduled",
"master/slave_shutdowns_canceled",
"master/slave_shutdowns_completed",
"master/slaves_active",
"master/slaves_connected",
"master/slaves_disconnected",
"master/slaves_inactive",
// frameworks
"master/frameworks_active",
"master/frameworks_connected",
"master/frameworks_disconnected",
"master/frameworks_inactive",
"master/outstanding_offers",
// tasks
"master/tasks_error",
"master/tasks_failed",
"master/tasks_finished",
"master/tasks_killed",
"master/tasks_lost",
"master/tasks_running",
"master/tasks_staging",
"master/tasks_starting",
// messages
"master/invalid_executor_to_framework_messages",
"master/invalid_framework_to_executor_messages",
"master/invalid_status_update_acknowledgements",
"master/invalid_status_updates",
"master/dropped_messages",
"master/messages_authenticate",
"master/messages_deactivate_framework",
"master/messages_decline_offers",
"master/messages_executor_to_framework",
"master/messages_exited_executor",
"master/messages_framework_to_executor",
"master/messages_kill_task",
"master/messages_launch_tasks",
"master/messages_reconcile_tasks",
"master/messages_register_framework",
"master/messages_register_slave",
"master/messages_reregister_framework",
"master/messages_reregister_slave",
"master/messages_resource_request",
"master/messages_revive_offers",
"master/messages_status_update",
"master/messages_status_update_acknowledgement",
"master/messages_unregister_framework",
"master/messages_unregister_slave",
"master/messages_update_slave",
"master/recovery_slave_removals",
"master/slave_removals/reason_registered",
"master/slave_removals/reason_unhealthy",
"master/slave_removals/reason_unregistered",
"master/valid_framework_to_executor_messages",
"master/valid_status_update_acknowledgements",
"master/valid_status_updates",
"master/task_lost/source_master/reason_invalid_offers",
"master/task_lost/source_master/reason_slave_removed",
"master/task_lost/source_slave/reason_executor_terminated",
"master/valid_executor_to_framework_messages",
// evgqueue
"master/event_queue_dispatches",
"master/event_queue_http_requests",
"master/event_queue_messages",
// registrar
"registrar/state_fetch_ms",
"registrar/state_store_ms",
"registrar/state_store_ms/max",
"registrar/state_store_ms/min",
"registrar/state_store_ms/p50",
"registrar/state_store_ms/p90",
"registrar/state_store_ms/p95",
"registrar/state_store_ms/p99",
"registrar/state_store_ms/p999",
"registrar/state_store_ms/p9999",
}
for _, k := range metricNames {
mesosMetrics[k] = rand.Float64()
masterMetrics[k] = rand.Float64()
}
slaveMetrics = make(map[string]interface{})
metricNames = []string{
// resources
"slave/cpus_percent",
"slave/cpus_used",
"slave/cpus_total",
"slave/cpus_revocable_percent",
"slave/cpus_revocable_total",
"slave/cpus_revocable_used",
"slave/disk_percent",
"slave/disk_used",
"slave/disk_total",
"slave/disk_revocable_percent",
"slave/disk_revocable_total",
"slave/disk_revocable_used",
"slave/gpus_percent",
"slave/gpus_used",
"slave/gpus_total",
"slave/gpus_revocable_percent",
"slave/gpus_revocable_total",
"slave/gpus_revocable_used",
"slave/mem_percent",
"slave/mem_used",
"slave/mem_total",
"slave/mem_revocable_percent",
"slave/mem_revocable_total",
"slave/mem_revocable_used",
// agent
"slave/registered",
"slave/uptime_secs",
// system
"system/cpus_total",
"system/load_15min",
"system/load_5min",
"system/load_1min",
"system/mem_free_bytes",
"system/mem_total_bytes",
// executors
"containerizer/mesos/container_destroy_errors",
"slave/container_launch_errors",
"slave/executors_preempted",
"slave/frameworks_active",
"slave/executor_directory_max_allowed_age_secs",
"slave/executors_registering",
"slave/executors_running",
"slave/executors_terminated",
"slave/executors_terminating",
"slave/recovery_errors",
// tasks
"slave/tasks_failed",
"slave/tasks_finished",
"slave/tasks_killed",
"slave/tasks_lost",
"slave/tasks_running",
"slave/tasks_staging",
"slave/tasks_starting",
// messages
"slave/invalid_framework_messages",
"slave/invalid_status_updates",
"slave/valid_framework_messages",
"slave/valid_status_updates",
}
for _, k := range metricNames {
slaveMetrics[k] = rand.Float64()
}
slaveTaskMetrics = map[string]interface{}{
"executor_id": fmt.Sprintf("task_%s", randUUID()),
"executor_name": "Some task description",
"framework_id": randUUID(),
"source": fmt.Sprintf("task_source_%s", randUUID()),
"statistics": map[string]interface{}{
"cpus_limit": rand.Float64(),
"cpus_system_time_secs": rand.Float64(),
"cpus_user_time_secs": rand.Float64(),
"mem_anon_bytes": float64(rand.Int63()),
"mem_cache_bytes": float64(rand.Int63()),
"mem_critical_pressure_counter": float64(rand.Int63()),
"mem_file_bytes": float64(rand.Int63()),
"mem_limit_bytes": float64(rand.Int63()),
"mem_low_pressure_counter": float64(rand.Int63()),
"mem_mapped_file_bytes": float64(rand.Int63()),
"mem_medium_pressure_counter": float64(rand.Int63()),
"mem_rss_bytes": float64(rand.Int63()),
"mem_swap_bytes": float64(rand.Int63()),
"mem_total_bytes": float64(rand.Int63()),
"mem_total_memsw_bytes": float64(rand.Int63()),
"mem_unevictable_bytes": float64(rand.Int63()),
"timestamp": rand.Float64(),
},
}
}
func TestMain(m *testing.M) {
generateMetrics()
r := http.NewServeMux()
r.HandleFunc("/metrics/snapshot", func(w http.ResponseWriter, r *http.Request) {
masterRouter := http.NewServeMux()
masterRouter.HandleFunc("/metrics/snapshot", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(mesosMetrics)
json.NewEncoder(w).Encode(masterMetrics)
})
ts = httptest.NewServer(r)
masterTestServer = httptest.NewServer(masterRouter)
slaveRouter := http.NewServeMux()
slaveRouter.HandleFunc("/metrics/snapshot", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(slaveMetrics)
})
slaveRouter.HandleFunc("/monitor/statistics", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode([]map[string]interface{}{slaveTaskMetrics})
})
slaveTestServer = httptest.NewServer(slaveRouter)
rc := m.Run()
ts.Close()
masterTestServer.Close()
slaveTestServer.Close()
os.Exit(rc)
}
@@ -73,7 +278,7 @@ func TestMesosMaster(t *testing.T) {
var acc testutil.Accumulator
m := Mesos{
Masters: []string{ts.Listener.Addr().String()},
Masters: []string{masterTestServer.Listener.Addr().String()},
Timeout: 10,
}
@@ -83,34 +288,88 @@ func TestMesosMaster(t *testing.T) {
t.Errorf(err.Error())
}
acc.AssertContainsFields(t, "mesos", mesosMetrics)
acc.AssertContainsFields(t, "mesos", masterMetrics)
}
func TestRemoveGroup(t *testing.T) {
generateMetrics()
func TestMasterFilter(t *testing.T) {
m := Mesos{
MasterCols: []string{
"resources", "master", "registrar",
},
}
b := []string{
"system", "slaves", "frameworks",
"messages", "evqueue",
"system", "agents", "frameworks",
"messages", "evqueue", "tasks",
}
m.removeGroup(&mesosMetrics)
m.filterMetrics(MASTER, &masterMetrics)
for _, v := range b {
for _, x := range masterBlocks(v) {
if _, ok := mesosMetrics[x]; ok {
for _, x := range getMetrics(MASTER, v) {
if _, ok := masterMetrics[x]; ok {
t.Errorf("Found key %s, it should be gone.", x)
}
}
}
for _, v := range m.MasterCols {
for _, x := range masterBlocks(v) {
if _, ok := mesosMetrics[x]; !ok {
for _, x := range getMetrics(MASTER, v) {
if _, ok := masterMetrics[x]; !ok {
t.Errorf("Didn't find key %s, it should present.", x)
}
}
}
}
func TestMesosSlave(t *testing.T) {
var acc testutil.Accumulator
m := Mesos{
Masters: []string{},
Slaves: []string{slaveTestServer.Listener.Addr().String()},
SlaveTasks: true,
Timeout: 10,
}
err := m.Gather(&acc)
if err != nil {
t.Errorf(err.Error())
}
acc.AssertContainsFields(t, "mesos", slaveMetrics)
jf := jsonparser.JSONFlattener{}
err = jf.FlattenJSON("", slaveTaskMetrics)
if err != nil {
t.Errorf(err.Error())
}
acc.AssertContainsFields(t, "mesos-tasks", jf.Fields)
}
func TestSlaveFilter(t *testing.T) {
m := Mesos{
SlaveCols: []string{
"resources", "agent", "tasks",
},
}
b := []string{
"system", "executors", "messages",
}
m.filterMetrics(SLAVE, &slaveMetrics)
for _, v := range b {
for _, x := range getMetrics(SLAVE, v) {
if _, ok := slaveMetrics[x]; ok {
t.Errorf("Found key %s, it should be gone.", x)
}
}
}
for _, v := range m.MasterCols {
for _, x := range getMetrics(SLAVE, v) {
if _, ok := slaveMetrics[x]; !ok {
t.Errorf("Didn't find key %s, it should present.", x)
}
}

View File

@@ -6,10 +6,22 @@ import (
"github.com/stretchr/testify/mock"
)
// MockPlugin struct should be named the same as the Plugin
type MockPlugin struct {
mock.Mock
}
// Description will appear directly above the plugin definition in the config file
func (m *MockPlugin) Description() string {
return `This is an example plugin`
}
// SampleConfig will populate the sample configuration portion of the plugin's configuration
func (m *MockPlugin) SampleConfig() string {
return ` sampleVar = 'foo'`
}
// Gather defines what data the plugin will gather.
func (m *MockPlugin) Gather(_a0 telegraf.Accumulator) error {
ret := m.Called(_a0)

View File

@@ -10,6 +10,7 @@
## mongodb://10.10.3.33:18832,
## 10.0.0.1:10000, etc.
servers = ["127.0.0.1:27017"]
gather_perdb_stats = false
```
For authenticated mongodb istances use connection mongdb connection URI
@@ -52,3 +53,15 @@ and create a single measurement containing values e.g.
* ttl_passes_per_sec
* repl_lag
* jumbo_chunks (only if mongos or mongo config)
If gather_db_stats is set to true, it will also collect per database stats exposed by db.stats()
creating another measurement called mongodb_db_stats and containing values:
* collections
* objects
* avg_obj_size
* data_size
* storage_size
* num_extents
* indexes
* index_size
* ok

View File

@@ -10,14 +10,16 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
"gopkg.in/mgo.v2"
)
type MongoDB struct {
Servers []string
Ssl Ssl
mongos map[string]*Server
Servers []string
Ssl Ssl
mongos map[string]*Server
GatherPerdbStats bool
}
type Ssl struct {
@@ -32,6 +34,7 @@ var sampleConfig = `
## mongodb://10.10.3.33:18832,
## 10.0.0.1:10000, etc.
servers = ["127.0.0.1:27017"]
gather_perdb_stats = false
`
func (m *MongoDB) SampleConfig() string {
@@ -53,9 +56,7 @@ func (m *MongoDB) Gather(acc telegraf.Accumulator) error {
}
var wg sync.WaitGroup
var outerr error
errChan := errchan.New(len(m.Servers))
for _, serv := range m.Servers {
u, err := url.Parse(serv)
if err != nil {
@@ -71,13 +72,12 @@ func (m *MongoDB) Gather(acc telegraf.Accumulator) error {
wg.Add(1)
go func(srv *Server) {
defer wg.Done()
outerr = m.gatherServer(srv, acc)
errChan.C <- m.gatherServer(srv, acc)
}(m.getMongoServer(u))
}
wg.Wait()
return outerr
return errChan.Error()
}
func (m *MongoDB) getMongoServer(url *url.URL) *Server {
@@ -135,7 +135,7 @@ func (m *MongoDB) gatherServer(server *Server, acc telegraf.Accumulator) error {
}
server.Session = sess
}
return server.gatherData(acc)
return server.gatherData(acc, m.GatherPerdbStats)
}
func init() {

View File

@@ -12,6 +12,12 @@ type MongodbData struct {
StatLine *StatLine
Fields map[string]interface{}
Tags map[string]string
DbData []DbData
}
type DbData struct {
Name string
Fields map[string]interface{}
}
func NewMongodbData(statLine *StatLine, tags map[string]string) *MongodbData {
@@ -22,6 +28,7 @@ func NewMongodbData(statLine *StatLine, tags map[string]string) *MongodbData {
StatLine: statLine,
Tags: tags,
Fields: make(map[string]interface{}),
DbData: []DbData{},
}
}
@@ -72,6 +79,34 @@ var WiredTigerStats = map[string]string{
"percent_cache_used": "CacheUsedPercent",
}
var DbDataStats = map[string]string{
"collections": "Collections",
"objects": "Objects",
"avg_obj_size": "AvgObjSize",
"data_size": "DataSize",
"storage_size": "StorageSize",
"num_extents": "NumExtents",
"indexes": "Indexes",
"index_size": "IndexSize",
"ok": "Ok",
}
func (d *MongodbData) AddDbStats() {
for _, dbstat := range d.StatLine.DbStatsLines {
dbStatLine := reflect.ValueOf(&dbstat).Elem()
newDbData := &DbData{
Name: dbstat.Name,
Fields: make(map[string]interface{}),
}
newDbData.Fields["type"] = "db_stat"
for key, value := range DbDataStats {
val := dbStatLine.FieldByName(value).Interface()
newDbData.Fields[key] = val
}
d.DbData = append(d.DbData, *newDbData)
}
}
func (d *MongodbData) AddDefaultStats() {
statLine := reflect.ValueOf(d.StatLine).Elem()
d.addStat(statLine, DefaultStats)
@@ -113,4 +148,15 @@ func (d *MongodbData) flush(acc telegraf.Accumulator) {
d.StatLine.Time,
)
d.Fields = make(map[string]interface{})
for _, db := range d.DbData {
d.Tags["db_name"] = db.Name
acc.AddFields(
"mongodb_db_stats",
db.Fields,
d.Tags,
d.StatLine.Time,
)
db.Fields = make(map[string]interface{})
}
}

View File

@@ -22,7 +22,7 @@ func (s *Server) getDefaultTags() map[string]string {
return tags
}
func (s *Server) gatherData(acc telegraf.Accumulator) error {
func (s *Server) gatherData(acc telegraf.Accumulator, gatherDbStats bool) error {
s.Session.SetMode(mgo.Eventual, true)
s.Session.SetSocketTimeout(0)
result_server := &ServerStatus{}
@@ -42,10 +42,34 @@ func (s *Server) gatherData(acc telegraf.Accumulator) error {
JumboChunksCount: int64(jumbo_chunks),
}
result_db_stats := &DbStats{}
if gatherDbStats == true {
names := []string{}
names, err = s.Session.DatabaseNames()
if err != nil {
log.Println("Error getting database names (" + err.Error() + ")")
}
for _, db_name := range names {
db_stat_line := &DbStatsData{}
err = s.Session.DB(db_name).Run(bson.D{{"dbStats", 1}}, db_stat_line)
if err != nil {
log.Println("Error getting db stats from " + db_name + "(" + err.Error() + ")")
}
db := &Db{
Name: db_name,
DbStatsData: db_stat_line,
}
result_db_stats.Dbs = append(result_db_stats.Dbs, *db)
}
}
result := &MongoStatus{
ServerStatus: result_server,
ReplSetStatus: result_repl,
ClusterStatus: result_cluster,
DbStats: result_db_stats,
}
defer func() {
@@ -64,6 +88,7 @@ func (s *Server) gatherData(acc telegraf.Accumulator) error {
s.getDefaultTags(),
)
data.AddDefaultStats()
data.AddDbStats()
data.flush(acc)
}
return nil

View File

@@ -29,12 +29,12 @@ func TestGetDefaultTags(t *testing.T) {
func TestAddDefaultStats(t *testing.T) {
var acc testutil.Accumulator
err := server.gatherData(&acc)
err := server.gatherData(&acc, false)
require.NoError(t, err)
time.Sleep(time.Duration(1) * time.Second)
// need to call this twice so it can perform the diff
err = server.gatherData(&acc)
err = server.gatherData(&acc, false)
require.NoError(t, err)
for key, _ := range DefaultStats {

View File

@@ -35,6 +35,7 @@ type MongoStatus struct {
ServerStatus *ServerStatus
ReplSetStatus *ReplSetStatus
ClusterStatus *ClusterStatus
DbStats *DbStats
}
type ServerStatus struct {
@@ -65,6 +66,32 @@ type ServerStatus struct {
Metrics *MetricsStats `bson:"metrics"`
}
// DbStats stores stats from all dbs
type DbStats struct {
Dbs []Db
}
// Db represent a single DB
type Db struct {
Name string
DbStatsData *DbStatsData
}
// DbStatsData stores stats from a db
type DbStatsData struct {
Db string `bson:"db"`
Collections int64 `bson:"collections"`
Objects int64 `bson:"objects"`
AvgObjSize float64 `bson:"avgObjSize"`
DataSize int64 `bson:"dataSize"`
StorageSize int64 `bson:"storageSize"`
NumExtents int64 `bson:"numExtents"`
Indexes int64 `bson:"indexes"`
IndexSize int64 `bson:"indexSize"`
Ok int64 `bson:"ok"`
GleStats interface{} `bson:"gleStats"`
}
// ClusterStatus stores information related to the whole cluster
type ClusterStatus struct {
JumboChunksCount int64
@@ -396,6 +423,22 @@ type StatLine struct {
// Cluster fields
JumboChunksCount int64
// DB stats field
DbStatsLines []DbStatLine
}
type DbStatLine struct {
Name string
Collections int64
Objects int64
AvgObjSize float64
DataSize int64
StorageSize int64
NumExtents int64
Indexes int64
IndexSize int64
Ok int64
}
func parseLocks(stat ServerStatus) map[string]LockUsage {
@@ -677,5 +720,27 @@ func NewStatLine(oldMongo, newMongo MongoStatus, key string, all bool, sampleSec
newClusterStat := *newMongo.ClusterStatus
returnVal.JumboChunksCount = newClusterStat.JumboChunksCount
newDbStats := *newMongo.DbStats
for _, db := range newDbStats.Dbs {
dbStatsData := db.DbStatsData
// mongos doesn't have the db key, so setting the db name
if dbStatsData.Db == "" {
dbStatsData.Db = db.Name
}
dbStatLine := &DbStatLine{
Name: dbStatsData.Db,
Collections: dbStatsData.Collections,
Objects: dbStatsData.Objects,
AvgObjSize: dbStatsData.AvgObjSize,
DataSize: dbStatsData.DataSize,
StorageSize: dbStatsData.StorageSize,
NumExtents: dbStatsData.NumExtents,
Indexes: dbStatsData.Indexes,
IndexSize: dbStatsData.IndexSize,
Ok: dbStatsData.Ok,
}
returnVal.DbStatsLines = append(returnVal.DbStatsLines, *dbStatLine)
}
return returnVal
}

View File

@@ -7,10 +7,12 @@ import (
"net/url"
"strconv"
"strings"
"sync"
"time"
_ "github.com/go-sql-driver/mysql"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
@@ -118,26 +120,27 @@ func (m *Mysql) InitMysql() {
func (m *Mysql) Gather(acc telegraf.Accumulator) error {
if len(m.Servers) == 0 {
// if we can't get stats in this case, thats fine, don't report
// an error.
m.gatherServer(localhost, acc)
return nil
// default to localhost if nothing specified.
return m.gatherServer(localhost, acc)
}
// Initialise additional query intervals
if !initDone {
m.InitMysql()
}
var wg sync.WaitGroup
errChan := errchan.New(len(m.Servers))
// Loop through each server and collect metrics
for _, serv := range m.Servers {
err := m.gatherServer(serv, acc)
if err != nil {
return err
}
for _, server := range m.Servers {
wg.Add(1)
go func(s string) {
defer wg.Done()
errChan.C <- m.gatherServer(s, acc)
}(server)
}
return nil
wg.Wait()
return errChan.Error()
}
type mapping struct {
@@ -1373,6 +1376,7 @@ func (m *Mysql) gatherPerfEventsStatements(db *sql.DB, serv string, acc telegraf
&rowsAffected, &rowsSent, &rowsExamined,
&tmpTables, &tmpDiskTables,
&sortMergePasses, &sortRows,
&noIndexUsed,
)
if err != nil {

View File

@@ -20,7 +20,6 @@ func TestMysqlDefaultsToLocal(t *testing.T) {
}
var acc testutil.Accumulator
err := m.Gather(&acc)
require.NoError(t, err)

View File

@@ -12,6 +12,7 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
@@ -34,7 +35,7 @@ func (n *Nginx) Description() string {
func (n *Nginx) Gather(acc telegraf.Accumulator) error {
var wg sync.WaitGroup
var outerr error
errChan := errchan.New(len(n.Urls))
for _, u := range n.Urls {
addr, err := url.Parse(u)
@@ -45,13 +46,12 @@ func (n *Nginx) Gather(acc telegraf.Accumulator) error {
wg.Add(1)
go func(addr *url.URL) {
defer wg.Done()
outerr = n.gatherUrl(addr, acc)
errChan.C <- n.gatherUrl(addr, acc)
}(addr)
}
wg.Wait()
return outerr
return errChan.Error()
}
var tr = &http.Transport{

View File

@@ -32,6 +32,7 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
@@ -65,19 +66,17 @@ func (n *NSQ) Description() string {
func (n *NSQ) Gather(acc telegraf.Accumulator) error {
var wg sync.WaitGroup
var outerr error
errChan := errchan.New(len(n.Endpoints))
for _, e := range n.Endpoints {
wg.Add(1)
go func(e string) {
defer wg.Done()
outerr = n.gatherEndpoint(e, acc)
errChan.C <- n.gatherEndpoint(e, acc)
}(e)
}
wg.Wait()
return outerr
return errChan.Error()
}
var tr = &http.Transport{

View File

@@ -43,9 +43,9 @@ var sampleConfig = `
## file paths for proc files. If empty default paths will be used:
## /proc/net/netstat, /proc/net/snmp, /proc/net/snmp6
## These can also be overridden with env variables, see README.
proc_net_netstat = ""
proc_net_snmp = ""
proc_net_snmp6 = ""
proc_net_netstat = "/proc/net/netstat"
proc_net_snmp = "/proc/net/snmp"
proc_net_snmp6 = "/proc/net/snmp6"
## dump metrics with 0 values too
dump_zeros = true
`
@@ -141,7 +141,7 @@ func (ns *Nstat) loadPaths() {
ns.ProcNetSNMP = proc(ENV_SNMP, NET_SNMP)
}
if ns.ProcNetSNMP6 == "" {
ns.ProcNetSNMP = proc(ENV_SNMP6, NET_SNMP6)
ns.ProcNetSNMP6 = proc(ENV_SNMP6, NET_SNMP6)
}
}

View File

@@ -119,7 +119,7 @@ func (n *NTPQ) Gather(acc telegraf.Accumulator) error {
// Get integer metrics from output
for key, index := range intI {
if index == -1 {
if index == -1 || index >= len(fields) {
continue
}
if fields[index] == "-" {
@@ -169,7 +169,7 @@ func (n *NTPQ) Gather(acc telegraf.Accumulator) error {
// get float metrics from output
for key, index := range floatI {
if index == -1 {
if index == -1 || index >= len(fields) {
continue
}
if fields[index] == "-" {

View File

@@ -41,6 +41,35 @@ func TestSingleNTPQ(t *testing.T) {
acc.AssertContainsTaggedFields(t, "ntpq", fields, tags)
}
func TestMissingJitterField(t *testing.T) {
tt := tester{
ret: []byte(missingJitterField),
err: nil,
}
n := &NTPQ{
runQ: tt.runqTest,
}
acc := testutil.Accumulator{}
assert.NoError(t, n.Gather(&acc))
fields := map[string]interface{}{
"when": int64(101),
"poll": int64(256),
"reach": int64(37),
"delay": float64(51.016),
"offset": float64(233.010),
}
tags := map[string]string{
"remote": "uschi5-ntp-002.",
"state_prefix": "*",
"refid": "10.177.80.46",
"stratum": "2",
"type": "u",
}
acc.AssertContainsTaggedFields(t, "ntpq", fields, tags)
}
func TestBadIntNTPQ(t *testing.T) {
tt := tester{
ret: []byte(badIntParseNTPQ),
@@ -381,6 +410,11 @@ var singleNTPQ = ` remote refid st t when poll reach delay
*uschi5-ntp-002. 10.177.80.46 2 u 101 256 37 51.016 233.010 17.462
`
var missingJitterField = ` remote refid st t when poll reach delay offset jitter
==============================================================================
*uschi5-ntp-002. 10.177.80.46 2 u 101 256 37 51.016 233.010
`
var badHeaderNTPQ = `remote refid foobar t when poll reach delay offset jitter
==============================================================================
*uschi5-ntp-002. 10.177.80.46 2 u 101 256 37 51.016 233.010 17.462

View File

@@ -0,0 +1,36 @@
# Ping input plugin
This input plugin will measures the round-trip
## Windows:
### Configration:
```
## urls to ping
urls = ["www.google.com"] # required
## number of pings to send per collection (ping -n <COUNT>)
count = 4 # required
## Ping timeout, in seconds. 0 means default timeout (ping -w <TIMEOUT>)
Timeout = 0
```
### Measurements & Fields:
- packets_transmitted ( from ping output )
- reply_received ( increasing only on valid metric from echo replay, eg. 'Destination net unreachable' reply will increment packets_received but not reply_received )
- packets_received ( from ping output )
- percent_reply_loss ( compute from packets_transmitted and reply_received )
- percent_packets_loss ( compute from packets_transmitted and packets_received )
- errors ( when host can not be found or wrong prameters is passed to application )
- response time
- average_response_ms ( compute from minimum_response_ms and maximum_response_ms )
- minimum_response_ms ( from ping output )
- maximum_response_ms ( from ping output )
### Tags:
- server
### Example Output:
```
* Plugin: ping, Collection 1
ping,host=WIN-PBAPLP511R7,url=www.google.com average_response_ms=7i,maximum_response_ms=9i,minimum_response_ms=7i,packets_received=4i,packets_transmitted=4i,percent_packet_loss=0,percent_reply_loss=0,reply_received=4i 1469879119000000000
```

View File

@@ -1,3 +1,223 @@
// +build windows
package ping
import (
"errors"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
"github.com/influxdata/telegraf/plugins/inputs"
"os/exec"
"regexp"
"strconv"
"strings"
"sync"
"time"
)
// HostPinger is a function that runs the "ping" function using a list of
// passed arguments. This can be easily switched with a mocked ping function
// for unit test purposes (see ping_test.go)
type HostPinger func(timeout float64, args ...string) (string, error)
type Ping struct {
// Number of pings to send (ping -c <COUNT>)
Count int
// Ping timeout, in seconds. 0 means no timeout (ping -W <TIMEOUT>)
Timeout float64
// URLs to ping
Urls []string
// host ping function
pingHost HostPinger
}
func (s *Ping) Description() string {
return "Ping given url(s) and return statistics"
}
const sampleConfig = `
## urls to ping
urls = ["www.google.com"] # required
## number of pings to send per collection (ping -n <COUNT>)
count = 4 # required
## Ping timeout, in seconds. 0 means default timeout (ping -w <TIMEOUT>)
Timeout = 0
`
func (s *Ping) SampleConfig() string {
return sampleConfig
}
func hostPinger(timeout float64, args ...string) (string, error) {
bin, err := exec.LookPath("ping")
if err != nil {
return "", err
}
c := exec.Command(bin, args...)
out, err := internal.CombinedOutputTimeout(c,
time.Second*time.Duration(timeout+1))
return string(out), err
}
// processPingOutput takes in a string output from the ping command
// based on linux implementation but using regex ( multilanguage support ) ( shouldn't affect the performance of the program )
// It returns (<transmitted packets>, <received reply>, <received packet>, <average response>, <min response>, <max response>)
func processPingOutput(out string) (int, int, int, int, int, int, error) {
// So find a line contain 3 numbers except reply lines
var stats, aproxs []string = nil, nil
err := errors.New("Fatal error processing ping output")
stat := regexp.MustCompile(`=\W*(\d+)\D*=\W*(\d+)\D*=\W*(\d+)`)
aprox := regexp.MustCompile(`=\W*(\d+)\D*ms\D*=\W*(\d+)\D*ms\D*=\W*(\d+)\D*ms`)
tttLine := regexp.MustCompile(`TTL=\d+`)
lines := strings.Split(out, "\n")
var receivedReply int = 0
for _, line := range lines {
if tttLine.MatchString(line) {
receivedReply++
} else {
if stats == nil {
stats = stat.FindStringSubmatch(line)
}
if stats != nil && aproxs == nil {
aproxs = aprox.FindStringSubmatch(line)
}
}
}
// stats data should contain 4 members: entireExpression + ( Send, Receive, Lost )
if len(stats) != 4 {
return 0, 0, 0, 0, 0, 0, err
}
trans, err := strconv.Atoi(stats[1])
if err != nil {
return 0, 0, 0, 0, 0, 0, err
}
receivedPacket, err := strconv.Atoi(stats[2])
if err != nil {
return 0, 0, 0, 0, 0, 0, err
}
// aproxs data should contain 4 members: entireExpression + ( min, max, avg )
if len(aproxs) != 4 {
return trans, receivedReply, receivedPacket, 0, 0, 0, err
}
min, err := strconv.Atoi(aproxs[1])
if err != nil {
return trans, receivedReply, receivedPacket, 0, 0, 0, err
}
max, err := strconv.Atoi(aproxs[2])
if err != nil {
return trans, receivedReply, receivedPacket, 0, 0, 0, err
}
avg, err := strconv.Atoi(aproxs[3])
if err != nil {
return 0, 0, 0, 0, 0, 0, err
}
return trans, receivedReply, receivedPacket, avg, min, max, err
}
func (p *Ping) timeout() float64 {
// According to MSDN, default ping timeout for windows is 4 second
// Add also one second interval
if p.Timeout > 0 {
return p.Timeout + 1
}
return 4 + 1
}
// args returns the arguments for the 'ping' executable
func (p *Ping) args(url string) []string {
args := []string{"-n", strconv.Itoa(p.Count)}
if p.Timeout > 0 {
args = append(args, "-w", strconv.FormatFloat(p.Timeout*1000, 'f', 0, 64))
}
args = append(args, url)
return args
}
func (p *Ping) Gather(acc telegraf.Accumulator) error {
var wg sync.WaitGroup
errorChannel := make(chan error, len(p.Urls)*2)
var pendingError error = nil
// Spin off a go routine for each url to ping
for _, url := range p.Urls {
wg.Add(1)
go func(u string) {
defer wg.Done()
args := p.args(u)
totalTimeout := p.timeout() * float64(p.Count)
out, err := p.pingHost(totalTimeout, args...)
// ping host return exitcode != 0 also when there was no response from host
// but command was execute succesfully
if err != nil {
// Combine go err + stderr output
pendingError = errors.New(strings.TrimSpace(out) + ", " + err.Error())
}
tags := map[string]string{"url": u}
trans, recReply, receivePacket, avg, min, max, err := processPingOutput(out)
if err != nil {
// fatal error
if pendingError != nil {
errorChannel <- pendingError
}
errorChannel <- err
fields := map[string]interface{}{
"errors": 100.0,
}
acc.AddFields("ping", fields, tags)
return
}
// Calculate packet loss percentage
lossReply := float64(trans-recReply) / float64(trans) * 100.0
lossPackets := float64(trans-receivePacket) / float64(trans) * 100.0
fields := map[string]interface{}{
"packets_transmitted": trans,
"reply_received": recReply,
"packets_received": receivePacket,
"percent_packet_loss": lossPackets,
"percent_reply_loss": lossReply,
}
if avg > 0 {
fields["average_response_ms"] = avg
}
if min > 0 {
fields["minimum_response_ms"] = min
}
if max > 0 {
fields["maximum_response_ms"] = max
}
acc.AddFields("ping", fields, tags)
}(url)
}
wg.Wait()
close(errorChannel)
// Get all errors and return them as one giant error
errorStrings := []string{}
for err := range errorChannel {
errorStrings = append(errorStrings, err.Error())
}
if len(errorStrings) == 0 {
return nil
}
return errors.New(strings.Join(errorStrings, "\n"))
}
func init() {
inputs.Add("ping", func() telegraf.Input {
return &Ping{pingHost: hostPinger}
})
}

View File

@@ -0,0 +1,328 @@
// +build windows
package ping
import (
"errors"
"github.com/influxdata/telegraf/testutil"
"github.com/stretchr/testify/assert"
"testing"
)
// Windows ping format ( should support multilanguage ?)
var winPLPingOutput = `
Badanie 8.8.8.8 z 32 bajtami danych:
Odpowiedz z 8.8.8.8: bajtow=32 czas=49ms TTL=43
Odpowiedz z 8.8.8.8: bajtow=32 czas=46ms TTL=43
Odpowiedz z 8.8.8.8: bajtow=32 czas=48ms TTL=43
Odpowiedz z 8.8.8.8: bajtow=32 czas=57ms TTL=43
Statystyka badania ping dla 8.8.8.8:
Pakiety: Wyslane = 4, Odebrane = 4, Utracone = 0
(0% straty),
Szacunkowy czas bladzenia pakietww w millisekundach:
Minimum = 46 ms, Maksimum = 57 ms, Czas sredni = 50 ms
`
// Windows ping format ( should support multilanguage ?)
var winENPingOutput = `
Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=52ms TTL=43
Reply from 8.8.8.8: bytes=32 time=50ms TTL=43
Reply from 8.8.8.8: bytes=32 time=50ms TTL=43
Reply from 8.8.8.8: bytes=32 time=51ms TTL=43
Ping statistics for 8.8.8.8:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 50ms, Maximum = 52ms, Average = 50ms
`
func TestHost(t *testing.T) {
trans, recReply, recPacket, avg, min, max, err := processPingOutput(winPLPingOutput)
assert.NoError(t, err)
assert.Equal(t, 4, trans, "4 packets were transmitted")
assert.Equal(t, 4, recReply, "4 packets were reply")
assert.Equal(t, 4, recPacket, "4 packets were received")
assert.Equal(t, 50, avg, "Average 50")
assert.Equal(t, 46, min, "Min 46")
assert.Equal(t, 57, max, "max 57")
trans, recReply, recPacket, avg, min, max, err = processPingOutput(winENPingOutput)
assert.NoError(t, err)
assert.Equal(t, 4, trans, "4 packets were transmitted")
assert.Equal(t, 4, recReply, "4 packets were reply")
assert.Equal(t, 4, recPacket, "4 packets were received")
assert.Equal(t, 50, avg, "Average 50")
assert.Equal(t, 50, min, "Min 50")
assert.Equal(t, 52, max, "Max 52")
}
func mockHostPinger(timeout float64, args ...string) (string, error) {
return winENPingOutput, nil
}
// Test that Gather function works on a normal ping
func TestPingGather(t *testing.T) {
var acc testutil.Accumulator
p := Ping{
Urls: []string{"www.google.com", "www.reddit.com"},
pingHost: mockHostPinger,
}
p.Gather(&acc)
tags := map[string]string{"url": "www.google.com"}
fields := map[string]interface{}{
"packets_transmitted": 4,
"packets_received": 4,
"reply_received": 4,
"percent_packet_loss": 0.0,
"percent_reply_loss": 0.0,
"average_response_ms": 50,
"minimum_response_ms": 50,
"maximum_response_ms": 52,
}
acc.AssertContainsTaggedFields(t, "ping", fields, tags)
tags = map[string]string{"url": "www.reddit.com"}
acc.AssertContainsTaggedFields(t, "ping", fields, tags)
}
var errorPingOutput = `
Badanie nask.pl [195.187.242.157] z 32 bajtami danych:
Upłynął limit czasu żądania.
Upłynął limit czasu żądania.
Upłynął limit czasu żądania.
Upłynął limit czasu żądania.
Statystyka badania ping dla 195.187.242.157:
Pakiety: Wysłane = 4, Odebrane = 0, Utracone = 4
(100% straty),
`
func mockErrorHostPinger(timeout float64, args ...string) (string, error) {
return errorPingOutput, errors.New("No packets received")
}
// Test that Gather works on a ping with no transmitted packets, even though the
// command returns an error
func TestBadPingGather(t *testing.T) {
var acc testutil.Accumulator
p := Ping{
Urls: []string{"www.amazon.com"},
pingHost: mockErrorHostPinger,
}
p.Gather(&acc)
tags := map[string]string{"url": "www.amazon.com"}
fields := map[string]interface{}{
"packets_transmitted": 4,
"packets_received": 0,
"reply_received": 0,
"percent_packet_loss": 100.0,
"percent_reply_loss": 100.0,
}
acc.AssertContainsTaggedFields(t, "ping", fields, tags)
}
var lossyPingOutput = `
Badanie thecodinglove.com [66.6.44.4] z 9800 bajtami danych:
Upłynął limit czasu żądania.
Odpowiedź z 66.6.44.4: bajtów=9800 czas=114ms TTL=48
Odpowiedź z 66.6.44.4: bajtów=9800 czas=114ms TTL=48
Odpowiedź z 66.6.44.4: bajtów=9800 czas=118ms TTL=48
Odpowiedź z 66.6.44.4: bajtów=9800 czas=114ms TTL=48
Odpowiedź z 66.6.44.4: bajtów=9800 czas=114ms TTL=48
Upłynął limit czasu żądania.
Odpowiedź z 66.6.44.4: bajtów=9800 czas=119ms TTL=48
Odpowiedź z 66.6.44.4: bajtów=9800 czas=116ms TTL=48
Statystyka badania ping dla 66.6.44.4:
Pakiety: Wysłane = 9, Odebrane = 7, Utracone = 2
(22% straty),
Szacunkowy czas błądzenia pakietów w millisekundach:
Minimum = 114 ms, Maksimum = 119 ms, Czas średni = 115 ms
`
func mockLossyHostPinger(timeout float64, args ...string) (string, error) {
return lossyPingOutput, nil
}
// Test that Gather works on a ping with lossy packets
func TestLossyPingGather(t *testing.T) {
var acc testutil.Accumulator
p := Ping{
Urls: []string{"www.google.com"},
pingHost: mockLossyHostPinger,
}
p.Gather(&acc)
tags := map[string]string{"url": "www.google.com"}
fields := map[string]interface{}{
"packets_transmitted": 9,
"packets_received": 7,
"reply_received": 7,
"percent_packet_loss": 22.22222222222222,
"percent_reply_loss": 22.22222222222222,
"average_response_ms": 115,
"minimum_response_ms": 114,
"maximum_response_ms": 119,
}
acc.AssertContainsTaggedFields(t, "ping", fields, tags)
}
// Fatal ping output (invalid argument)
var fatalPingOutput = `
Bad option -d.
Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS]
[-r count] [-s count] [[-j host-list] | [-k host-list]]
[-w timeout] [-R] [-S srcaddr] [-4] [-6] target_name
Options:
-t Ping the specified host until stopped.
To see statistics and continue - type Control-Break;
To stop - type Control-C.
-a Resolve addresses to hostnames.
-n count Number of echo requests to send.
-l size Send buffer size.
-f Set Don't Fragment flag in packet (IPv4-only).
-i TTL Time To Live.
-v TOS Type Of Service (IPv4-only. This setting has been deprecated
and has no effect on the type of service field in the IP Header).
-r count Record route for count hops (IPv4-only).
-s count Timestamp for count hops (IPv4-only).
-j host-list Loose source route along host-list (IPv4-only).
-k host-list Strict source route along host-list (IPv4-only).
-w timeout Timeout in milliseconds to wait for each reply.
-R Use routing header to test reverse route also (IPv6-only).
-S srcaddr Source address to use.
-4 Force using IPv4.
-6 Force using IPv6.
`
func mockFatalHostPinger(timeout float64, args ...string) (string, error) {
return fatalPingOutput, errors.New("So very bad")
}
// Test that a fatal ping command does not gather any statistics.
func TestFatalPingGather(t *testing.T) {
var acc testutil.Accumulator
p := Ping{
Urls: []string{"www.amazon.com"},
pingHost: mockFatalHostPinger,
}
p.Gather(&acc)
assert.True(t, acc.HasFloatField("ping", "errors"),
"Fatal ping should have packet measurements")
assert.False(t, acc.HasIntField("ping", "packets_transmitted"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "packets_received"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasFloatField("ping", "percent_packet_loss"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasFloatField("ping", "percent_reply_loss"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "average_response_ms"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "maximum_response_ms"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "minimum_response_ms"),
"Fatal ping should not have packet measurements")
}
var UnreachablePingOutput = `
Pinging www.google.pl [8.8.8.8] with 32 bytes of data:
Request timed out.
Request timed out.
Reply from 194.204.175.50: Destination net unreachable.
Request timed out.
Ping statistics for 8.8.8.8:
Packets: Sent = 4, Received = 1, Lost = 3 (75% loss),
`
func mockUnreachableHostPinger(timeout float64, args ...string) (string, error) {
return UnreachablePingOutput, errors.New("So very bad")
}
//Reply from 185.28.251.217: TTL expired in transit.
// in case 'Destination net unreachable' ping app return receive packet which is not what we need
// it's not contain valid metric so treat it as lost one
func TestUnreachablePingGather(t *testing.T) {
var acc testutil.Accumulator
p := Ping{
Urls: []string{"www.google.com"},
pingHost: mockUnreachableHostPinger,
}
p.Gather(&acc)
tags := map[string]string{"url": "www.google.com"}
fields := map[string]interface{}{
"packets_transmitted": 4,
"packets_received": 1,
"reply_received": 0,
"percent_packet_loss": 75.0,
"percent_reply_loss": 100.0,
}
acc.AssertContainsTaggedFields(t, "ping", fields, tags)
assert.False(t, acc.HasFloatField("ping", "errors"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "average_response_ms"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "maximum_response_ms"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "minimum_response_ms"),
"Fatal ping should not have packet measurements")
}
var TTLExpiredPingOutput = `
Pinging www.google.pl [8.8.8.8] with 32 bytes of data:
Request timed out.
Request timed out.
Reply from 185.28.251.217: TTL expired in transit.
Request timed out.
Ping statistics for 8.8.8.8:
Packets: Sent = 4, Received = 1, Lost = 3 (75% loss),
`
func mockTTLExpiredPinger(timeout float64, args ...string) (string, error) {
return TTLExpiredPingOutput, errors.New("So very bad")
}
// in case 'Destination net unreachable' ping app return receive packet which is not what we need
// it's not contain valid metric so treat it as lost one
func TestTTLExpiredPingGather(t *testing.T) {
var acc testutil.Accumulator
p := Ping{
Urls: []string{"www.google.com"},
pingHost: mockTTLExpiredPinger,
}
p.Gather(&acc)
tags := map[string]string{"url": "www.google.com"}
fields := map[string]interface{}{
"packets_transmitted": 4,
"packets_received": 1,
"reply_received": 0,
"percent_packet_loss": 75.0,
"percent_reply_loss": 100.0,
}
acc.AssertContainsTaggedFields(t, "ping", fields, tags)
assert.False(t, acc.HasFloatField("ping", "errors"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "average_response_ms"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "maximum_response_ms"),
"Fatal ping should not have packet measurements")
assert.False(t, acc.HasIntField("ping", "minimum_response_ms"),
"Fatal ping should not have packet measurements")
}

View File

@@ -266,29 +266,33 @@ func (p *Postgresql) accRow(meas_name string, row scanner, acc telegraf.Accumula
tags := map[string]string{}
tags["server"] = tagAddress
tags["db"] = dbname.String()
var isATag int
fields := make(map[string]interface{})
COLUMN:
for col, val := range columnMap {
if acc.Debug() {
log.Printf("postgresql_extensible: column: %s = %T: %s\n", col, *val, *val)
}
_, ignore := ignoredColumns[col]
if !ignore && *val != nil {
isATag = 0
for tag := range p.AdditionalTags {
if col == p.AdditionalTags[tag] {
isATag = 1
value_type_p := fmt.Sprintf(`%T`, *val)
if value_type_p == "[]uint8" {
tags[col] = fmt.Sprintf(`%s`, *val)
} else if value_type_p == "int64" {
tags[col] = fmt.Sprintf(`%v`, *val)
}
}
if ignore || *val == nil {
continue
}
for _, tag := range p.AdditionalTags {
if col != tag {
continue
}
if isATag == 0 {
fields[col] = *val
switch v := (*val).(type) {
case []byte:
tags[col] = string(v)
case int64:
tags[col] = fmt.Sprintf("%d", v)
}
continue COLUMN
}
if v, ok := (*val).([]byte); ok {
fields[col] = string(v)
} else {
fields[col] = *val
}
}
acc.AddFields(meas_name, fields, tags)

View File

@@ -71,7 +71,7 @@ func (p *SpecProcessor) pushMetrics() {
fields[prefix+"read_count"] = io.ReadCount
fields[prefix+"write_count"] = io.WriteCount
fields[prefix+"read_bytes"] = io.ReadBytes
fields[prefix+"write_bytes"] = io.WriteCount
fields[prefix+"write_bytes"] = io.WriteBytes
}
cpu_time, err := p.proc.Times()

View File

@@ -9,35 +9,59 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
"github.com/influxdata/telegraf/internal/errchan"
"github.com/influxdata/telegraf/plugins/inputs"
)
// DefaultUsername will set a default value that corrasponds to the default
// value used by Rabbitmq
const DefaultUsername = "guest"
// DefaultPassword will set a default value that corrasponds to the default
// value used by Rabbitmq
const DefaultPassword = "guest"
// DefaultURL will set a default value that corrasponds to the default value
// used by Rabbitmq
const DefaultURL = "http://localhost:15672"
// RabbitMQ defines the configuration necessary for gathering metrics,
// see the sample config for further details
type RabbitMQ struct {
URL string
Name string
Username string
Password string
Nodes []string
Queues []string
// Path to CA file
SSLCA string `toml:"ssl_ca"`
// Path to host cert file
SSLCert string `toml:"ssl_cert"`
// Path to cert key file
SSLKey string `toml:"ssl_key"`
// Use SSL but skip chain & host verification
InsecureSkipVerify bool
// InsecureSkipVerify bool
Nodes []string
Queues []string
Client *http.Client
}
// OverviewResponse ...
type OverviewResponse struct {
MessageStats *MessageStats `json:"message_stats"`
ObjectTotals *ObjectTotals `json:"object_totals"`
QueueTotals *QueueTotals `json:"queue_totals"`
}
// Details ...
type Details struct {
Rate float64
}
// MessageStats ...
type MessageStats struct {
Ack int64
AckDetails Details `json:"ack_details"`
@@ -51,6 +75,7 @@ type MessageStats struct {
RedeliverDetails Details `json:"redeliver_details"`
}
// ObjectTotals ...
type ObjectTotals struct {
Channels int64
Connections int64
@@ -59,6 +84,7 @@ type ObjectTotals struct {
Queues int64
}
// QueueTotals ...
type QueueTotals struct {
Messages int64
MessagesReady int64 `json:"messages_ready"`
@@ -66,10 +92,11 @@ type QueueTotals struct {
MessageBytes int64 `json:"message_bytes"`
MessageBytesReady int64 `json:"message_bytes_ready"`
MessageBytesUnacknowledged int64 `json:"message_bytes_unacknowledged"`
MessageRam int64 `json:"message_bytes_ram"`
MessageRAM int64 `json:"message_bytes_ram"`
MessagePersistent int64 `json:"message_bytes_persistent"`
}
// Queue ...
type Queue struct {
QueueTotals // just to not repeat the same code
MessageStats `json:"message_stats"`
@@ -83,6 +110,7 @@ type Queue struct {
AutoDelete bool `json:"auto_delete"`
}
// Node ...
type Node struct {
Name string
@@ -99,6 +127,7 @@ type Node struct {
SocketsUsed int64 `json:"sockets_used"`
}
// gatherFunc ...
type gatherFunc func(r *RabbitMQ, acc telegraf.Accumulator, errChan chan error)
var gatherFunctions = []gatherFunc{gatherOverview, gatherNodes, gatherQueues}
@@ -109,22 +138,40 @@ var sampleConfig = `
# username = "guest"
# password = "guest"
## Optional SSL Config
# ssl_ca = "/etc/telegraf/ca.pem"
# ssl_cert = "/etc/telegraf/cert.pem"
# ssl_key = "/etc/telegraf/key.pem"
## Use SSL but skip chain & host verification
# insecure_skip_verify = false
## A list of nodes to pull metrics about. If not specified, metrics for
## all nodes are gathered.
# nodes = ["rabbit@node1", "rabbit@node2"]
`
// SampleConfig ...
func (r *RabbitMQ) SampleConfig() string {
return sampleConfig
}
// Description ...
func (r *RabbitMQ) Description() string {
return "Read metrics from one or many RabbitMQ servers via the management API"
}
// Gather ...
func (r *RabbitMQ) Gather(acc telegraf.Accumulator) error {
if r.Client == nil {
tr := &http.Transport{ResponseHeaderTimeout: time.Duration(3 * time.Second)}
tlsCfg, err := internal.GetTLSConfig(
r.SSLCert, r.SSLKey, r.SSLCA, r.InsecureSkipVerify)
if err != nil {
return err
}
tr := &http.Transport{
ResponseHeaderTimeout: time.Duration(3 * time.Second),
TLSClientConfig: tlsCfg,
}
r.Client = &http.Client{
Transport: tr,
Timeout: time.Duration(4 * time.Second),
@@ -286,7 +333,7 @@ func gatherQueues(r *RabbitMQ, acc telegraf.Accumulator, errChan chan error) {
"message_bytes": queue.MessageBytes,
"message_bytes_ready": queue.MessageBytesReady,
"message_bytes_unacked": queue.MessageBytesUnacknowledged,
"message_bytes_ram": queue.MessageRam,
"message_bytes_ram": queue.MessageRAM,
"message_bytes_persist": queue.MessagePersistent,
"messages": queue.Messages,
"messages_ready": queue.MessagesReady,

View File

@@ -99,7 +99,7 @@ func (r *Redis) Gather(acc telegraf.Accumulator) error {
var wg sync.WaitGroup
errChan := errchan.New(len(r.Servers))
for _, serv := range r.Servers {
if !strings.HasPrefix(serv, "tcp://") || !strings.HasPrefix(serv, "unix://") {
if !strings.HasPrefix(serv, "tcp://") && !strings.HasPrefix(serv, "unix://") {
serv = "tcp://" + serv
}

View File

@@ -0,0 +1,47 @@
# sensors Input Plugin
Collect [lm-sensors](https://en.wikipedia.org/wiki/Lm_sensors) metrics - requires the lm-sensors
package installed.
This plugin collects sensor metrics with the `sensors` executable from the lm-sensor package.
### Configuration:
```
# Monitor sensors, requires lm-sensors package
[[inputs.sensors]]
## Remove numbers from field names.
## If true, a field name like 'temp1_input' will be changed to 'temp_input'.
# remove_numbers = true
```
### Measurements & Fields:
Fields are created dynamicaly depending on the sensors. All fields are float.
### Tags:
- All measurements have the following tags:
- chip
- feature
### Example Output:
#### Default
```
$ telegraf -config telegraf.conf -input-filter sensors -test
* Plugin: sensors, Collection 1
> sensors,chip=power_meter-acpi-0,feature=power1 power_average=0,power_average_interval=300 1466751326000000000
> sensors,chip=k10temp-pci-00c3,feature=temp1 temp_crit=70,temp_crit_hyst=65,temp_input=29,temp_max=70 1466751326000000000
> sensors,chip=k10temp-pci-00cb,feature=temp1 temp_input=29,temp_max=70 1466751326000000000
> sensors,chip=k10temp-pci-00d3,feature=temp1 temp_input=27.5,temp_max=70 1466751326000000000
> sensors,chip=k10temp-pci-00db,feature=temp1 temp_crit=70,temp_crit_hyst=65,temp_input=29.5,temp_max=70 1466751326000000000
```
#### With remove_numbers=false
```
* Plugin: sensors, Collection 1
> sensors,chip=power_meter-acpi-0,feature=power1 power1_average=0,power1_average_interval=300 1466753424000000000
> sensors,chip=k10temp-pci-00c3,feature=temp1 temp1_crit=70,temp1_crit_hyst=65,temp1_input=29.125,temp1_max=70 1466753424000000000
> sensors,chip=k10temp-pci-00cb,feature=temp1 temp1_input=29,temp1_max=70 1466753424000000000
> sensors,chip=k10temp-pci-00d3,feature=temp1 temp1_input=29.5,temp1_max=70 1466753424000000000
> sensors,chip=k10temp-pci-00db,feature=temp1 temp1_crit=70,temp1_crit_hyst=65,temp1_input=30,temp1_max=70 1466753424000000000
```

View File

@@ -1,91 +1,118 @@
// +build linux,sensors
// +build linux
package sensors
import (
"errors"
"fmt"
"os/exec"
"regexp"
"strconv"
"strings"
"github.com/md14454/gosensors"
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
"github.com/influxdata/telegraf/plugins/inputs"
)
var (
execCommand = exec.Command // execCommand is used to mock commands in tests.
numberRegp = regexp.MustCompile("[0-9]+")
)
type Sensors struct {
Sensors []string
RemoveNumbers bool `toml:"remove_numbers"`
path string
}
func (_ *Sensors) Description() string {
return "Monitor sensors using lm-sensors package"
func (*Sensors) Description() string {
return "Monitor sensors, requires lm-sensors package"
}
var sensorsSampleConfig = `
## By default, telegraf gathers stats from all sensors detected by the
## lm-sensors module.
##
## Only collect stats from the selected sensors. Sensors are listed as
## <chip name>:<feature name>. This information can be found by running the
## sensors command, e.g. sensors -u
##
## A * as the feature name will return all features of the chip
##
# sensors = ["coretemp-isa-0000:Core 0", "coretemp-isa-0001:*"]
func (*Sensors) SampleConfig() string {
return `
## Remove numbers from field names.
## If true, a field name like 'temp1_input' will be changed to 'temp_input'.
# remove_numbers = true
`
func (_ *Sensors) SampleConfig() string {
return sensorsSampleConfig
}
func (s *Sensors) Gather(acc telegraf.Accumulator) error {
gosensors.Init()
defer gosensors.Cleanup()
for _, chip := range gosensors.GetDetectedChips() {
for _, feature := range chip.GetFeatures() {
chipName := chip.String()
featureLabel := feature.GetLabel()
if len(s.Sensors) != 0 {
var found bool
for _, sensor := range s.Sensors {
parts := strings.SplitN(sensor, ":", 2)
if parts[0] == chipName {
if parts[1] == "*" || parts[1] == featureLabel {
found = true
break
}
}
}
if !found {
continue
}
}
tags := map[string]string{
"chip": chipName,
"adapter": chip.AdapterName(),
"feature-name": feature.Name,
"feature-label": featureLabel,
}
fieldName := chipName + ":" + featureLabel
fields := map[string]interface{}{
fieldName: feature.GetValue(),
}
acc.AddFields("sensors", fields, tags)
}
if len(s.path) == 0 {
return errors.New("sensors not found: verify that lm-sensors package is installed and that sensors is in your PATH")
}
return s.parse(acc)
}
// parse forks the command:
// sensors -u -A
// and parses the output to add it to the telegraf.Accumulator.
func (s *Sensors) parse(acc telegraf.Accumulator) error {
tags := map[string]string{}
fields := map[string]interface{}{}
chip := ""
cmd := execCommand(s.path, "-A", "-u")
out, err := internal.CombinedOutputTimeout(cmd, time.Second*5)
if err != nil {
return fmt.Errorf("failed to run command %s: %s - %s", strings.Join(cmd.Args, " "), err, string(out))
}
lines := strings.Split(strings.TrimSpace(string(out)), "\n")
for _, line := range lines {
if len(line) == 0 {
acc.AddFields("sensors", fields, tags)
chip = ""
tags = map[string]string{}
fields = map[string]interface{}{}
continue
}
if len(chip) == 0 {
chip = line
tags["chip"] = chip
continue
}
if !strings.HasPrefix(line, " ") {
if len(tags) > 1 {
acc.AddFields("sensors", fields, tags)
}
fields = map[string]interface{}{}
tags = map[string]string{
"chip": chip,
"feature": strings.TrimRight(snake(line), ":"),
}
} else {
splitted := strings.Split(line, ":")
fieldName := strings.TrimSpace(splitted[0])
if s.RemoveNumbers {
fieldName = numberRegp.ReplaceAllString(fieldName, "")
}
fieldValue, err := strconv.ParseFloat(strings.TrimSpace(splitted[1]), 64)
if err != nil {
return err
}
fields[fieldName] = fieldValue
}
}
acc.AddFields("sensors", fields, tags)
return nil
}
func init() {
s := Sensors{
RemoveNumbers: true,
}
path, _ := exec.LookPath("sensors")
if len(path) > 0 {
s.path = path
}
inputs.Add("sensors", func() telegraf.Input {
return &Sensors{}
return &s
})
}
// snake converts string to snake case
func snake(input string) string {
return strings.ToLower(strings.Replace(input, " ", "_", -1))
}

View File

@@ -1,3 +0,0 @@
// +build !linux !sensors
package sensors

View File

@@ -0,0 +1,3 @@
// +build !linux
package sensors

View File

@@ -0,0 +1,328 @@
// +build linux
package sensors
import (
"fmt"
"os"
"os/exec"
"testing"
"github.com/influxdata/telegraf/testutil"
)
func TestGatherDefault(t *testing.T) {
s := Sensors{
RemoveNumbers: true,
path: "sensors",
}
// overwriting exec commands with mock commands
execCommand = fakeExecCommand
defer func() { execCommand = exec.Command }()
var acc testutil.Accumulator
err := s.Gather(&acc)
if err != nil {
t.Fatal(err)
}
var tests = []struct {
tags map[string]string
fields map[string]interface{}
}{
{
map[string]string{
"chip": "acpitz-virtual-0",
"feature": "temp1",
},
map[string]interface{}{
"temp_input": 8.3,
"temp_crit": 31.3,
},
},
{
map[string]string{
"chip": "power_meter-acpi-0",
"feature": "power1",
},
map[string]interface{}{
"power_average": 0.0,
"power_average_interval": 300.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0000",
"feature": "physical_id_0",
},
map[string]interface{}{
"temp_input": 77.0,
"temp_max": 82.0,
"temp_crit": 92.0,
"temp_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0000",
"feature": "core_0",
},
map[string]interface{}{
"temp_input": 75.0,
"temp_max": 82.0,
"temp_crit": 92.0,
"temp_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0000",
"feature": "core_1",
},
map[string]interface{}{
"temp_input": 77.0,
"temp_max": 82.0,
"temp_crit": 92.0,
"temp_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0001",
"feature": "physical_id_1",
},
map[string]interface{}{
"temp_input": 70.0,
"temp_max": 82.0,
"temp_crit": 92.0,
"temp_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0001",
"feature": "core_0",
},
map[string]interface{}{
"temp_input": 66.0,
"temp_max": 82.0,
"temp_crit": 92.0,
"temp_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0001",
"feature": "core_1",
},
map[string]interface{}{
"temp_input": 70.0,
"temp_max": 82.0,
"temp_crit": 92.0,
"temp_crit_alarm": 0.0,
},
},
}
for _, test := range tests {
acc.AssertContainsTaggedFields(t, "sensors", test.fields, test.tags)
}
}
func TestGatherNotRemoveNumbers(t *testing.T) {
s := Sensors{
RemoveNumbers: false,
path: "sensors",
}
// overwriting exec commands with mock commands
execCommand = fakeExecCommand
defer func() { execCommand = exec.Command }()
var acc testutil.Accumulator
err := s.Gather(&acc)
if err != nil {
t.Fatal(err)
}
var tests = []struct {
tags map[string]string
fields map[string]interface{}
}{
{
map[string]string{
"chip": "acpitz-virtual-0",
"feature": "temp1",
},
map[string]interface{}{
"temp1_input": 8.3,
"temp1_crit": 31.3,
},
},
{
map[string]string{
"chip": "power_meter-acpi-0",
"feature": "power1",
},
map[string]interface{}{
"power1_average": 0.0,
"power1_average_interval": 300.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0000",
"feature": "physical_id_0",
},
map[string]interface{}{
"temp1_input": 77.0,
"temp1_max": 82.0,
"temp1_crit": 92.0,
"temp1_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0000",
"feature": "core_0",
},
map[string]interface{}{
"temp2_input": 75.0,
"temp2_max": 82.0,
"temp2_crit": 92.0,
"temp2_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0000",
"feature": "core_1",
},
map[string]interface{}{
"temp3_input": 77.0,
"temp3_max": 82.0,
"temp3_crit": 92.0,
"temp3_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0001",
"feature": "physical_id_1",
},
map[string]interface{}{
"temp1_input": 70.0,
"temp1_max": 82.0,
"temp1_crit": 92.0,
"temp1_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0001",
"feature": "core_0",
},
map[string]interface{}{
"temp2_input": 66.0,
"temp2_max": 82.0,
"temp2_crit": 92.0,
"temp2_crit_alarm": 0.0,
},
},
{
map[string]string{
"chip": "coretemp-isa-0001",
"feature": "core_1",
},
map[string]interface{}{
"temp3_input": 70.0,
"temp3_max": 82.0,
"temp3_crit": 92.0,
"temp3_crit_alarm": 0.0,
},
},
}
for _, test := range tests {
acc.AssertContainsTaggedFields(t, "sensors", test.fields, test.tags)
}
}
// fackeExecCommand is a helper function that mock
// the exec.Command call (and call the test binary)
func fakeExecCommand(command string, args ...string) *exec.Cmd {
cs := []string{"-test.run=TestHelperProcess", "--", command}
cs = append(cs, args...)
cmd := exec.Command(os.Args[0], cs...)
cmd.Env = []string{"GO_WANT_HELPER_PROCESS=1"}
return cmd
}
// TestHelperProcess isn't a real test. It's used to mock exec.Command
// For example, if you run:
// GO_WANT_HELPER_PROCESS=1 go test -test.run=TestHelperProcess -- chrony tracking
// it returns below mockData.
func TestHelperProcess(t *testing.T) {
if os.Getenv("GO_WANT_HELPER_PROCESS") != "1" {
return
}
mockData := `acpitz-virtual-0
temp1:
temp1_input: 8.300
temp1_crit: 31.300
power_meter-acpi-0
power1:
power1_average: 0.000
power1_average_interval: 300.000
coretemp-isa-0000
Physical id 0:
temp1_input: 77.000
temp1_max: 82.000
temp1_crit: 92.000
temp1_crit_alarm: 0.000
Core 0:
temp2_input: 75.000
temp2_max: 82.000
temp2_crit: 92.000
temp2_crit_alarm: 0.000
Core 1:
temp3_input: 77.000
temp3_max: 82.000
temp3_crit: 92.000
temp3_crit_alarm: 0.000
coretemp-isa-0001
Physical id 1:
temp1_input: 70.000
temp1_max: 82.000
temp1_crit: 92.000
temp1_crit_alarm: 0.000
Core 0:
temp2_input: 66.000
temp2_max: 82.000
temp2_crit: 92.000
temp2_crit_alarm: 0.000
Core 1:
temp3_input: 70.000
temp3_max: 82.000
temp3_crit: 92.000
temp3_crit_alarm: 0.000
`
args := os.Args
// Previous arguments are tests stuff, that looks like :
// /tmp/go-build970079519/…/_test/integration.test -test.run=TestHelperProcess --
cmd, args := args[3], args[4:]
if cmd == "sensors" {
fmt.Fprint(os.Stdout, mockData)
} else {
fmt.Fprint(os.Stdout, "command not found")
os.Exit(1)
}
os.Exit(0)
}

View File

@@ -1,549 +1,167 @@
# SNMP Input Plugin
# SNMP Plugin
The SNMP input plugin gathers metrics from SNMP agents
The SNMP input plugin gathers metrics from SNMP agents.
### Configuration:
## Configuration:
### Example:
#### Very simple example
In this example, the plugin will gather value of OIDS:
- `.1.3.6.1.2.1.2.2.1.4.1`
```toml
# Very Simple Example
[[inputs.snmp]]
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Simple list of OIDs to get, in addition to "collect"
get_oids = [".1.3.6.1.2.1.2.2.1.4.1"]
SNMP data:
```
.1.0.0.0.1.1.0 octet_str "foo"
.1.0.0.0.1.1.1 octet_str "bar"
.1.0.0.0.1.102 octet_str "bad"
.1.0.0.0.1.2.0 integer 1
.1.0.0.0.1.2.1 integer 2
.1.0.0.0.1.3.0 octet_str "0.123"
.1.0.0.0.1.3.1 octet_str "0.456"
.1.0.0.0.1.3.2 octet_str "9.999"
.1.0.0.1.1 octet_str "baz"
.1.0.0.1.2 uinteger 54321
.1.0.0.1.3 uinteger 234
```
#### Simple example
In this example, Telegraf gathers value of OIDS:
- named **ifnumber**
- named **interface_speed**
With **inputs.snmp.get** section the plugin gets the oid number:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed*
As you can see *ifSpeed* is not a valid OID. In order to get
the valid OID, the plugin uses `snmptranslate_file` to match the OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5`
Also as the plugin will append `instance` to the corresponding OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5.1`
In this example, the plugin will gather value of OIDS:
- `.1.3.6.1.2.1.2.1.0`
- `.1.3.6.1.2.1.2.2.1.5.1`
Telegraf config:
```toml
# Simple example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which get/bulk do you want to collect for this host
collect = ["ifnumber", "interface_speed"]
agents = [ "127.0.0.1:161" ]
version = 2
community = "public"
[[inputs.snmp.get]]
name = "ifnumber"
oid = ".1.3.6.1.2.1.2.1.0"
name = "system"
[[inputs.snmp.field]]
name = "hostname"
oid = ".1.0.0.1.1"
is_tag = true
[[inputs.snmp.field]]
name = "uptime"
oid = ".1.0.0.1.2"
[[inputs.snmp.field]]
name = "loadavg"
oid = ".1.0.0.1.3"
conversion = "float(2)"
[[inputs.snmp.get]]
name = "interface_speed"
oid = "ifSpeed"
instance = "1"
```
#### Simple bulk example
In this example, Telegraf gathers value of OIDS:
- named **ifnumber**
- named **interface_speed**
- named **if_out_octets**
With **inputs.snmp.get** section the plugin gets oid number:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed*
With **inputs.snmp.bulk** section the plugin gets the oid number:
- **if_out_octets** => *ifOutOctets*
As you can see *ifSpeed* and *ifOutOctets* are not a valid OID.
In order to get the valid OID, the plugin uses `snmptranslate_file`
to match the OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5`
- **if_out_octets** => *ifOutOctets* => `.1.3.6.1.2.1.2.2.1.16`
Also, the plugin will append `instance` to the corresponding OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5.1`
And **if_out_octets** is a bulk request, the plugin will gathers all
OIDS in the table.
- `.1.3.6.1.2.1.2.2.1.16.1`
- `.1.3.6.1.2.1.2.2.1.16.2`
- `.1.3.6.1.2.1.2.2.1.16.3`
- `.1.3.6.1.2.1.2.2.1.16.4`
- `.1.3.6.1.2.1.2.2.1.16.5`
- `...`
In this example, the plugin will gather value of OIDS:
- `.1.3.6.1.2.1.2.1.0`
- `.1.3.6.1.2.1.2.2.1.5.1`
- `.1.3.6.1.2.1.2.2.1.16.1`
- `.1.3.6.1.2.1.2.2.1.16.2`
- `.1.3.6.1.2.1.2.2.1.16.3`
- `.1.3.6.1.2.1.2.2.1.16.4`
- `.1.3.6.1.2.1.2.2.1.16.5`
- `...`
```toml
# Simple bulk example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which get/bulk do you want to collect for this host
collect = ["interface_speed", "if_number", "if_out_octets"]
[[inputs.snmp.get]]
name = "interface_speed"
oid = "ifSpeed"
instance = "1"
[[inputs.snmp.get]]
name = "if_number"
oid = "ifNumber"
[[inputs.snmp.bulk]]
name = "if_out_octets"
oid = "ifOutOctets"
```
#### Table example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Note: This example is like a bulk request a but using an
other configuration
Telegraf gathers value of OIDS of the table:
- named **iftable1**
With **inputs.snmp.table** section the plugin gets oid number:
- **iftable1** => `.1.3.6.1.2.1.31.1.1.1`
Also **iftable1** is a table, the plugin will gathers all
OIDS in the table and in the subtables
- `.1.3.6.1.2.1.31.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.1....`
- `.1.3.6.1.2.1.31.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.2....`
- `.1.3.6.1.2.1.31.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.3....`
- `.1.3.6.1.2.1.31.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.4....`
- `.1.3.6.1.2.1.31.1.1.1.5`
- `.1.3.6.1.2.1.31.1.1.1.5....`
- `.1.3.6.1.2.1.31.1.1.1.6....`
- `...`
```toml
# Table example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which get/bulk do you want to collect for this host
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable1"
# table without mapping neither subtables
# This is like bulk request
[[inputs.snmp.table]]
name = "iftable1"
oid = ".1.3.6.1.2.1.31.1.1.1"
name = "remote_servers"
inherit_tags = [ "hostname" ]
[[inputs.snmp.table.field]]
name = "server"
oid = ".1.0.0.0.1.1"
is_tag = true
[[inputs.snmp.table.field]]
name = "connections"
oid = ".1.0.0.0.1.2"
[[inputs.snmp.table.field]]
name = "latency"
oid = ".1.0.0.0.1.3"
conversion = "float"
```
Resulting output:
```
* Plugin: snmp, Collection 1
> system,agent_host=127.0.0.1,host=mylocalhost,hostname=baz loadavg=2.34,uptime=54321i 1468953135000000000
> remote_servers,agent_host=127.0.0.1,host=mylocalhost,hostname=baz,server=foo connections=1i,latency=0.123 1468953135000000000
> remote_servers,agent_host=127.0.0.1,host=mylocalhost,hostname=baz,server=bar connections=2i,latency=0.456 1468953135000000000
```
#### Table with subtable example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Note: This example is like a bulk request a but using an
other configuration
Telegraf gathers value of OIDS of the table:
- named **iftable2**
With **inputs.snmp.table** section *AND* **sub_tables** attribute,
the plugin will get OIDS from subtables:
- **iftable2** => `.1.3.6.1.2.1.2.2.1.13`
Also **iftable2** is a table, the plugin will gathers all
OIDS in subtables:
- `.1.3.6.1.2.1.2.2.1.13.1`
- `.1.3.6.1.2.1.2.2.1.13.2`
- `.1.3.6.1.2.1.2.2.1.13.3`
- `.1.3.6.1.2.1.2.2.1.13.4`
- `.1.3.6.1.2.1.2.2.1.13....`
#### Configuration via MIB:
This example uses the SNMP data above, but is configured via the MIB.
The example MIB file can be found in the `testdata` directory. See the [MIB lookups](#mib-lookups) section for more information.
Telegraf config:
```toml
# Table with subtable example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable2"
agents = [ "127.0.0.1:161" ]
version = 2
community = "public"
[[inputs.snmp.field]]
oid = "TEST::hostname"
is_tag = true
# table without mapping but with subtables
[[inputs.snmp.table]]
name = "iftable2"
sub_tables = [".1.3.6.1.2.1.2.2.1.13"]
# note
# oid attribute is useless
oid = "TEST::testTable"
inherit_tags = "hostname"
```
#### Table with mapping example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Telegraf gathers value of OIDS of the table:
- named **iftable3**
With **inputs.snmp.table** section the plugin gets oid number:
- **iftable3** => `.1.3.6.1.2.1.31.1.1.1`
Also **iftable2** is a table, the plugin will gathers all
OIDS in the table and in the subtables
- `.1.3.6.1.2.1.31.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.1....`
- `.1.3.6.1.2.1.31.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.2....`
- `.1.3.6.1.2.1.31.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.3....`
- `.1.3.6.1.2.1.31.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.4....`
- `.1.3.6.1.2.1.31.1.1.1.5`
- `.1.3.6.1.2.1.31.1.1.1.5....`
- `.1.3.6.1.2.1.31.1.1.1.6....`
- `...`
But the **include_instances** attribute will filter which OIDS
will be gathered; As you see, there is an other attribute, `mapping_table`.
`include_instances` and `mapping_table` permit to build a hash table
to filter only OIDS you want.
Let's say, we have the following data on SNMP server:
- OID: `.1.3.6.1.2.1.31.1.1.1.1.1` has as value: `enp5s0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.2` has as value: `enp5s1`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.3` has as value: `enp5s2`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.4` has as value: `eth0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.5` has as value: `eth1`
The plugin will build the following hash table:
| instance name | instance id |
|---------------|-------------|
| `enp5s0` | `1` |
| `enp5s1` | `2` |
| `enp5s2` | `3` |
| `eth0` | `4` |
| `eth1` | `5` |
With the **include_instances** attribute, the plugin will gather
the following OIDS:
- `.1.3.6.1.2.1.31.1.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.5`
- `.1.3.6.1.2.1.31.1.1.1.2.1`
- `.1.3.6.1.2.1.31.1.1.1.2.5`
- `.1.3.6.1.2.1.31.1.1.1.3.1`
- `.1.3.6.1.2.1.31.1.1.1.3.5`
- `.1.3.6.1.2.1.31.1.1.1.4.1`
- `.1.3.6.1.2.1.31.1.1.1.4.5`
- `.1.3.6.1.2.1.31.1.1.1.5.1`
- `.1.3.6.1.2.1.31.1.1.1.5.5`
- `.1.3.6.1.2.1.31.1.1.1.6.1`
- `.1.3.6.1.2.1.31.1.1.1.6.5`
- `...`
Note: the plugin will add instance name as tag *instance*
```toml
# Simple table with mapping example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable3"
include_instances = ["enp5s0", "eth1"]
# table with mapping but without subtables
[[inputs.snmp.table]]
name = "iftable3"
oid = ".1.3.6.1.2.1.31.1.1.1"
# if empty. get all instances
mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
# if empty, get all subtables
Resulting output:
```
* Plugin: snmp, Collection 1
> testTable,agent_host=127.0.0.1,host=mylocalhost,hostname=baz,server=foo connections=1i,latency="0.123" 1468953135000000000
> testTable,agent_host=127.0.0.1,host=mylocalhost,hostname=baz,server=bar connections=2i,latency="0.456" 1468953135000000000
```
### Config parameters
#### Table with both mapping and subtable example
* `agents`: Default: `[]`
List of SNMP agents to connect to in the form of `IP[:PORT]`. If `:PORT` is unspecified, it defaults to `161`.
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
* `version`: Default: `2`
SNMP protocol version to use.
Telegraf gathers value of OIDS of the table:
* `community`: Default: `"public"`
SNMP community to use.
- named **iftable4**
* `max_repetitions`: Default: `50`
Maximum number of iterations for repeating variables.
With **inputs.snmp.table** section *AND* **sub_tables** attribute,
the plugin will get OIDS from subtables:
* `sec_name`:
Security name for authenticated SNMPv3 requests.
- **iftable4** => `.1.3.6.1.2.1.31.1.1.1`
* `auth_protocol`: Values: `"MD5"`,`"SHA"`,`""`. Default: `""`
Authentication protocol for authenticated SNMPv3 requests.
Also **iftable2** is a table, the plugin will gathers all
OIDS in the table and in the subtables
* `auth_password`:
Authentication password for authenticated SNMPv3 requests.
- `.1.3.6.1.2.1.31.1.1.1.6.1
- `.1.3.6.1.2.1.31.1.1.1.6.2`
- `.1.3.6.1.2.1.31.1.1.1.6.3`
- `.1.3.6.1.2.1.31.1.1.1.6.4`
- `.1.3.6.1.2.1.31.1.1.1.6....`
- `.1.3.6.1.2.1.31.1.1.1.10.1`
- `.1.3.6.1.2.1.31.1.1.1.10.2`
- `.1.3.6.1.2.1.31.1.1.1.10.3`
- `.1.3.6.1.2.1.31.1.1.1.10.4`
- `.1.3.6.1.2.1.31.1.1.1.10....`
* `sec_level`: Values: `"noAuthNoPriv"`,`"authNoPriv"`,`"authPriv"`. Default: `"noAuthNoPriv"`
Security level used for SNMPv3 messages.
But the **include_instances** attribute will filter which OIDS
will be gathered; As you see, there is an other attribute, `mapping_table`.
`include_instances` and `mapping_table` permit to build a hash table
to filter only OIDS you want.
Let's say, we have the following data on SNMP server:
- OID: `.1.3.6.1.2.1.31.1.1.1.1.1` has as value: `enp5s0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.2` has as value: `enp5s1`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.3` has as value: `enp5s2`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.4` has as value: `eth0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.5` has as value: `eth1`
* `context_name`:
Context name used for SNMPv3 requests.
The plugin will build the following hash table:
* `priv_protocol`: Values: `"DES"`,`"AES"`,`""`. Default: `""`
Privacy protocol used for encrypted SNMPv3 messages.
| instance name | instance id |
|---------------|-------------|
| `enp5s0` | `1` |
| `enp5s1` | `2` |
| `enp5s2` | `3` |
| `eth0` | `4` |
| `eth1` | `5` |
With the **include_instances** attribute, the plugin will gather
the following OIDS:
- `.1.3.6.1.2.1.31.1.1.1.6.1`
- `.1.3.6.1.2.1.31.1.1.1.6.5`
- `.1.3.6.1.2.1.31.1.1.1.10.1`
- `.1.3.6.1.2.1.31.1.1.1.10.5`
Note: the plugin will add instance name as tag *instance*
* `priv_password`:
Privacy password used for encrypted SNMPv3 messages.
* `name`:
Output measurement name.
```toml
# Table with both mapping and subtable example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable4"
include_instances = ["enp5s0", "eth1"]
#### Field parameters:
* `oid`:
OID to get. May be a numeric or textual OID.
# table with both mapping and subtables
[[inputs.snmp.table]]
name = "iftable4"
# if empty get all instances
mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
# if empty get all subtables
# sub_tables could be not "real subtables"
sub_tables=[".1.3.6.1.2.1.2.2.1.13", "bytes_recv", "bytes_send"]
# note
# oid attribute is useless
* `name`:
Output field/tag name.
If not specified, it defaults to the value of `oid`. If `oid` is numeric, an attempt to translate the numeric OID into a texual OID will be made.
# SNMP SUBTABLES
[[inputs.snmp.subtable]]
name = "bytes_recv"
oid = ".1.3.6.1.2.1.31.1.1.1.6"
unit = "octets"
* `is_tag`:
Output this field as a tag.
[[inputs.snmp.subtable]]
name = "bytes_send"
oid = ".1.3.6.1.2.1.31.1.1.1.10"
unit = "octets"
```
* `conversion`: Values: `"float(X)"`,`"float"`,`"int"`,`""`. Default: `""`
Converts the value according to the given specification.
#### Configuration notes
- `float(X)`: Converts the input value into a float and divides by the Xth power of 10. Efficively just moves the decimal left X places. For example a value of `123` with `float(2)` will result in `1.23`.
- `float`: Converts the value into a float with no adjustment. Same as `float(0)`.
- `int`: Convertes the value into an integer.
- In **inputs.snmp.table** section, the `oid` attribute is useless if
the `sub_tables` attributes is defined
#### Table parameters:
* `oid`:
Automatically populates the table's fields using data from the MIB.
- In **inputs.snmp.subtable** section, you can put a name from `snmptranslate_file`
as `oid` attribute instead of a valid OID
* `name`:
Output measurement name.
If not specified, it defaults to the value of `oid`. If `oid` is numeric, an attempt to translate the numeric OID into a texual OID will be made.
### Measurements & Fields:
* `inherit_tags`:
Which tags to inherit from the top-level config and to use in the output of this table's measurement.
With the last example (Table with both mapping and subtable example):
### MIB lookups
If the plugin is configured such that it needs to perform lookups from the MIB, it will use the net-snmp utilities `snmptranslate` and `snmptable`.
- ifHCOutOctets
- ifHCOutOctets
- ifInDiscards
- ifInDiscards
- ifHCInOctets
- ifHCInOctets
### Tags:
With the last example (Table with both mapping and subtable example):
- ifHCOutOctets
- host
- instance
- unit
- ifInDiscards
- host
- instance
- ifHCInOctets
- host
- instance
- unit
### Example Output:
With the last example (Table with both mapping and subtable example):
```
ifHCOutOctets,host=127.0.0.1,instance=enp5s0,unit=octets ifHCOutOctets=10565628i 1456878706044462901
ifInDiscards,host=127.0.0.1,instance=enp5s0 ifInDiscards=0i 1456878706044510264
ifHCInOctets,host=127.0.0.1,instance=enp5s0,unit=octets ifHCInOctets=76351777i 1456878706044531312
```
When performing the lookups, the plugin will load all available MIBs. If your MIB files are in a custom path, you may add the path using the `MIBDIRS` environment variable. See [`man 1 snmpcmd`](http://net-snmp.sourceforge.net/docs/man/snmpcmd.html#lbAK) for more information on the variable.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

17
plugins/inputs/snmp/testdata/snmpd.conf vendored Normal file
View File

@@ -0,0 +1,17 @@
# This config provides the data represented in the plugin documentation
# Requires net-snmp >= 5.7
#agentaddress UDP:127.0.0.1:1161
rocommunity public
override .1.0.0.0.1.1.0 octet_str "foo"
override .1.0.0.0.1.1.1 octet_str "bar"
override .1.0.0.0.1.102 octet_str "bad"
override .1.0.0.0.1.2.0 integer 1
override .1.0.0.0.1.2.1 integer 2
override .1.0.0.0.1.3.0 octet_str "0.123"
override .1.0.0.0.1.3.1 octet_str "0.456"
override .1.0.0.0.1.3.2 octet_str "9.999"
override .1.0.0.1.1 octet_str "baz"
override .1.0.0.1.2 uinteger 54321
override .1.0.0.1.3 uinteger 234

51
plugins/inputs/snmp/testdata/test.mib vendored Normal file
View File

@@ -0,0 +1,51 @@
TEST DEFINITIONS ::= BEGIN
testOID ::= { 1 0 0 }
testTable OBJECT-TYPE
SYNTAX SEQUENCE OF testTableEntry
MAX-ACCESS not-accessible
STATUS current
::= { testOID 0 }
testTableEntry OBJECT-TYPE
SYNTAX TestTableEntry
MAX-ACCESS not-accessible
STATUS current
INDEX {
server
}
::= { testTable 1 }
TestTableEntry ::=
SEQUENCE {
server OCTET STRING,
connections INTEGER,
latency OCTET STRING,
}
server OBJECT-TYPE
SYNTAX OCTET STRING
MAX-ACCESS read-only
STATUS current
::= { testTableEntry 1 }
connections OBJECT-TYPE
SYNTAX INTEGER
MAX-ACCESS read-only
STATUS current
::= { testTableEntry 2 }
latency OBJECT-TYPE
SYNTAX OCTET STRING
MAX-ACCESS read-only
STATUS current
::= { testTableEntry 3 }
hostname OBJECT-TYPE
SYNTAX OCTET STRING
MAX-ACCESS read-only
STATUS current
::= { testOID 1 1 }
END

View File

@@ -0,0 +1,549 @@
# SNMP Input Plugin
The SNMP input plugin gathers metrics from SNMP agents
### Configuration:
#### Very simple example
In this example, the plugin will gather value of OIDS:
- `.1.3.6.1.2.1.2.2.1.4.1`
```toml
# Very Simple Example
[[inputs.snmp]]
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Simple list of OIDs to get, in addition to "collect"
get_oids = [".1.3.6.1.2.1.2.2.1.4.1"]
```
#### Simple example
In this example, Telegraf gathers value of OIDS:
- named **ifnumber**
- named **interface_speed**
With **inputs.snmp.get** section the plugin gets the oid number:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed*
As you can see *ifSpeed* is not a valid OID. In order to get
the valid OID, the plugin uses `snmptranslate_file` to match the OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5`
Also as the plugin will append `instance` to the corresponding OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5.1`
In this example, the plugin will gather value of OIDS:
- `.1.3.6.1.2.1.2.1.0`
- `.1.3.6.1.2.1.2.2.1.5.1`
```toml
# Simple example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which get/bulk do you want to collect for this host
collect = ["ifnumber", "interface_speed"]
[[inputs.snmp.get]]
name = "ifnumber"
oid = ".1.3.6.1.2.1.2.1.0"
[[inputs.snmp.get]]
name = "interface_speed"
oid = "ifSpeed"
instance = "1"
```
#### Simple bulk example
In this example, Telegraf gathers value of OIDS:
- named **ifnumber**
- named **interface_speed**
- named **if_out_octets**
With **inputs.snmp.get** section the plugin gets oid number:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed*
With **inputs.snmp.bulk** section the plugin gets the oid number:
- **if_out_octets** => *ifOutOctets*
As you can see *ifSpeed* and *ifOutOctets* are not a valid OID.
In order to get the valid OID, the plugin uses `snmptranslate_file`
to match the OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5`
- **if_out_octets** => *ifOutOctets* => `.1.3.6.1.2.1.2.2.1.16`
Also, the plugin will append `instance` to the corresponding OID:
- **ifnumber** => `.1.3.6.1.2.1.2.1.0`
- **interface_speed** => *ifSpeed* => `.1.3.6.1.2.1.2.2.1.5.1`
And **if_out_octets** is a bulk request, the plugin will gathers all
OIDS in the table.
- `.1.3.6.1.2.1.2.2.1.16.1`
- `.1.3.6.1.2.1.2.2.1.16.2`
- `.1.3.6.1.2.1.2.2.1.16.3`
- `.1.3.6.1.2.1.2.2.1.16.4`
- `.1.3.6.1.2.1.2.2.1.16.5`
- `...`
In this example, the plugin will gather value of OIDS:
- `.1.3.6.1.2.1.2.1.0`
- `.1.3.6.1.2.1.2.2.1.5.1`
- `.1.3.6.1.2.1.2.2.1.16.1`
- `.1.3.6.1.2.1.2.2.1.16.2`
- `.1.3.6.1.2.1.2.2.1.16.3`
- `.1.3.6.1.2.1.2.2.1.16.4`
- `.1.3.6.1.2.1.2.2.1.16.5`
- `...`
```toml
# Simple bulk example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which get/bulk do you want to collect for this host
collect = ["interface_speed", "if_number", "if_out_octets"]
[[inputs.snmp.get]]
name = "interface_speed"
oid = "ifSpeed"
instance = "1"
[[inputs.snmp.get]]
name = "if_number"
oid = "ifNumber"
[[inputs.snmp.bulk]]
name = "if_out_octets"
oid = "ifOutOctets"
```
#### Table example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Note: This example is like a bulk request a but using an
other configuration
Telegraf gathers value of OIDS of the table:
- named **iftable1**
With **inputs.snmp.table** section the plugin gets oid number:
- **iftable1** => `.1.3.6.1.2.1.31.1.1.1`
Also **iftable1** is a table, the plugin will gathers all
OIDS in the table and in the subtables
- `.1.3.6.1.2.1.31.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.1....`
- `.1.3.6.1.2.1.31.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.2....`
- `.1.3.6.1.2.1.31.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.3....`
- `.1.3.6.1.2.1.31.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.4....`
- `.1.3.6.1.2.1.31.1.1.1.5`
- `.1.3.6.1.2.1.31.1.1.1.5....`
- `.1.3.6.1.2.1.31.1.1.1.6....`
- `...`
```toml
# Table example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which get/bulk do you want to collect for this host
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable1"
# table without mapping neither subtables
# This is like bulk request
[[inputs.snmp.table]]
name = "iftable1"
oid = ".1.3.6.1.2.1.31.1.1.1"
```
#### Table with subtable example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Note: This example is like a bulk request a but using an
other configuration
Telegraf gathers value of OIDS of the table:
- named **iftable2**
With **inputs.snmp.table** section *AND* **sub_tables** attribute,
the plugin will get OIDS from subtables:
- **iftable2** => `.1.3.6.1.2.1.2.2.1.13`
Also **iftable2** is a table, the plugin will gathers all
OIDS in subtables:
- `.1.3.6.1.2.1.2.2.1.13.1`
- `.1.3.6.1.2.1.2.2.1.13.2`
- `.1.3.6.1.2.1.2.2.1.13.3`
- `.1.3.6.1.2.1.2.2.1.13.4`
- `.1.3.6.1.2.1.2.2.1.13....`
```toml
# Table with subtable example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable2"
# table without mapping but with subtables
[[inputs.snmp.table]]
name = "iftable2"
sub_tables = [".1.3.6.1.2.1.2.2.1.13"]
# note
# oid attribute is useless
```
#### Table with mapping example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Telegraf gathers value of OIDS of the table:
- named **iftable3**
With **inputs.snmp.table** section the plugin gets oid number:
- **iftable3** => `.1.3.6.1.2.1.31.1.1.1`
Also **iftable2** is a table, the plugin will gathers all
OIDS in the table and in the subtables
- `.1.3.6.1.2.1.31.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.1....`
- `.1.3.6.1.2.1.31.1.1.1.2`
- `.1.3.6.1.2.1.31.1.1.1.2....`
- `.1.3.6.1.2.1.31.1.1.1.3`
- `.1.3.6.1.2.1.31.1.1.1.3....`
- `.1.3.6.1.2.1.31.1.1.1.4`
- `.1.3.6.1.2.1.31.1.1.1.4....`
- `.1.3.6.1.2.1.31.1.1.1.5`
- `.1.3.6.1.2.1.31.1.1.1.5....`
- `.1.3.6.1.2.1.31.1.1.1.6....`
- `...`
But the **include_instances** attribute will filter which OIDS
will be gathered; As you see, there is an other attribute, `mapping_table`.
`include_instances` and `mapping_table` permit to build a hash table
to filter only OIDS you want.
Let's say, we have the following data on SNMP server:
- OID: `.1.3.6.1.2.1.31.1.1.1.1.1` has as value: `enp5s0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.2` has as value: `enp5s1`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.3` has as value: `enp5s2`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.4` has as value: `eth0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.5` has as value: `eth1`
The plugin will build the following hash table:
| instance name | instance id |
|---------------|-------------|
| `enp5s0` | `1` |
| `enp5s1` | `2` |
| `enp5s2` | `3` |
| `eth0` | `4` |
| `eth1` | `5` |
With the **include_instances** attribute, the plugin will gather
the following OIDS:
- `.1.3.6.1.2.1.31.1.1.1.1.1`
- `.1.3.6.1.2.1.31.1.1.1.1.5`
- `.1.3.6.1.2.1.31.1.1.1.2.1`
- `.1.3.6.1.2.1.31.1.1.1.2.5`
- `.1.3.6.1.2.1.31.1.1.1.3.1`
- `.1.3.6.1.2.1.31.1.1.1.3.5`
- `.1.3.6.1.2.1.31.1.1.1.4.1`
- `.1.3.6.1.2.1.31.1.1.1.4.5`
- `.1.3.6.1.2.1.31.1.1.1.5.1`
- `.1.3.6.1.2.1.31.1.1.1.5.5`
- `.1.3.6.1.2.1.31.1.1.1.6.1`
- `.1.3.6.1.2.1.31.1.1.1.6.5`
- `...`
Note: the plugin will add instance name as tag *instance*
```toml
# Simple table with mapping example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable3"
include_instances = ["enp5s0", "eth1"]
# table with mapping but without subtables
[[inputs.snmp.table]]
name = "iftable3"
oid = ".1.3.6.1.2.1.31.1.1.1"
# if empty. get all instances
mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
# if empty, get all subtables
```
#### Table with both mapping and subtable example
In this example, we remove collect attribute to the host section,
but you can still use it in combination of the following part.
Telegraf gathers value of OIDS of the table:
- named **iftable4**
With **inputs.snmp.table** section *AND* **sub_tables** attribute,
the plugin will get OIDS from subtables:
- **iftable4** => `.1.3.6.1.2.1.31.1.1.1`
Also **iftable2** is a table, the plugin will gathers all
OIDS in the table and in the subtables
- `.1.3.6.1.2.1.31.1.1.1.6.1
- `.1.3.6.1.2.1.31.1.1.1.6.2`
- `.1.3.6.1.2.1.31.1.1.1.6.3`
- `.1.3.6.1.2.1.31.1.1.1.6.4`
- `.1.3.6.1.2.1.31.1.1.1.6....`
- `.1.3.6.1.2.1.31.1.1.1.10.1`
- `.1.3.6.1.2.1.31.1.1.1.10.2`
- `.1.3.6.1.2.1.31.1.1.1.10.3`
- `.1.3.6.1.2.1.31.1.1.1.10.4`
- `.1.3.6.1.2.1.31.1.1.1.10....`
But the **include_instances** attribute will filter which OIDS
will be gathered; As you see, there is an other attribute, `mapping_table`.
`include_instances` and `mapping_table` permit to build a hash table
to filter only OIDS you want.
Let's say, we have the following data on SNMP server:
- OID: `.1.3.6.1.2.1.31.1.1.1.1.1` has as value: `enp5s0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.2` has as value: `enp5s1`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.3` has as value: `enp5s2`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.4` has as value: `eth0`
- OID: `.1.3.6.1.2.1.31.1.1.1.1.5` has as value: `eth1`
The plugin will build the following hash table:
| instance name | instance id |
|---------------|-------------|
| `enp5s0` | `1` |
| `enp5s1` | `2` |
| `enp5s2` | `3` |
| `eth0` | `4` |
| `eth1` | `5` |
With the **include_instances** attribute, the plugin will gather
the following OIDS:
- `.1.3.6.1.2.1.31.1.1.1.6.1`
- `.1.3.6.1.2.1.31.1.1.1.6.5`
- `.1.3.6.1.2.1.31.1.1.1.10.1`
- `.1.3.6.1.2.1.31.1.1.1.10.5`
Note: the plugin will add instance name as tag *instance*
```toml
# Table with both mapping and subtable example
[[inputs.snmp]]
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "127.0.0.1:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# Which table do you want to collect
[[inputs.snmp.host.table]]
name = "iftable4"
include_instances = ["enp5s0", "eth1"]
# table with both mapping and subtables
[[inputs.snmp.table]]
name = "iftable4"
# if empty get all instances
mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
# if empty get all subtables
# sub_tables could be not "real subtables"
sub_tables=[".1.3.6.1.2.1.2.2.1.13", "bytes_recv", "bytes_send"]
# note
# oid attribute is useless
# SNMP SUBTABLES
[[inputs.snmp.subtable]]
name = "bytes_recv"
oid = ".1.3.6.1.2.1.31.1.1.1.6"
unit = "octets"
[[inputs.snmp.subtable]]
name = "bytes_send"
oid = ".1.3.6.1.2.1.31.1.1.1.10"
unit = "octets"
```
#### Configuration notes
- In **inputs.snmp.table** section, the `oid` attribute is useless if
the `sub_tables` attributes is defined
- In **inputs.snmp.subtable** section, you can put a name from `snmptranslate_file`
as `oid` attribute instead of a valid OID
### Measurements & Fields:
With the last example (Table with both mapping and subtable example):
- ifHCOutOctets
- ifHCOutOctets
- ifInDiscards
- ifInDiscards
- ifHCInOctets
- ifHCInOctets
### Tags:
With the last example (Table with both mapping and subtable example):
- ifHCOutOctets
- host
- instance
- unit
- ifInDiscards
- host
- instance
- ifHCInOctets
- host
- instance
- unit
### Example Output:
With the last example (Table with both mapping and subtable example):
```
ifHCOutOctets,host=127.0.0.1,instance=enp5s0,unit=octets ifHCOutOctets=10565628i 1456878706044462901
ifInDiscards,host=127.0.0.1,instance=enp5s0 ifInDiscards=0i 1456878706044510264
ifHCInOctets,host=127.0.0.1,instance=enp5s0,unit=octets ifHCInOctets=76351777i 1456878706044531312
```

View File

@@ -0,0 +1,818 @@
package snmp_legacy
import (
"io/ioutil"
"log"
"net"
"strconv"
"strings"
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/plugins/inputs"
"github.com/soniah/gosnmp"
)
// Snmp is a snmp plugin
type Snmp struct {
Host []Host
Get []Data
Bulk []Data
Table []Table
Subtable []Subtable
SnmptranslateFile string
nameToOid map[string]string
initNode Node
subTableMap map[string]Subtable
}
type Host struct {
Address string
Community string
// SNMP version. Default 2
Version int
// SNMP timeout, in seconds. 0 means no timeout
Timeout float64
// SNMP retries
Retries int
// Data to collect (list of Data names)
Collect []string
// easy get oids
GetOids []string
// Table
Table []HostTable
// Oids
getOids []Data
bulkOids []Data
tables []HostTable
// array of processed oids
// to skip oid duplication
processedOids []string
OidInstanceMapping map[string]map[string]string
}
type Table struct {
// name = "iftable"
Name string
// oid = ".1.3.6.1.2.1.31.1.1.1"
Oid string
//if empty get all instances
//mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
MappingTable string
// if empty get all subtables
// sub_tables could be not "real subtables"
//sub_tables=[".1.3.6.1.2.1.2.2.1.13", "bytes_recv", "bytes_send"]
SubTables []string
}
type HostTable struct {
// name = "iftable"
Name string
// Includes only these instances
// include_instances = ["eth0", "eth1"]
IncludeInstances []string
// Excludes only these instances
// exclude_instances = ["eth20", "eth21"]
ExcludeInstances []string
// From Table struct
oid string
mappingTable string
subTables []string
}
// TODO find better names
type Subtable struct {
//name = "bytes_send"
Name string
//oid = ".1.3.6.1.2.1.31.1.1.1.10"
Oid string
//unit = "octets"
Unit string
}
type Data struct {
Name string
// OID (could be numbers or name)
Oid string
// Unit
Unit string
// SNMP getbulk max repetition
MaxRepetition uint8 `toml:"max_repetition"`
// SNMP Instance (default 0)
// (only used with GET request and if
// OID is a name from snmptranslate file)
Instance string
// OID (only number) (used for computation)
rawOid string
}
type Node struct {
id string
name string
subnodes map[string]Node
}
var sampleConfig = `
## Use 'oids.txt' file to translate oids to names
## To generate 'oids.txt' you need to run:
## snmptranslate -m all -Tz -On | sed -e 's/"//g' > /tmp/oids.txt
## Or if you have an other MIB folder with custom MIBs
## snmptranslate -M /mycustommibfolder -Tz -On -m all | sed -e 's/"//g' > oids.txt
snmptranslate_file = "/tmp/oids.txt"
[[inputs.snmp.host]]
address = "192.168.2.2:161"
# SNMP community
community = "public" # default public
# SNMP version (1, 2 or 3)
# Version 3 not supported yet
version = 2 # default 2
# SNMP response timeout
timeout = 2.0 # default 2.0
# SNMP request retries
retries = 2 # default 2
# Which get/bulk do you want to collect for this host
collect = ["mybulk", "sysservices", "sysdescr"]
# Simple list of OIDs to get, in addition to "collect"
get_oids = []
[[inputs.snmp.host]]
address = "192.168.2.3:161"
community = "public"
version = 2
timeout = 2.0
retries = 2
collect = ["mybulk"]
get_oids = [
"ifNumber",
".1.3.6.1.2.1.1.3.0",
]
[[inputs.snmp.get]]
name = "ifnumber"
oid = "ifNumber"
[[inputs.snmp.get]]
name = "interface_speed"
oid = "ifSpeed"
instance = "0"
[[inputs.snmp.get]]
name = "sysuptime"
oid = ".1.3.6.1.2.1.1.3.0"
unit = "second"
[[inputs.snmp.bulk]]
name = "mybulk"
max_repetition = 127
oid = ".1.3.6.1.2.1.1"
[[inputs.snmp.bulk]]
name = "ifoutoctets"
max_repetition = 127
oid = "ifOutOctets"
[[inputs.snmp.host]]
address = "192.168.2.13:161"
#address = "127.0.0.1:161"
community = "public"
version = 2
timeout = 2.0
retries = 2
#collect = ["mybulk", "sysservices", "sysdescr", "systype"]
collect = ["sysuptime" ]
[[inputs.snmp.host.table]]
name = "iftable3"
include_instances = ["enp5s0", "eth1"]
# SNMP TABLEs
# table without mapping neither subtables
[[inputs.snmp.table]]
name = "iftable1"
oid = ".1.3.6.1.2.1.31.1.1.1"
# table without mapping but with subtables
[[inputs.snmp.table]]
name = "iftable2"
oid = ".1.3.6.1.2.1.31.1.1.1"
sub_tables = [".1.3.6.1.2.1.2.2.1.13"]
# table with mapping but without subtables
[[inputs.snmp.table]]
name = "iftable3"
oid = ".1.3.6.1.2.1.31.1.1.1"
# if empty. get all instances
mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
# if empty, get all subtables
# table with both mapping and subtables
[[inputs.snmp.table]]
name = "iftable4"
oid = ".1.3.6.1.2.1.31.1.1.1"
# if empty get all instances
mapping_table = ".1.3.6.1.2.1.31.1.1.1.1"
# if empty get all subtables
# sub_tables could be not "real subtables"
sub_tables=[".1.3.6.1.2.1.2.2.1.13", "bytes_recv", "bytes_send"]
`
// SampleConfig returns sample configuration message
func (s *Snmp) SampleConfig() string {
return sampleConfig
}
// Description returns description of Zookeeper plugin
func (s *Snmp) Description() string {
return `DEPRECATED! PLEASE USE inputs.snmp INSTEAD.`
}
func fillnode(parentNode Node, oid_name string, ids []string) {
// ids = ["1", "3", "6", ...]
id, ids := ids[0], ids[1:]
node, ok := parentNode.subnodes[id]
if ok == false {
node = Node{
id: id,
name: "",
subnodes: make(map[string]Node),
}
if len(ids) == 0 {
node.name = oid_name
}
parentNode.subnodes[id] = node
}
if len(ids) > 0 {
fillnode(node, oid_name, ids)
}
}
func findnodename(node Node, ids []string) (string, string) {
// ids = ["1", "3", "6", ...]
if len(ids) == 1 {
return node.name, ids[0]
}
id, ids := ids[0], ids[1:]
// Get node
subnode, ok := node.subnodes[id]
if ok {
return findnodename(subnode, ids)
}
// We got a node
// Get node name
if node.name != "" && len(ids) == 0 && id == "0" {
// node with instance 0
return node.name, "0"
} else if node.name != "" && len(ids) == 0 && id != "0" {
// node with an instance
return node.name, string(id)
} else if node.name != "" && len(ids) > 0 {
// node with subinstances
return node.name, strings.Join(ids, ".")
}
// return an empty node name
return node.name, ""
}
func (s *Snmp) Gather(acc telegraf.Accumulator) error {
// TODO put this in cache on first run
// Create subtables mapping
if len(s.subTableMap) == 0 {
s.subTableMap = make(map[string]Subtable)
for _, sb := range s.Subtable {
s.subTableMap[sb.Name] = sb
}
}
// TODO put this in cache on first run
// Create oid tree
if s.SnmptranslateFile != "" && len(s.initNode.subnodes) == 0 {
s.nameToOid = make(map[string]string)
s.initNode = Node{
id: "1",
name: "",
subnodes: make(map[string]Node),
}
data, err := ioutil.ReadFile(s.SnmptranslateFile)
if err != nil {
log.Printf("Reading SNMPtranslate file error: %s", err)
return err
} else {
for _, line := range strings.Split(string(data), "\n") {
oids := strings.Fields(string(line))
if len(oids) == 2 && oids[1] != "" {
oid_name := oids[0]
oid := oids[1]
fillnode(s.initNode, oid_name, strings.Split(string(oid), "."))
s.nameToOid[oid_name] = oid
}
}
}
}
// Fetching data
for _, host := range s.Host {
// Set default args
if len(host.Address) == 0 {
host.Address = "127.0.0.1:161"
}
if host.Community == "" {
host.Community = "public"
}
if host.Timeout <= 0 {
host.Timeout = 2.0
}
if host.Retries <= 0 {
host.Retries = 2
}
// Prepare host
// Get Easy GET oids
for _, oidstring := range host.GetOids {
oid := Data{}
if val, ok := s.nameToOid[oidstring]; ok {
// TODO should we add the 0 instance ?
oid.Name = oidstring
oid.Oid = val
oid.rawOid = "." + val + ".0"
} else {
oid.Name = oidstring
oid.Oid = oidstring
if string(oidstring[:1]) != "." {
oid.rawOid = "." + oidstring
} else {
oid.rawOid = oidstring
}
}
host.getOids = append(host.getOids, oid)
}
for _, oid_name := range host.Collect {
// Get GET oids
for _, oid := range s.Get {
if oid.Name == oid_name {
if val, ok := s.nameToOid[oid.Oid]; ok {
// TODO should we add the 0 instance ?
if oid.Instance != "" {
oid.rawOid = "." + val + "." + oid.Instance
} else {
oid.rawOid = "." + val + ".0"
}
} else {
oid.rawOid = oid.Oid
}
host.getOids = append(host.getOids, oid)
}
}
// Get GETBULK oids
for _, oid := range s.Bulk {
if oid.Name == oid_name {
if val, ok := s.nameToOid[oid.Oid]; ok {
oid.rawOid = "." + val
} else {
oid.rawOid = oid.Oid
}
host.bulkOids = append(host.bulkOids, oid)
}
}
}
// Table
for _, hostTable := range host.Table {
for _, snmpTable := range s.Table {
if hostTable.Name == snmpTable.Name {
table := hostTable
table.oid = snmpTable.Oid
table.mappingTable = snmpTable.MappingTable
table.subTables = snmpTable.SubTables
host.tables = append(host.tables, table)
}
}
}
// Launch Mapping
// TODO put this in cache on first run
// TODO save mapping and computed oids
// to do it only the first time
// only if len(s.OidInstanceMapping) == 0
if len(host.OidInstanceMapping) >= 0 {
if err := host.SNMPMap(acc, s.nameToOid, s.subTableMap); err != nil {
log.Printf("SNMP Mapping error for host '%s': %s", host.Address, err)
continue
}
}
// Launch Get requests
if err := host.SNMPGet(acc, s.initNode); err != nil {
log.Printf("SNMP Error for host '%s': %s", host.Address, err)
}
if err := host.SNMPBulk(acc, s.initNode); err != nil {
log.Printf("SNMP Error for host '%s': %s", host.Address, err)
}
}
return nil
}
func (h *Host) SNMPMap(
acc telegraf.Accumulator,
nameToOid map[string]string,
subTableMap map[string]Subtable,
) error {
if h.OidInstanceMapping == nil {
h.OidInstanceMapping = make(map[string]map[string]string)
}
// Get snmp client
snmpClient, err := h.GetSNMPClient()
if err != nil {
return err
}
// Deconnection
defer snmpClient.Conn.Close()
// Prepare OIDs
for _, table := range h.tables {
// We don't have mapping
if table.mappingTable == "" {
if len(table.subTables) == 0 {
// If We don't have mapping table
// neither subtables list
// This is just a bulk request
oid := Data{}
oid.Oid = table.oid
if val, ok := nameToOid[oid.Oid]; ok {
oid.rawOid = "." + val
} else {
oid.rawOid = oid.Oid
}
h.bulkOids = append(h.bulkOids, oid)
} else {
// If We don't have mapping table
// but we have subtables
// This is a bunch of bulk requests
// For each subtable ...
for _, sb := range table.subTables {
// ... we create a new Data (oid) object
oid := Data{}
// Looking for more information about this subtable
ssb, exists := subTableMap[sb]
if exists {
// We found a subtable section in config files
oid.Oid = ssb.Oid
oid.rawOid = ssb.Oid
oid.Unit = ssb.Unit
} else {
// We did NOT find a subtable section in config files
oid.Oid = sb
oid.rawOid = sb
}
// TODO check oid validity
// Add the new oid to getOids list
h.bulkOids = append(h.bulkOids, oid)
}
}
} else {
// We have a mapping table
// We need to query this table
// To get mapping between instance id
// and instance name
oid_asked := table.mappingTable
oid_next := oid_asked
need_more_requests := true
// Set max repetition
maxRepetition := uint8(32)
// Launch requests
for need_more_requests {
// Launch request
result, err3 := snmpClient.GetBulk([]string{oid_next}, 0, maxRepetition)
if err3 != nil {
return err3
}
lastOid := ""
for _, variable := range result.Variables {
lastOid = variable.Name
if strings.HasPrefix(variable.Name, oid_asked) {
switch variable.Type {
// handle instance names
case gosnmp.OctetString:
// Check if instance is in includes instances
getInstances := true
if len(table.IncludeInstances) > 0 {
getInstances = false
for _, instance := range table.IncludeInstances {
if instance == string(variable.Value.([]byte)) {
getInstances = true
}
}
}
// Check if instance is in excludes instances
if len(table.ExcludeInstances) > 0 {
getInstances = true
for _, instance := range table.ExcludeInstances {
if instance == string(variable.Value.([]byte)) {
getInstances = false
}
}
}
// We don't want this instance
if !getInstances {
continue
}
// remove oid table from the complete oid
// in order to get the current instance id
key := strings.Replace(variable.Name, oid_asked, "", 1)
if len(table.subTables) == 0 {
// We have a mapping table
// but no subtables
// This is just a bulk request
// Building mapping table
mapping := map[string]string{strings.Trim(key, "."): string(variable.Value.([]byte))}
_, exists := h.OidInstanceMapping[table.oid]
if exists {
h.OidInstanceMapping[table.oid][strings.Trim(key, ".")] = string(variable.Value.([]byte))
} else {
h.OidInstanceMapping[table.oid] = mapping
}
// Add table oid in bulk oid list
oid := Data{}
oid.Oid = table.oid
if val, ok := nameToOid[oid.Oid]; ok {
oid.rawOid = "." + val
} else {
oid.rawOid = oid.Oid
}
h.bulkOids = append(h.bulkOids, oid)
} else {
// We have a mapping table
// and some subtables
// This is a bunch of get requests
// This is the best case :)
// For each subtable ...
for _, sb := range table.subTables {
// ... we create a new Data (oid) object
oid := Data{}
// Looking for more information about this subtable
ssb, exists := subTableMap[sb]
if exists {
// We found a subtable section in config files
oid.Oid = ssb.Oid + key
oid.rawOid = ssb.Oid + key
oid.Unit = ssb.Unit
oid.Instance = string(variable.Value.([]byte))
} else {
// We did NOT find a subtable section in config files
oid.Oid = sb + key
oid.rawOid = sb + key
oid.Instance = string(variable.Value.([]byte))
}
// TODO check oid validity
// Add the new oid to getOids list
h.getOids = append(h.getOids, oid)
}
}
default:
}
} else {
break
}
}
// Determine if we need more requests
if strings.HasPrefix(lastOid, oid_asked) {
need_more_requests = true
oid_next = lastOid
} else {
need_more_requests = false
}
}
}
}
// Mapping finished
// Create newoids based on mapping
return nil
}
func (h *Host) SNMPGet(acc telegraf.Accumulator, initNode Node) error {
// Get snmp client
snmpClient, err := h.GetSNMPClient()
if err != nil {
return err
}
// Deconnection
defer snmpClient.Conn.Close()
// Prepare OIDs
oidsList := make(map[string]Data)
for _, oid := range h.getOids {
oidsList[oid.rawOid] = oid
}
oidsNameList := make([]string, 0, len(oidsList))
for _, oid := range oidsList {
oidsNameList = append(oidsNameList, oid.rawOid)
}
// gosnmp.MAX_OIDS == 60
// TODO use gosnmp.MAX_OIDS instead of hard coded value
max_oids := 60
// limit 60 (MAX_OIDS) oids by requests
for i := 0; i < len(oidsList); i = i + max_oids {
// Launch request
max_index := i + max_oids
if i+max_oids > len(oidsList) {
max_index = len(oidsList)
}
result, err3 := snmpClient.Get(oidsNameList[i:max_index]) // Get() accepts up to g.MAX_OIDS
if err3 != nil {
return err3
}
// Handle response
_, err = h.HandleResponse(oidsList, result, acc, initNode)
if err != nil {
return err
}
}
return nil
}
func (h *Host) SNMPBulk(acc telegraf.Accumulator, initNode Node) error {
// Get snmp client
snmpClient, err := h.GetSNMPClient()
if err != nil {
return err
}
// Deconnection
defer snmpClient.Conn.Close()
// Prepare OIDs
oidsList := make(map[string]Data)
for _, oid := range h.bulkOids {
oidsList[oid.rawOid] = oid
}
oidsNameList := make([]string, 0, len(oidsList))
for _, oid := range oidsList {
oidsNameList = append(oidsNameList, oid.rawOid)
}
// TODO Trying to make requests with more than one OID
// to reduce the number of requests
for _, oid := range oidsNameList {
oid_asked := oid
need_more_requests := true
// Set max repetition
maxRepetition := oidsList[oid].MaxRepetition
if maxRepetition <= 0 {
maxRepetition = 32
}
// Launch requests
for need_more_requests {
// Launch request
result, err3 := snmpClient.GetBulk([]string{oid}, 0, maxRepetition)
if err3 != nil {
return err3
}
// Handle response
last_oid, err := h.HandleResponse(oidsList, result, acc, initNode)
if err != nil {
return err
}
// Determine if we need more requests
if strings.HasPrefix(last_oid, oid_asked) {
need_more_requests = true
oid = last_oid
} else {
need_more_requests = false
}
}
}
return nil
}
func (h *Host) GetSNMPClient() (*gosnmp.GoSNMP, error) {
// Prepare Version
var version gosnmp.SnmpVersion
if h.Version == 1 {
version = gosnmp.Version1
} else if h.Version == 3 {
version = gosnmp.Version3
} else {
version = gosnmp.Version2c
}
// Prepare host and port
host, port_str, err := net.SplitHostPort(h.Address)
if err != nil {
port_str = string("161")
}
// convert port_str to port in uint16
port_64, err := strconv.ParseUint(port_str, 10, 16)
port := uint16(port_64)
// Get SNMP client
snmpClient := &gosnmp.GoSNMP{
Target: host,
Port: port,
Community: h.Community,
Version: version,
Timeout: time.Duration(h.Timeout) * time.Second,
Retries: h.Retries,
}
// Connection
err2 := snmpClient.Connect()
if err2 != nil {
return nil, err2
}
// Return snmpClient
return snmpClient, nil
}
func (h *Host) HandleResponse(
oids map[string]Data,
result *gosnmp.SnmpPacket,
acc telegraf.Accumulator,
initNode Node,
) (string, error) {
var lastOid string
for _, variable := range result.Variables {
lastOid = variable.Name
nextresult:
// Get only oid wanted
for oid_key, oid := range oids {
// Skip oids already processed
for _, processedOid := range h.processedOids {
if variable.Name == processedOid {
break nextresult
}
}
// If variable.Name is the same as oid_key
// OR
// the result is SNMP table which "." comes right after oid_key.
// ex: oid_key: .1.3.6.1.2.1.2.2.1.16, variable.Name: .1.3.6.1.2.1.2.2.1.16.1
if variable.Name == oid_key || strings.HasPrefix(variable.Name, oid_key+".") {
switch variable.Type {
// handle Metrics
case gosnmp.Boolean, gosnmp.Integer, gosnmp.Counter32, gosnmp.Gauge32,
gosnmp.TimeTicks, gosnmp.Counter64, gosnmp.Uinteger32, gosnmp.OctetString:
// Prepare tags
tags := make(map[string]string)
if oid.Unit != "" {
tags["unit"] = oid.Unit
}
// Get name and instance
var oid_name string
var instance string
// Get oidname and instance from translate file
oid_name, instance = findnodename(initNode,
strings.Split(string(variable.Name[1:]), "."))
// Set instance tag
// From mapping table
mapping, inMappingNoSubTable := h.OidInstanceMapping[oid_key]
if inMappingNoSubTable {
// filter if the instance in not in
// OidInstanceMapping mapping map
if instance_name, exists := mapping[instance]; exists {
tags["instance"] = instance_name
} else {
continue
}
} else if oid.Instance != "" {
// From config files
tags["instance"] = oid.Instance
} else if instance != "" {
// Using last id of the current oid, ie:
// with .1.3.6.1.2.1.31.1.1.1.10.3
// instance is 3
tags["instance"] = instance
}
// Set name
var field_name string
if oid_name != "" {
// Set fieldname as oid name from translate file
field_name = oid_name
} else {
// Set fieldname as oid name from inputs.snmp.get section
// Because the result oid is equal to inputs.snmp.get section
field_name = oid.Name
}
tags["snmp_host"], _, _ = net.SplitHostPort(h.Address)
fields := make(map[string]interface{})
fields[string(field_name)] = variable.Value
h.processedOids = append(h.processedOids, variable.Name)
acc.AddFields(field_name, fields, tags)
case gosnmp.NoSuchObject, gosnmp.NoSuchInstance:
// Oid not found
log.Printf("[snmp input] Oid not found: %s", oid_key)
default:
// delete other data
}
break
}
}
}
return lastOid, nil
}
func init() {
inputs.Add("snmp_legacy", func() telegraf.Input {
return &Snmp{}
})
}

View File

@@ -0,0 +1,482 @@
package snmp_legacy
import (
"testing"
"github.com/influxdata/telegraf/testutil"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestSNMPErrorGet1(t *testing.T) {
get1 := Data{
Name: "oid1",
Unit: "octets",
Oid: ".1.3.6.1.2.1.2.2.1.16.1",
}
h := Host{
Collect: []string{"oid1"},
}
s := Snmp{
SnmptranslateFile: "bad_oid.txt",
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.Error(t, err)
}
func TestSNMPErrorGet2(t *testing.T) {
get1 := Data{
Name: "oid1",
Unit: "octets",
Oid: ".1.3.6.1.2.1.2.2.1.16.1",
}
h := Host{
Collect: []string{"oid1"},
}
s := Snmp{
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
assert.Equal(t, 0, len(acc.Metrics))
}
func TestSNMPErrorBulk(t *testing.T) {
bulk1 := Data{
Name: "oid1",
Unit: "octets",
Oid: ".1.3.6.1.2.1.2.2.1.16",
}
h := Host{
Address: testutil.GetLocalHost(),
Collect: []string{"oid1"},
}
s := Snmp{
Host: []Host{h},
Bulk: []Data{bulk1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
assert.Equal(t, 0, len(acc.Metrics))
}
func TestSNMPGet1(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
get1 := Data{
Name: "oid1",
Unit: "octets",
Oid: ".1.3.6.1.2.1.2.2.1.16.1",
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
}
s := Snmp{
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"oid1",
map[string]interface{}{
"oid1": uint(543846),
},
map[string]string{
"unit": "octets",
"snmp_host": testutil.GetLocalHost(),
},
)
}
func TestSNMPGet2(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
get1 := Data{
Name: "oid1",
Oid: "ifNumber",
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifNumber",
map[string]interface{}{
"ifNumber": int(4),
},
map[string]string{
"instance": "0",
"snmp_host": testutil.GetLocalHost(),
},
)
}
func TestSNMPGet3(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
get1 := Data{
Name: "oid1",
Unit: "octets",
Oid: "ifSpeed",
Instance: "1",
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifSpeed",
map[string]interface{}{
"ifSpeed": uint(10000000),
},
map[string]string{
"unit": "octets",
"instance": "1",
"snmp_host": testutil.GetLocalHost(),
},
)
}
func TestSNMPEasyGet4(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
get1 := Data{
Name: "oid1",
Unit: "octets",
Oid: "ifSpeed",
Instance: "1",
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
GetOids: []string{"ifNumber"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifSpeed",
map[string]interface{}{
"ifSpeed": uint(10000000),
},
map[string]string{
"unit": "octets",
"instance": "1",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifNumber",
map[string]interface{}{
"ifNumber": int(4),
},
map[string]string{
"instance": "0",
"snmp_host": testutil.GetLocalHost(),
},
)
}
func TestSNMPEasyGet5(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
get1 := Data{
Name: "oid1",
Unit: "octets",
Oid: "ifSpeed",
Instance: "1",
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
GetOids: []string{".1.3.6.1.2.1.2.1.0"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
Get: []Data{get1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifSpeed",
map[string]interface{}{
"ifSpeed": uint(10000000),
},
map[string]string{
"unit": "octets",
"instance": "1",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifNumber",
map[string]interface{}{
"ifNumber": int(4),
},
map[string]string{
"instance": "0",
"snmp_host": testutil.GetLocalHost(),
},
)
}
func TestSNMPEasyGet6(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
GetOids: []string{"1.3.6.1.2.1.2.1.0"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifNumber",
map[string]interface{}{
"ifNumber": int(4),
},
map[string]string{
"instance": "0",
"snmp_host": testutil.GetLocalHost(),
},
)
}
func TestSNMPBulk1(t *testing.T) {
if testing.Short() {
t.Skip("Skipping integration test in short mode")
}
bulk1 := Data{
Name: "oid1",
Unit: "octets",
Oid: ".1.3.6.1.2.1.2.2.1.16",
MaxRepetition: 2,
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
Bulk: []Data{bulk1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(543846),
},
map[string]string{
"unit": "octets",
"instance": "1",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(26475179),
},
map[string]string{
"unit": "octets",
"instance": "2",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(108963968),
},
map[string]string{
"unit": "octets",
"instance": "3",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(12991453),
},
map[string]string{
"unit": "octets",
"instance": "36",
"snmp_host": testutil.GetLocalHost(),
},
)
}
// TODO find why, if this test is active
// Circle CI stops with the following error...
// bash scripts/circle-test.sh died unexpectedly
// Maybe the test is too long ??
func dTestSNMPBulk2(t *testing.T) {
bulk1 := Data{
Name: "oid1",
Unit: "octets",
Oid: "ifOutOctets",
MaxRepetition: 2,
}
h := Host{
Address: testutil.GetLocalHost() + ":31161",
Community: "telegraf",
Version: 2,
Timeout: 2.0,
Retries: 2,
Collect: []string{"oid1"},
}
s := Snmp{
SnmptranslateFile: "./testdata/oids.txt",
Host: []Host{h},
Bulk: []Data{bulk1},
}
var acc testutil.Accumulator
err := s.Gather(&acc)
require.NoError(t, err)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(543846),
},
map[string]string{
"unit": "octets",
"instance": "1",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(26475179),
},
map[string]string{
"unit": "octets",
"instance": "2",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(108963968),
},
map[string]string{
"unit": "octets",
"instance": "3",
"snmp_host": testutil.GetLocalHost(),
},
)
acc.AssertContainsTaggedFields(t,
"ifOutOctets",
map[string]interface{}{
"ifOutOctets": uint(12991453),
},
map[string]string{
"unit": "octets",
"instance": "36",
"snmp_host": testutil.GetLocalHost(),
},
)
}

View File

@@ -24,7 +24,8 @@ const (
defaultFieldName = "value"
defaultSeparator = "_"
defaultSeparator = "_"
defaultAllowPendingMessage = 10000
)
var dropwarn = "ERROR: statsd message queue full. " +
@@ -295,7 +296,7 @@ func (s *Statsd) udpListen() error {
case s.in <- bufCopy:
default:
s.drops++
if s.drops == 1 || s.drops%s.AllowedPendingMessages == 0 {
if s.drops == 1 || s.AllowedPendingMessages == 0 || s.drops%s.AllowedPendingMessages == 0 {
log.Printf(dropwarn, s.drops)
}
}
@@ -640,7 +641,8 @@ func (s *Statsd) Stop() {
func init() {
inputs.Add("statsd", func() telegraf.Input {
return &Statsd{
MetricSeparator: "_",
MetricSeparator: "_",
AllowedPendingMessages: defaultAllowPendingMessage,
}
})
}

View File

@@ -17,6 +17,8 @@ func TestTailFromBeginning(t *testing.T) {
tmpfile, err := ioutil.TempFile("", "")
require.NoError(t, err)
defer os.Remove(tmpfile.Name())
_, err = tmpfile.WriteString("cpu,mytag=foo usage_idle=100\n")
require.NoError(t, err)
tt := NewTail()
tt.FromBeginning = true
@@ -28,12 +30,10 @@ func TestTailFromBeginning(t *testing.T) {
acc := testutil.Accumulator{}
require.NoError(t, tt.Start(&acc))
_, err = tmpfile.WriteString("cpu,mytag=foo usage_idle=100\n")
require.NoError(t, err)
time.Sleep(time.Millisecond * 100)
require.NoError(t, tt.Gather(&acc))
// arbitrary sleep to wait for message to show up
time.Sleep(time.Millisecond * 250)
time.Sleep(time.Millisecond * 150)
acc.AssertContainsTaggedFields(t, "cpu",
map[string]interface{}{

View File

@@ -158,7 +158,6 @@ func (t *TcpListener) tcpListen() error {
if err != nil {
return err
}
// log.Printf("Received TCP Connection from %s", conn.RemoteAddr())
select {
case <-t.accept:
@@ -194,7 +193,6 @@ func (t *TcpListener) handler(conn *net.TCPConn, id string) {
defer func() {
t.wg.Done()
conn.Close()
// log.Printf("Closed TCP Connection from %s", conn.RemoteAddr())
// Add one connection potential back to channel when this one closes
t.accept <- true
t.forget(id)
@@ -239,14 +237,19 @@ func (t *TcpListener) tcpParser() error {
for {
select {
case <-t.done:
return nil
// drain input packets before finishing:
if len(t.in) == 0 {
return nil
}
case packet = <-t.in:
if len(packet) == 0 {
continue
}
metrics, err = t.parser.Parse(packet)
if err == nil {
t.storeMetrics(metrics)
for _, m := range metrics {
t.acc.AddFields(m.Name(), m.Fields(), m.Tags(), m.Time())
}
} else {
t.malformed++
if t.malformed == 1 || t.malformed%1000 == 0 {
@@ -257,15 +260,6 @@ func (t *TcpListener) tcpParser() error {
}
}
func (t *TcpListener) storeMetrics(metrics []telegraf.Metric) error {
t.Lock()
defer t.Unlock()
for _, m := range metrics {
t.acc.AddFields(m.Name(), m.Fields(), m.Tags(), m.Time())
}
return nil
}
// forget a TCP connection
func (t *TcpListener) forget(id string) {
t.cleanup.Lock()

View File

@@ -37,6 +37,62 @@ func newTestTcpListener() (*TcpListener, chan []byte) {
return listener, in
}
// benchmark how long it takes to accept & process 100,000 metrics:
func BenchmarkTCP(b *testing.B) {
listener := TcpListener{
ServiceAddress: ":8198",
AllowedPendingMessages: 100000,
MaxTCPConnections: 250,
}
listener.parser, _ = parsers.NewInfluxParser()
acc := &testutil.Accumulator{Discard: true}
// send multiple messages to socket
for n := 0; n < b.N; n++ {
err := listener.Start(acc)
if err != nil {
panic(err)
}
time.Sleep(time.Millisecond * 25)
conn, err := net.Dial("tcp", "127.0.0.1:8198")
if err != nil {
panic(err)
}
for i := 0; i < 100000; i++ {
fmt.Fprintf(conn, testMsg)
}
// wait for 100,000 metrics to get added to accumulator
time.Sleep(time.Millisecond)
listener.Stop()
}
}
func TestHighTrafficTCP(t *testing.T) {
listener := TcpListener{
ServiceAddress: ":8199",
AllowedPendingMessages: 100000,
MaxTCPConnections: 250,
}
listener.parser, _ = parsers.NewInfluxParser()
acc := &testutil.Accumulator{}
// send multiple messages to socket
err := listener.Start(acc)
require.NoError(t, err)
time.Sleep(time.Millisecond * 25)
conn, err := net.Dial("tcp", "127.0.0.1:8199")
require.NoError(t, err)
for i := 0; i < 100000; i++ {
fmt.Fprintf(conn, testMsg)
}
time.Sleep(time.Millisecond)
listener.Stop()
assert.Equal(t, 100000, len(acc.Metrics))
}
func TestConnectTCP(t *testing.T) {
listener := TcpListener{
ServiceAddress: ":8194",

View File

@@ -3,8 +3,8 @@ package udp_listener
import (
"log"
"net"
"strings"
"sync"
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/plugins/inputs"
@@ -99,9 +99,11 @@ func (u *UdpListener) Start(acc telegraf.Accumulator) error {
}
func (u *UdpListener) Stop() {
u.Lock()
defer u.Unlock()
close(u.done)
u.listener.Close()
u.wg.Wait()
u.listener.Close()
close(u.in)
log.Println("Stopped UDP listener service on ", u.ServiceAddress)
}
@@ -122,9 +124,13 @@ func (u *UdpListener) udpListen() error {
case <-u.done:
return nil
default:
u.listener.SetReadDeadline(time.Now().Add(time.Second))
n, _, err := u.listener.ReadFromUDP(buf)
if err != nil && !strings.Contains(err.Error(), "closed network") {
log.Printf("ERROR: %s\n", err.Error())
if err != nil {
if err, ok := err.(net.Error); ok && err.Timeout() {
} else {
log.Printf("ERROR: %s\n", err.Error())
}
continue
}
bufCopy := make([]byte, n)
@@ -151,11 +157,15 @@ func (u *UdpListener) udpParser() error {
for {
select {
case <-u.done:
return nil
if len(u.in) == 0 {
return nil
}
case packet = <-u.in:
metrics, err = u.parser.Parse(packet)
if err == nil {
u.storeMetrics(metrics)
for _, m := range metrics {
u.acc.AddFields(m.Name(), m.Fields(), m.Tags(), m.Time())
}
} else {
u.malformed++
if u.malformed == 1 || u.malformed%1000 == 0 {
@@ -166,15 +176,6 @@ func (u *UdpListener) udpParser() error {
}
}
func (u *UdpListener) storeMetrics(metrics []telegraf.Metric) error {
u.Lock()
defer u.Unlock()
for _, m := range metrics {
u.acc.AddFields(m.Name(), m.Fields(), m.Tags(), m.Time())
}
return nil
}
func init() {
inputs.Add("udp_listener", func() telegraf.Input {
return &UdpListener{}

View File

@@ -1,20 +1,36 @@
package udp_listener
import (
"fmt"
"io/ioutil"
"log"
"net"
"testing"
"time"
"github.com/influxdata/telegraf/plugins/parsers"
"github.com/influxdata/telegraf/testutil"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
const (
testMsg = "cpu_load_short,host=server01 value=12.0 1422568543702900257\n"
testMsgs = `
cpu_load_short,host=server02 value=12.0 1422568543702900257
cpu_load_short,host=server03 value=12.0 1422568543702900257
cpu_load_short,host=server04 value=12.0 1422568543702900257
cpu_load_short,host=server05 value=12.0 1422568543702900257
cpu_load_short,host=server06 value=12.0 1422568543702900257
`
)
func newTestUdpListener() (*UdpListener, chan []byte) {
in := make(chan []byte, 1500)
listener := &UdpListener{
ServiceAddress: ":8125",
UDPPacketSize: 1500,
AllowedPendingMessages: 10000,
in: in,
done: make(chan struct{}),
@@ -22,6 +38,72 @@ func newTestUdpListener() (*UdpListener, chan []byte) {
return listener, in
}
func TestHighTrafficUDP(t *testing.T) {
listener := UdpListener{
ServiceAddress: ":8126",
AllowedPendingMessages: 100000,
}
listener.parser, _ = parsers.NewInfluxParser()
acc := &testutil.Accumulator{}
// send multiple messages to socket
err := listener.Start(acc)
require.NoError(t, err)
time.Sleep(time.Millisecond * 25)
conn, err := net.Dial("udp", "127.0.0.1:8126")
require.NoError(t, err)
for i := 0; i < 20000; i++ {
// arbitrary, just to give the OS buffer some slack handling the
// packet storm.
time.Sleep(time.Microsecond)
fmt.Fprintf(conn, testMsgs)
}
time.Sleep(time.Millisecond)
listener.Stop()
// this is not an exact science, since UDP packets can easily get lost or
// dropped, but assume that the OS will be able to
// handle at least 90% of the sent UDP packets.
assert.InDelta(t, 100000, len(acc.Metrics), 10000)
}
func TestConnectUDP(t *testing.T) {
listener := UdpListener{
ServiceAddress: ":8127",
AllowedPendingMessages: 10000,
}
listener.parser, _ = parsers.NewInfluxParser()
acc := &testutil.Accumulator{}
require.NoError(t, listener.Start(acc))
defer listener.Stop()
time.Sleep(time.Millisecond * 25)
conn, err := net.Dial("udp", "127.0.0.1:8127")
require.NoError(t, err)
// send single message to socket
fmt.Fprintf(conn, testMsg)
time.Sleep(time.Millisecond * 15)
acc.AssertContainsTaggedFields(t, "cpu_load_short",
map[string]interface{}{"value": float64(12)},
map[string]string{"host": "server01"},
)
// send multiple messages to socket
fmt.Fprintf(conn, testMsgs)
time.Sleep(time.Millisecond * 15)
hostTags := []string{"server02", "server03",
"server04", "server05", "server06"}
for _, hostTag := range hostTags {
acc.AssertContainsTaggedFields(t, "cpu_load_short",
map[string]interface{}{"value": float64(12)},
map[string]string{"host": hostTag},
)
}
}
func TestRunParser(t *testing.T) {
log.SetOutput(ioutil.Discard)
var testmsg = []byte("cpu_load_short,host=server01 value=12.0 1422568543702900257")

View File

@@ -107,7 +107,8 @@ type item struct {
counterHandle win.PDH_HCOUNTER
}
var sanitizedChars = strings.NewReplacer("/sec", "_persec", "/Sec", "_persec", " ", "_")
var sanitizedChars = strings.NewReplacer("/sec", "_persec", "/Sec", "_persec",
" ", "_", "%", "Percent", `\`, "")
func (m *Win_PerfCounters) AddItem(metrics *itemList, query string, objectName string, counter string, instance string,
measurement string, include_total bool) {
@@ -271,6 +272,9 @@ func (m *Win_PerfCounters) Gather(acc telegraf.Accumulator) error {
&bufCount, &emptyBuf[0]) // uses null ptr here according to MSDN.
if ret == win.PDH_MORE_DATA {
filledBuf := make([]win.PDH_FMT_COUNTERVALUE_ITEM_DOUBLE, bufCount*size)
if len(filledBuf) == 0 {
continue
}
ret = win.PdhGetFormattedCounterArrayDouble(metric.counterHandle,
&bufSize, &bufCount, &filledBuf[0])
for i := 0; i < int(bufCount); i++ {
@@ -299,13 +303,12 @@ func (m *Win_PerfCounters) Gather(acc telegraf.Accumulator) error {
tags["instance"] = s
}
tags["objectname"] = metric.objectName
fields[sanitizedChars.Replace(string(metric.counter))] = float32(c.FmtValue.DoubleValue)
fields[sanitizedChars.Replace(metric.counter)] =
float32(c.FmtValue.DoubleValue)
var measurement string
if metric.measurement == "" {
measurement := sanitizedChars.Replace(metric.measurement)
if measurement == "" {
measurement = "win_perf_counters"
} else {
measurement = metric.measurement
}
acc.AddFields(measurement, fields, tags)
}

View File

@@ -27,40 +27,39 @@ echo mntr | nc localhost 2181
zk_max_file_descriptor_count 1024 - only available on Unix platforms
```
## Measurements:
#### Zookeeper measurements:
## Configuration
Meta:
- units: int64
- tags: `server=<hostname> port=<port> state=<leader|follower>`
```
# Reads 'mntr' stats from one or many zookeeper servers
[[inputs.zookeeper]]
## An array of address to gather stats about. Specify an ip or hostname
## with port. ie localhost:2181, 10.0.0.1:2181, etc.
Measurement names:
- zookeeper_avg_latency
- zookeeper_max_latency
- zookeeper_min_latency
- zookeeper_packets_received
- zookeeper_packets_sent
- zookeeper_outstanding_requests
- zookeeper_znode_count
- zookeeper_watch_count
- zookeeper_ephemerals_count
- zookeeper_approximate_data_size
- zookeeper_followers #only exposed by the Leader
- zookeeper_synced_followers #only exposed by the Leader
- zookeeper_pending_syncs #only exposed by the Leader
- zookeeper_open_file_descriptor_count
- zookeeper_max_file_descriptor_count
## If no servers are specified, then localhost is used as the host.
## If no port is specified, 2181 is used
servers = [":2181"]
```
#### Zookeeper string measurements:
## InfluxDB Measurement:
Meta:
- units: string
- tags: `server=<hostname> port=<port> state=<leader|follower>`
Measurement names:
- zookeeper_version
### Tags:
- All measurements have the following tags:
-
```
M zookeeper
T host
T port
T state
F approximate_data_size integer
F avg_latency integer
F ephemerals_count integer
F max_file_descriptor_count integer
F max_latency integer
F min_latency integer
F num_alive_connections integer
F open_file_descriptor_count integer
F outstanding_requests integer
F packets_received integer
F packets_sent integer
F version string
F watch_count integer
F znode_count integer
```

View File

@@ -2,6 +2,42 @@
This plugin writes to [InfluxDB](https://www.influxdb.com) via HTTP or UDP.
### Configuration:
```toml
# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
## The full HTTP or UDP endpoint URL for your InfluxDB instance.
## Multiple urls can be specified as part of the same cluster,
## this means that only ONE of the urls will be written to each interval.
# urls = ["udp://localhost:8089"] # UDP endpoint example
urls = ["http://localhost:8086"] # required
## The target database for metrics (telegraf will create it if not exists).
database = "telegraf" # required
## Retention policy to write to. Empty string writes to the default rp.
retention_policy = ""
## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
write_consistency = "any"
## Write timeout (for the InfluxDB client), formatted as a string.
## If not provided, will default to 5s. 0s means no timeout (not recommended).
timeout = "5s"
# username = "telegraf"
# password = "metricsmetricsmetricsmetrics"
## Set the user agent for HTTP POSTs (can be useful for log differentiation)
# user_agent = "telegraf"
## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
# udp_payload = 512
## Optional SSL Config
# ssl_ca = "/etc/telegraf/ca.pem"
# ssl_cert = "/etc/telegraf/cert.pem"
# ssl_key = "/etc/telegraf/key.pem"
## Use SSL but skip chain & host verification
# insecure_skip_verify = false
```
### Required parameters:
* `urls`: List of strings, this is for InfluxDB clustering
@@ -12,16 +48,14 @@ to write to. Each URL should start with either `http://` or `udp://`
### Optional parameters:
* `write_consistency`: Write consistency (clusters only), can be: "any", "one", "quorum", "all".
* `retention_policy`: Retention policy to write to.
* `precision`: Precision of writes, valid values are "ns", "us" (or "µs"), "ms", "s", "m", "h". note: using "s" precision greatly improves InfluxDB compression.
* `timeout`: Write timeout (for the InfluxDB client), formatted as a string. If not provided, will default to 5s. 0s means no timeout (not recommended).
* `username`: Username for influxdb
* `password`: Password for influxdb
* `user_agent`: Set the user agent for HTTP POSTs (can be useful for log differentiation)
* `udp_payload`: Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
## Optional SSL Config
* `ssl_ca`: SSL CA
* `ssl_cert`: SSL CERT
* `ssl_key`: SSL key
* `insecure_skip_verify`: Use SSL but skip chain & host verification (default: false)
* `write_consistency`: Write consistency for clusters only, can be: "any", "one", "quorom", "all"

View File

@@ -55,7 +55,7 @@ var sampleConfig = `
## Retention policy to write to. Empty string writes to the default rp.
retention_policy = ""
## Write consistency (clusters only), can be: "any", "one", "quorom", "all"
## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
write_consistency = "any"
## Write timeout (for the InfluxDB client), formatted as a string.
@@ -146,7 +146,7 @@ func (i *InfluxDB) Connect() error {
func createDatabase(c client.Client, database string) error {
// Create Database if it doesn't exist
_, err := c.Query(client.Query{
Command: fmt.Sprintf("CREATE DATABASE IF NOT EXISTS \"%s\"", database),
Command: fmt.Sprintf("CREATE DATABASE \"%s\"", database),
})
return err
}

View File

@@ -0,0 +1,67 @@
# Kafka Producer Output Plugin
This plugin writes to a [Kafka Broker](http://kafka.apache.org/07/quickstart.html) acting a Kafka Producer.
```
[[outputs.kafka]]
## URLs of kafka brokers
brokers = ["localhost:9092"]
## Kafka topic for producer messages
topic = "telegraf"
## Telegraf tag to use as a routing key
## ie, if this tag exists, it's value will be used as the routing key
routing_tag = "host"
## CompressionCodec represents the various compression codecs recognized by
## Kafka in messages.
## 0 : No compression
## 1 : Gzip compression
## 2 : Snappy compression
compression_codec = 0
## RequiredAcks is used in Produce Requests to tell the broker how many
## replica acknowledgements it must see before responding
## 0 : the producer never waits for an acknowledgement from the broker.
## This option provides the lowest latency but the weakest durability
## guarantees (some data will be lost when a server fails).
## 1 : the producer gets an acknowledgement after the leader replica has
## received the data. This option provides better durability as the
## client waits until the server acknowledges the request as successful
## (only messages that were written to the now-dead leader but not yet
## replicated will be lost).
## -1: the producer gets an acknowledgement after all in-sync replicas have
## received the data. This option provides the best durability, we
## guarantee that no messages will be lost as long as at least one in
## sync replica remains.
required_acks = -1
## The total number of times to retry sending a message
max_retry = 3
## Optional SSL Config
# ssl_ca = "/etc/telegraf/ca.pem"
# ssl_cert = "/etc/telegraf/cert.pem"
# ssl_key = "/etc/telegraf/key.pem"
## Use SSL but skip chain & host verification
# insecure_skip_verify = false
data_format = "influx"
```
### Required parameters:
* `brokers`: List of strings, this is for speaking to a cluster of `kafka` brokers. On each flush interval, Telegraf will randomly choose one of the urls to write to. Each URL should just include host and port e.g. -> `["{host}:{port}","{host2}:{port2}"]`
* `topic`: The `kafka` topic to publish to.
### Optional parameters:
* `routing_tag`: if this tag exists, it's value will be used as the routing key
* `compression_codec`: What level of compression to use: `0` -> no compression, `1` -> gzip compression, `2` -> snappy compression
* `required_acks`: a setting for how may `acks` required from the `kafka` broker cluster.
* `max_retry`: Max number of times to retry failed write
* `ssl_ca`: SSL CA
* `ssl_cert`: SSL CERT
* `ssl_key`: SSL key
* `insecure_skip_verify`: Use SSL but skip chain & host verification (default: false)
* `data_format`: [About Telegraf data formats](https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md)

View File

@@ -7,6 +7,7 @@ import (
"io/ioutil"
"log"
"net/http"
"regexp"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
@@ -14,19 +15,22 @@ import (
"github.com/influxdata/telegraf/plugins/serializers/graphite"
)
// Librato structure for configuration and client
type Librato struct {
ApiUser string
ApiToken string
Debug bool
NameFromTags bool
SourceTag string
Timeout internal.Duration
Template string
APIUser string
APIToken string
Debug bool
SourceTag string // Deprecated, keeping for backward-compatibility
Timeout internal.Duration
Template string
apiUrl string
APIUrl string
client *http.Client
}
// https://www.librato.com/docs/kb/faq/best_practices/naming_convention_metrics_sources.html#naming-limitations-for-sources-and-metrics
var reUnacceptedChar = regexp.MustCompile("[^.a-zA-Z0-9_-]")
var sampleConfig = `
## Librator API Docs
## http://dev.librato.com/v1/metrics-authentication
@@ -36,20 +40,21 @@ var sampleConfig = `
api_token = "my-secret-token" # required.
## Debug
# debug = false
## Tag Field to populate source attribute (optional)
## This is typically the _hostname_ from which the metric was obtained.
source_tag = "host"
## Connection timeout.
# timeout = "5s"
## Output Name Template (same as graphite buckets)
## Output source Template (same as graphite buckets)
## see https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md#graphite
template = "host.tags.measurement.field"
## This template is used in librato's source (not metric's name)
template = "host"
`
// LMetrics is the default struct for Librato's API fromat
type LMetrics struct {
Gauges []*Gauge `json:"gauges"`
}
// Gauge is the gauge format for Librato's API fromat
type Gauge struct {
Name string `json:"name"`
Value float64 `json:"value"`
@@ -57,17 +62,22 @@ type Gauge struct {
MeasureTime int64 `json:"measure_time"`
}
const librato_api = "https://metrics-api.librato.com/v1/metrics"
const libratoAPI = "https://metrics-api.librato.com/v1/metrics"
func NewLibrato(apiUrl string) *Librato {
// NewLibrato is the main constructor for librato output plugins
func NewLibrato(apiURL string) *Librato {
return &Librato{
apiUrl: apiUrl,
APIUrl: apiURL,
Template: "host",
}
}
// Connect is the default output plugin connection function who make sure it
// can connect to the endpoint
func (l *Librato) Connect() error {
if l.ApiUser == "" || l.ApiToken == "" {
return fmt.Errorf("api_user and api_token are required fields for librato output")
if l.APIUser == "" || l.APIToken == "" {
return fmt.Errorf(
"api_user and api_token are required fields for librato output")
}
l.client = &http.Client{
Timeout: l.Timeout.Duration,
@@ -76,18 +86,23 @@ func (l *Librato) Connect() error {
}
func (l *Librato) Write(metrics []telegraf.Metric) error {
if len(metrics) == 0 {
return nil
}
lmetrics := LMetrics{}
if l.Template == "" {
l.Template = "host"
}
if l.SourceTag != "" {
l.Template = l.SourceTag
}
tempGauges := []*Gauge{}
metricCounter := 0
for _, m := range metrics {
if gauges, err := l.buildGauges(m); err == nil {
for _, gauge := range gauges {
tempGauges = append(tempGauges, gauge)
metricCounter++
if l.Debug {
log.Printf("[DEBUG] Got a gauge: %v\n", gauge)
}
@@ -100,81 +115,115 @@ func (l *Librato) Write(metrics []telegraf.Metric) error {
}
}
lmetrics.Gauges = make([]*Gauge, metricCounter)
copy(lmetrics.Gauges, tempGauges[0:])
metricsBytes, err := json.Marshal(lmetrics)
if err != nil {
return fmt.Errorf("unable to marshal Metrics, %s\n", err.Error())
} else {
metricCounter := len(tempGauges)
// make sur we send a batch of maximum 300
sizeBatch := 300
for start := 0; start < metricCounter; start += sizeBatch {
lmetrics := LMetrics{}
end := start + sizeBatch
if end > metricCounter {
end = metricCounter
sizeBatch = end - start
}
lmetrics.Gauges = make([]*Gauge, sizeBatch)
copy(lmetrics.Gauges, tempGauges[start:end])
metricsBytes, err := json.Marshal(lmetrics)
if err != nil {
return fmt.Errorf("unable to marshal Metrics, %s\n", err.Error())
}
if l.Debug {
log.Printf("[DEBUG] Librato request: %v\n", string(metricsBytes))
}
}
req, err := http.NewRequest("POST", l.apiUrl, bytes.NewBuffer(metricsBytes))
if err != nil {
return fmt.Errorf("unable to create http.Request, %s\n", err.Error())
}
req.Header.Add("Content-Type", "application/json")
req.SetBasicAuth(l.ApiUser, l.ApiToken)
resp, err := l.client.Do(req)
if err != nil {
if l.Debug {
log.Printf("[DEBUG] Error POSTing metrics: %v\n", err.Error())
req, err := http.NewRequest(
"POST",
l.APIUrl,
bytes.NewBuffer(metricsBytes))
if err != nil {
return fmt.Errorf(
"unable to create http.Request, %s\n",
err.Error())
}
return fmt.Errorf("error POSTing metrics, %s\n", err.Error())
} else {
if l.Debug {
req.Header.Add("Content-Type", "application/json")
req.SetBasicAuth(l.APIUser, l.APIToken)
resp, err := l.client.Do(req)
if err != nil {
if l.Debug {
log.Printf("[DEBUG] Error POSTing metrics: %v\n", err.Error())
}
return fmt.Errorf("error POSTing metrics, %s\n", err.Error())
}
defer resp.Body.Close()
if resp.StatusCode != 200 || l.Debug {
htmlData, err := ioutil.ReadAll(resp.Body)
if err != nil {
log.Printf("[DEBUG] Couldn't get response! (%v)\n", err)
} else {
}
if resp.StatusCode != 200 {
return fmt.Errorf(
"received bad status code, %d\n %s",
resp.StatusCode,
string(htmlData))
}
if l.Debug {
log.Printf("[DEBUG] Librato response: %v\n", string(htmlData))
}
}
}
defer resp.Body.Close()
if resp.StatusCode != 200 {
return fmt.Errorf("received bad status code, %d\n", resp.StatusCode)
}
return nil
}
// SampleConfig is function who return the default configuration for this
// output
func (l *Librato) SampleConfig() string {
return sampleConfig
}
// Description is function who return the Description of this output
func (l *Librato) Description() string {
return "Configuration for Librato API to send metrics to."
}
func (l *Librato) buildGauges(m telegraf.Metric) ([]*Gauge, error) {
gauges := []*Gauge{}
bucket := graphite.SerializeBucketName(m.Name(), m.Tags(), l.Template, "")
if m.Time().Unix() == 0 {
return gauges, fmt.Errorf(
"Measure time must not be zero\n <%s> \n",
m.String())
}
metricSource := graphite.InsertField(
graphite.SerializeBucketName("", m.Tags(), l.Template, ""),
"value")
if metricSource == "" {
return gauges,
fmt.Errorf("undeterminable Source type from Field, %s\n",
l.Template)
}
for fieldName, value := range m.Fields() {
metricName := m.Name()
if fieldName != "value" {
metricName = fmt.Sprintf("%s.%s", m.Name(), fieldName)
}
gauge := &Gauge{
Name: graphite.InsertField(bucket, fieldName),
Source: reUnacceptedChar.ReplaceAllString(metricSource, "-"),
Name: reUnacceptedChar.ReplaceAllString(metricName, "-"),
MeasureTime: m.Time().Unix(),
}
if !gauge.verifyValue(value) {
if !verifyValue(value) {
continue
}
if err := gauge.setValue(value); err != nil {
return gauges, fmt.Errorf("unable to extract value from Fields, %s\n",
return gauges, fmt.Errorf(
"unable to extract value from Fields, %s\n",
err.Error())
}
if l.SourceTag != "" {
if source, ok := m.Tags()[l.SourceTag]; ok {
gauge.Source = source
} else {
return gauges,
fmt.Errorf("undeterminable Source type from Field, %s\n",
l.SourceTag)
}
}
gauges = append(gauges, gauge)
}
if l.Debug {
@@ -183,7 +232,7 @@ func (l *Librato) buildGauges(m telegraf.Metric) ([]*Gauge, error) {
return gauges, nil
}
func (g *Gauge) verifyValue(v interface{}) bool {
func verifyValue(v interface{}) bool {
switch v.(type) {
case string:
return false
@@ -209,12 +258,13 @@ func (g *Gauge) setValue(v interface{}) error {
return nil
}
//Close is used to close the connection to librato Output
func (l *Librato) Close() error {
return nil
}
func init() {
outputs.Add("librato", func() telegraf.Output {
return NewLibrato(librato_api)
return NewLibrato(libratoAPI)
})
}

View File

@@ -1,7 +1,6 @@
package librato
import (
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
@@ -10,141 +9,137 @@ import (
"time"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/plugins/serializers/graphite"
"github.com/influxdata/telegraf/testutil"
"github.com/stretchr/testify/require"
)
var (
fakeUrl = "http://test.librato.com"
fakeURL = "http://test.librato.com"
fakeUser = "telegraf@influxdb.com"
fakeToken = "123456"
)
func fakeLibrato() *Librato {
l := NewLibrato(fakeUrl)
l.ApiUser = fakeUser
l.ApiToken = fakeToken
l := NewLibrato(fakeURL)
l.APIUser = fakeUser
l.APIToken = fakeToken
return l
}
func BuildTags(t *testing.T) {
testMetric := testutil.TestMetric(0.0, "test1")
graphiteSerializer := graphite.GraphiteSerializer{}
tags, err := graphiteSerializer.Serialize(testMetric)
fmt.Printf("Tags: %v", tags)
require.NoError(t, err)
}
func TestUriOverride(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
ts := httptest.NewServer(
http.HandlerFunc(
func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
defer ts.Close()
l := NewLibrato(ts.URL)
l.ApiUser = "telegraf@influxdb.com"
l.ApiToken = "123456"
l.APIUser = "telegraf@influxdb.com"
l.APIToken = "123456"
err := l.Connect()
require.NoError(t, err)
err = l.Write(testutil.MockMetrics())
err = l.Write([]telegraf.Metric{newHostMetric(int32(0), "name", "host")})
require.NoError(t, err)
}
func TestBadStatusCode(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
json.NewEncoder(w).Encode(`{
"errors": {
"system": [
"The API is currently down for maintenance. It'll be back shortly."
]
}
}`)
}))
ts := httptest.NewServer(
http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
}))
defer ts.Close()
l := NewLibrato(ts.URL)
l.ApiUser = "telegraf@influxdb.com"
l.ApiToken = "123456"
l.APIUser = "telegraf@influxdb.com"
l.APIToken = "123456"
err := l.Connect()
require.NoError(t, err)
err = l.Write(testutil.MockMetrics())
err = l.Write([]telegraf.Metric{newHostMetric(int32(0), "name", "host")})
if err == nil {
t.Errorf("error expected but none returned")
} else {
require.EqualError(t, fmt.Errorf("received bad status code, 503\n"), err.Error())
require.EqualError(
t,
fmt.Errorf("received bad status code, 503\n "), err.Error())
}
}
func TestBuildGauge(t *testing.T) {
mtime := time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix()
var gaugeTests = []struct {
ptIn telegraf.Metric
outGauge *Gauge
err error
}{
{
testutil.TestMetric(0.0, "test1"),
newHostMetric(0.0, "test1", "host1"),
&Gauge{
Name: "value1.test1",
MeasureTime: time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test1",
MeasureTime: mtime,
Value: 0.0,
Source: "host1",
},
nil,
},
{
testutil.TestMetric(1.0, "test2"),
newHostMetric(1.0, "test2", "host2"),
&Gauge{
Name: "value1.test2",
MeasureTime: time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test2",
MeasureTime: mtime,
Value: 1.0,
Source: "host2",
},
nil,
},
{
testutil.TestMetric(10, "test3"),
newHostMetric(10, "test3", "host3"),
&Gauge{
Name: "value1.test3",
MeasureTime: time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test3",
MeasureTime: mtime,
Value: 10.0,
Source: "host3",
},
nil,
},
{
testutil.TestMetric(int32(112345), "test4"),
newHostMetric(int32(112345), "test4", "host4"),
&Gauge{
Name: "value1.test4",
MeasureTime: time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test4",
MeasureTime: mtime,
Value: 112345.0,
Source: "host4",
},
nil,
},
{
testutil.TestMetric(int64(112345), "test5"),
newHostMetric(int64(112345), "test5", "host5"),
&Gauge{
Name: "value1.test5",
MeasureTime: time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test5",
MeasureTime: mtime,
Value: 112345.0,
Source: "host5",
},
nil,
},
{
testutil.TestMetric(float32(11234.5), "test6"),
newHostMetric(float32(11234.5), "test6", "host6"),
&Gauge{
Name: "value1.test6",
MeasureTime: time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test6",
MeasureTime: mtime,
Value: 11234.5,
Source: "host6",
},
nil,
},
{
testutil.TestMetric("11234.5", "test7"),
newHostMetric("11234.5", "test7", "host7"),
nil,
nil,
},
}
l := NewLibrato(fakeUrl)
l := NewLibrato(fakeURL)
for _, gt := range gaugeTests {
gauges, err := l.buildGauges(gt.ptIn)
if err != nil && gt.err == nil {
@@ -167,61 +162,121 @@ func TestBuildGauge(t *testing.T) {
}
}
func newHostMetric(value interface{}, name, host string) (metric telegraf.Metric) {
metric, _ = telegraf.NewMetric(
name,
map[string]string{"host": host},
map[string]interface{}{"value": value},
time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC),
)
return
}
func TestBuildGaugeWithSource(t *testing.T) {
mtime := time.Date(2009, time.November, 10, 23, 0, 0, 0, time.UTC)
pt1, _ := telegraf.NewMetric(
"test1",
map[string]string{"hostname": "192.168.0.1", "tag1": "value1"},
map[string]interface{}{"value": 0.0},
time.Date(2010, time.November, 10, 23, 0, 0, 0, time.UTC),
mtime,
)
pt2, _ := telegraf.NewMetric(
"test2",
map[string]string{"hostnam": "192.168.0.1", "tag1": "value1"},
map[string]interface{}{"value": 1.0},
time.Date(2010, time.December, 10, 23, 0, 0, 0, time.UTC),
mtime,
)
pt3, _ := telegraf.NewMetric(
"test3",
map[string]string{
"hostname": "192.168.0.1",
"tag2": "value2",
"tag1": "value1"},
map[string]interface{}{"value": 1.0},
mtime,
)
pt4, _ := telegraf.NewMetric(
"test4",
map[string]string{
"hostname": "192.168.0.1",
"tag2": "value2",
"tag1": "value1"},
map[string]interface{}{"value": 1.0},
mtime,
)
var gaugeTests = []struct {
ptIn telegraf.Metric
template string
outGauge *Gauge
err error
}{
{
pt1,
"hostname",
&Gauge{
Name: "192_168_0_1.value1.test1",
MeasureTime: time.Date(2010, time.November, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test1",
MeasureTime: mtime.Unix(),
Value: 0.0,
Source: "192.168.0.1",
Source: "192_168_0_1",
},
nil,
},
{
pt2,
"hostname",
&Gauge{
Name: "192_168_0_1.value1.test1",
MeasureTime: time.Date(2010, time.December, 10, 23, 0, 0, 0, time.UTC).Unix(),
Name: "test2",
MeasureTime: mtime.Unix(),
Value: 1.0,
},
fmt.Errorf("undeterminable Source type from Field, hostname"),
},
{
pt3,
"tags",
&Gauge{
Name: "test3",
MeasureTime: mtime.Unix(),
Value: 1.0,
Source: "192_168_0_1.value1.value2",
},
nil,
},
{
pt4,
"hostname.tag2",
&Gauge{
Name: "test4",
MeasureTime: mtime.Unix(),
Value: 1.0,
Source: "192_168_0_1.value2",
},
nil,
},
}
l := NewLibrato(fakeUrl)
l.SourceTag = "hostname"
l := NewLibrato(fakeURL)
for _, gt := range gaugeTests {
l.Template = gt.template
gauges, err := l.buildGauges(gt.ptIn)
if err != nil && gt.err == nil {
t.Errorf("%s: unexpected error, %+v\n", gt.ptIn.Name(), err)
}
if gt.err != nil && err == nil {
t.Errorf("%s: expected an error (%s) but none returned", gt.ptIn.Name(), gt.err.Error())
t.Errorf(
"%s: expected an error (%s) but none returned",
gt.ptIn.Name(),
gt.err.Error())
}
if len(gauges) == 0 {
continue
}
if gt.err == nil && !reflect.DeepEqual(gauges[0], gt.outGauge) {
t.Errorf("%s: \nexpected %+v\ngot %+v\n", gt.ptIn.Name(), gt.outGauge, gauges[0])
t.Errorf(
"%s: \nexpected %+v\ngot %+v\n",
gt.ptIn.Name(),
gt.outGauge, gauges[0])
}
}
}

Some files were not shown because too many files have changed in this diff Show More