* Procstat: don't cache PIDs
Changed the procstat input plugin to not cache PIDs. Solves #1636.
The logic of creating a process by pid was moved from `procstat.go` to
`spec_processor.go`.
* Procstat: go fmt
* procstat: modify changelog for #2206
* ceph: maps are already refs, no need to use a pointer
* ceph: pgmap_states are represented in a single metric "count", differenciated by tag
* Update CHANGELOG
* Add in support for looking for substring in response
* Add note to CHANGELOG.md
* Switch from substring match to regex match
* Requested code changes
* Make requested changes and refactor to avoid nested if-else.
* Convert tabs to space and compile regex once
* Make Logparser Plugin Check For New Files
Check in the Gather metric to see if any new files matching the glob
have appeared. If so, start tailing them from the beginning.
* changelog update for #2141
This changes the current use of the InfluxDB client to instead use a
baked-in client that uses the fasthttp library.
This allows for significantly smaller allocations, the re-use of http
body buffers, and the re-use of the actual bytes of the line-protocol
metric representations.
We were having problems with telegraf talking to
carbon-relay-ng using the graphite output. When
the carbon-relay-ng server restarted the connection
the telegraf side would go into CLOSE_WAIT but telegraf
would continue to send statistics through the connection.
Reading around it seems you need to a read from the connection
and see a EOF error. We've implemented this and added a test
that replicates roughly the error we were having.
Pair: @whpearson @joshmyers
If we write a batch of points and get a "field type conflict" error
message in return, we should drop the entire batch of points because
this indicates that one or more points have a type that doesnt match the
database.
These errors will never go away on their own, and InfluxDB will
successfully write the points that dont have a conflict.
closes#2245
* Added GatherUserStatistics, row Uptime in gatherGlobalStatuses, and version fields & tags
* Updated README file
* pulling in latest from master
* ran go fmt to fix formatting
* fix unreachable code
* few fixes
* cleaning up and applying suggestions from sparrc
I don't like this behavior, but it's what InfluxDB accepts, so the
telegraf listener should be consistent with that.
I accidentally reverted this behavior when I refactored the telegraf
metric representation earlier in this release cycle.
* Fix for broken librato output
These errors are delightful, but I'd rather avoid them:
```
Error parsing /etc/telegraf/telegraf.conf, line 2: field corresponding to `api_user' is not defined in `*librato.Librato'
```
* Fixed bad format from last commit
this basically reverts #887
at some point we might want to do some special handling of reloading
plugins and keeping their state intact, but that will need to be done at
a higher level, and in a way that is thread-safe for multiple input
plugins of the same type.
Unfortunately this is a rather large feature that will not have a quick
fix available for it.
fixes#1975fixes#2102
* plugins/input/consul: moved check_id from regular fields to tags.
When service has more than one check sending data for both would overwrite each other
resulting only in one check being written (the last one). Adding check_id as a tag
ensures we will get info for all unique checks per service.
* plugins/inputs/consul: updated tests
* fixed parsing of docker image name/version
now accounts for custom docker repo's which contain a colon for a non-default port
* 1978: modifying docker test case to have a custom repo with non-standard port
* using a temp var to store index, ran gofmt
* fixes#1987, renaming iterator to 'i'
* MongoDB input plugin: Improve state data
Adds ARB as a "member_status" (replica set arbiter).
Uses MongoDB replica set state string for "state" value.
* MongoDB input plugin: Improve state data - changelog update
put Makefile back to normal
removed comment from puppetagent.go
changed config_version to config_version_string and fixed yaml for build
changed workind from branch to environment for config_string
fixed casing and Changelog
fixed test case
closes#1917
* Fix bug: too many cloudwatch metrics
Cloudwatch metrics were being added incorrectly. The most obvious
symptom of this was that too many metrics were being added. A simple
check against the name of the metric proved to be a sufficient fix. In
order to test the fix, a metric selection function was factored out.
* Go fmt cloudwatch
* Cloudwatch isSelected checks metric name
* Move cloudwatch line in changelog to 1.2 features
* return partition stat alongside disk stat from disk usage method, and report device name (minus /dev/) as a tag in disk stats
* update system/disk tests to include new partition stat return value from disk usage method calls
* update changelog for #1807 (use device name instead of path to report disk stats)
main reasons behind this:
- make adding/removing tags cheap
- make adding/removing fields cheap
- make parsing cheaper
- make parse -> decorate -> write out bytes metric flow much faster
Refactor serializer to use byte buffer
The old gonuts fork has no License and has not seen any commits
differing from the original project, while the original has seen some
activity, even if low.
Having no license is a problem for distributors, as by default, such
projects are undistributable.
* Trim null characters in Value data format
Some producers (such as the paho embedded c mqtt client) add a null
character "\x00" to the end of a message. The Value parser would fail on
any message from such a producer.
* Trim whitespace and null in all Value data formats
* No unnecessary reassignments in Value data format parser
* Update change log for Value data format fix
* added connection Timeout parámeter, basic HTTP autentication and HTTP support with Sslskipverify option
* updated README.md
* added optional SSL config , changed timeout name and type , and other minor fixes
* added some code style improvements
* Update README.md
* NATS output plug-in now retries to reconnect forever after a lost connection.
* NATS input plug-in now retries to reconnect forever after a lost connection.
* Fixes#1953
in this commit:
- chunks out the http request body to avoid making very large
allocations.
- establishes a limit for the maximum http request body size that the
listener will accept.
- utilizes a pool of byte buffers to reduce GC pressure.
The MySQL DB driver has it's own DSN parsing function. Previously we
were using the url.Parse function, but this causes problems because a
valid MySQL DSN can be an invalid http URL, namely when using some
special characters in the password.
This change uses the MySQL DB driver's builtin ParseDSN function and
applies a timeout parameter natively via that.
Another benefit of this change is that we fail earlier if given an
invalid MySQL DSN.
closes#870closes#1842
Previously, the graphite parser would simply overwrite any template that
had an identical filter to a previous template. This included the empty
filter.
Now we will still overwrite, but first we will sort to make sure that
the most "specific" template always matches.
closes#1731
Map holding expected results was defined in multiple places, making test
cases a bit hard to read. This way we can change our expectations of
good results in one place and have them affect multiple test cases.
in this commit:
- centralize logging output handler.
- set global Info/Debug/Error log levels based on config file or flags.
- remove per-plugin debug arg handling.
- add a I!, D!, or E! to every log message.
- add configuration option to specify where to send logs.
closes#1786
Due to quite real problem of generating vast number of data series through
mesos tasks metrics this feature is disabled until better solution is found.
Also consolidated the translation code to obtain all info with just 1 command execution.
Also split test command mocks out to their own file for cleanliness.
The default is 0 so we hit a division by 0 error and crash. This checks
ensure we will not crash and `log` and continue to let telegraf run
Also we set default allow pending message number to 10000
* Allow numeric and non-string values for tag_keys.
According to the go documentation the JSON deserializer only produces these
base types in output:
- string
- bool
- float64
- nil
With this patch bool, float64 and nil values get converted to a string when
their field key is specified in tag_keys. Previously the field was simply
discarded.
* Updated handling of nil for passing tests.
The automated tests are less than trivial to reproduece locally for me,
so I hope CircleCI wonn't mind...
* Updated changelog entries with PR and issue links.
* separate hello and authenticate functions, force connection close at end of write cycle so we don't hold open idle connections, which has the benefit of mostly removing the chance of getting hopelessly connection lost
* update changelog, though this will need to be updated again to merge into telegraf master
* bump instrumental agent version
* fix test to deal with better better connect/reconnect logic and changed ident & auth handshake
* Update CHANGELOG.md
correct URL from instrumental fork to origin and put the change in the correct part of the file
* go fmt
* Split out Instrumental tests for invalid metric and value.
* Ensure nothing remains on the wire after final test.
* Force valid metric names by replacing invalid parts with underscores.
* Multiple invalid characters being joined into a single udnerscore.
* Adjust comment to what happens.
* undo split hello and auth commands, to reduce roundtrips
* Add ignored_databases option to postgresql configuration files, to enable easy filtering of system databases without needing to whitelist all the databases on the server. Add tests for database whitelist and blacklist.
* run go fmt on new postgresql database whitelist/blacklist code
* add postgresql database blacklist option to changelog
* remove a bad merge from the changelog
also remove locking around adding metrics. Instead, keep a waitgroup on
the ServeHTTP function and wait for that to finish before returning from
the Stop() function
closes#1407
fix incredibly stupid bugs
populate README
support query endpoint and change default listen port
set response headers for query endpoint
add unit tests
revert erroneous Godeps change
add plugin ref to top-level README
remove debug output and add empty post body test
fix linter errors
move stoppableListener into repo
use constants for http status codes
add CHANGELOG entry
address code review comments re. style/structure
address further code review comments
add note to README re. database creation calls per PR comments
started working on this with the idea of fixing #1623, although I
realized that this was actually just a documentation issue around
a toml eccentricity.
closes#1623
And use them in the prometheus output plugin.
Still need to test the prometheus output plugin.
Also need to actually create typed metrics in the system plugins.
closes#1683
closes#1542
Generalize event.
Add doc.
Update default config.
Add filestack to the list of plugins.
Check that video conversion event returns 400.
Update the readme.
Update the changelog.
The iptables plugin aims at monitoring bytes and packet counters
matching a given set of iptables rules.
Typically the user would set a dedicated monitoring chain into a given
iptables table, and add the rules to monitor to this chain. The plugin
will allow to focus on the counters for this particular table/chain.
closes#1471
Added the option removecr to inputs.exec to remove all carraige returns
(CR, ASCII 0x0D, Unicode codepoint \u0D, ^M). The option is boolean and
not enabled if not present in the config file.
closes#1606
Updated CHANGELOG.md with information about removecr
Ran go fmt ./...
Moved removal of CRs to internal/internal.go
Moved the code to remove carriage returns from
plugins/inputs/exec/exec.go to internal/internal.go. Additionally
changed the conditional on which it gets applied from using a
configuration file option to checking if it is running on Windows.
Moved Carriage Return check to correct place
Moved the carriage return removal back to the exec plugin. Added unit
testing for it. Fixed a bug (removing too many characters).
Ran go fmt ./...
Reverted CHANGELOG to master
Updated Changelog
* Move CloudWatch rate limit to config
Reference #1670
* make that variable a string
* ahem, apparently limiter wants an int
* add the ratelimit to the sample config
* update the test to include the rate
* set a default value of 10 for ratelimit
* Move default ratelimit to init
closes#1539
First version of http put working
Refactored code to separate http handling from opentsdb module. Added batching support.
Fixed tag cleaning in http output and refactored telnet output.
Removed useless struct.
Fixed current unittest and added a new one.
Added benchmark test to test json serialization. Made sure http client would reuse connection.
Ran go fmt on opentsdb sources.
Updated README file
Removed useHttp in favor of parsing host string to determine the right API to use for sending metrics. Also renamed BatchSize to HttpBatchSize to better convey that it is only used when using Http API.
Updated changelog
Fixed format issues.
Removed TagSet type to make it more explicit.
Fixed unittest after removing TagSet type.
Revert "Updated changelog"
This reverts commit 24dba5520008d876b5a8d266c34a53e8805cc5f5.
Added PR under 1.1 release.
add missing redis metrics
This makes sure that all redis metrics are present without having to use a hard-coded list of what metrics to pull in.
The existing ceph input plugin only has access to the local admin daemon socket
on the local host, and as such has access to a limited subset of data. This
extends the plugin to use CLI commands to get access to the full spread of Ceph
data. This patch collects global OSD map and IO statistics, PG state and per pool
IO and utilization statistics.
closes#1513
* Some improvment in mesos input plugin,
Removing uneeded statistics prefix for task's metric,
Adding framework id tags into each task's metric,
Adding state (leader/follower) tags to master's metric,
Make sure the slave's metrics are tags with slave
* typo, replacing cpus_total with elected to determine leader
* Remove remaining statistics_ from sample
* using timestamp from mesos as metric timestamp
* change mesos-tasks to mesos_tasks, measurement
* change measurement name in test
* Replace follower by standby
* separate hello and authenticate functions, force connection close at end of write cycle so we don't hold open idle connections, which has the benefit of mostly removing the chance of getting hopelessly connection lost
* update changelog, though this will need to be updated again to merge into telegraf master
* bump instrumental agent version
* fix test to deal with better better connect/reconnect logic and changed ident & auth handshake
* Update CHANGELOG.md
correct URL from instrumental fork to origin and put the change in the correct part of the file
* go fmt
* Split out Instrumental tests for invalid metric and value.
* Ensure nothing remains on the wire after final test.
* Force valid metric names by replacing invalid parts with underscores.
* Multiple invalid characters being joined into a single udnerscore.
* Adjust comment to what happens.
* undo split hello and auth commands, to reduce roundtrips
* Split out Instrumental tests for invalid metric and value.
* Ensure nothing remains on the wire after final test.
* Force valid metric names by replacing invalid parts with underscores.
* Multiple invalid characters being joined into a single udnerscore.
* add an entry to CHANGELOG for easy merging upstream
* go fmt variable alignment
* remove some bugfixes from changelog which now more properly are in a different section.
* remove headers and whitespace should should have been removed with the last commit
* Source improvement for librato output
Build the source from the list of tag instead of a configuration specified
single tag
Graphite Serializer:
* make buildTags public
* make sure not to use empty tags
Librato output:
* Improve Error handling for librato API base on error or debug flag
* Send Metric per Batch (max 300)
* use Graphite BuildTag function to generate source
The change is made that it should be retro compatible
Metric sample:
server=127.0.0.1 port=80 state=leader env=test
measurement.metric_name value
service_n.metric_x
Metric before with source tags set as "server":
source=127.0.0.1
test.80.127_0_0_1.leader.measurement.metric_name
test.80.127_0_0_1.leader.service_n.metric_x
Metric now:
source=test.80.127.0.0.1.leader
measurement.metric_name
service_n.metric_x
As you can see the source in the "new" version is much more precise
That way when filter (only from source) you can filter by env or any other tags
* Using template to specify which tagsusing for source, default concat all
tags
* revert change in graphite serializer
* better documentation, change default for template
* fmt
* test passing with new host as default tags
* use host tag in api integration test
* Limit 80 char per line, change resolution to be a int in the sample
* fmt
* remove resolution, doc for template
* fmt
* Fix problem with metrics when ping return Destination net unreachable
Add test case TestUnreachablePingGather
Add percent_reply_loss
Fix some other tests
* Add errors measurment
* fir problem with ping reply "TTL expired in transit" ( use regex for more specific condition - TTL in line but it's a not valid replay )
add test case for "TTL expired in transit" - TestTTLExpiredPingGather