# Histogram Aggregator Plugin

## Goal

This plugin was added to provide the ability to build histograms.

## Description

The histogram aggregator plugin aggregates the values of the specified metric fields. The aggregated histogram is emitted every `period` seconds. All you need to do is specify the borders of the histogram buckets and the fields for which you want to aggregate a histogram.

## How it works

Each metric passed to the aggregator is checked against the histogram buckets configured for its fields. If matching buckets are found, the aggregator adds +1 to the appropriate bucket; otherwise, nothing happens. Every `period` seconds this data is pushed to the output.

Note that all hits of a given bucket are also added to all subsequent buckets in the final distribution. Why does it work this way? In the configuration you define the right borders of the buckets as an ascending sequence. Internally, the buckets are represented as ranges with borders (0..bucketBorder]: 0..1, 0..10, 0..50, …, 0..+Inf. So +1 is added to every bucket whose range contains the metric value.

This plugin creates cumulative histograms. This means that the hit counts in the buckets only grow from the moment Telegraf starts. If you restart Telegraf, however, all bucket counts are reset to 0.

The bucket hit-counting algorithm is based on the algorithm implemented in the Prometheus client.
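
The following sketch illustrates the idea of cumulative counting; it is not the plugin's actual source, and names such as `countHit` are purely illustrative. A value increments every bucket whose right border is greater than or equal to it, plus the implicit `+Inf` bucket:

```go
package main

import "fmt"

// countHit increments every bucket whose right border is greater than or
// equal to value, which is what makes the final distribution cumulative.
// borders must be sorted in ascending order; the last counter in counts
// is the implicit +Inf bucket.
func countHit(borders []float64, counts []int64, value float64) {
	for i, border := range borders {
		if value <= border {
			counts[i]++
		}
	}
	counts[len(borders)]++ // the +Inf bucket always receives the hit
}

func main() {
	borders := []float64{0.0, 10.0, 20.0, 30.0}
	counts := make([]int64, len(borders)+1) // one extra counter for +Inf

	countHit(borders, counts, 15.0)
	countHit(borders, counts, 25.0)

	// Prints [0 0 1 2 2]: le=0 -> 0, le=10 -> 0, le=20 -> 1, le=30 -> 2, le=+Inf -> 2
	fmt.Println(counts)
}
```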

## Configuration

```toml
# Configuration for aggregating histogram metrics
[[aggregators.histogram]]
  ## General Aggregator Arguments:
  ## The period on which to flush & clear the aggregator.
  period = "30s"
  ## If true, the original metric will be dropped by the
  ## aggregator and will not get sent to the output plugins.
  drop_original = false

  ## Example config that aggregates a histogram for all fields of the specified metric.
  [[aggregators.histogram.config]]
    ## The set of buckets.
    buckets = [0.0, 15.6, 34.5, 49.1, 71.5, 80.5, 94.5, 100.0]
    ## The name of the metric.
    metric_name = "cpu"

  ## Example config that aggregates a histogram only for specific fields of the specified metric.
  [[aggregators.histogram.config]]
    ## The set of buckets.
    buckets = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
    ## The name of the metric.
    metric_name = "diskio"
    ## The specific fields of the metric.
    metric_fields = ["io_time", "read_time", "write_time"]
```

## Explanation

The `metric_fields` setting is the list of the metric's fields. For example, the metric `cpu` has the following fields: usage_user, usage_system, usage_idle, usage_nice, usage_iowait, usage_irq, usage_softirq, usage_steal, usage_guest, usage_guest_nice.

Note that histogram metrics are pushed every `period` seconds. Telegraf calls the aggregator's Reset() function every `period` seconds, but the histogram aggregator ignores Reset() and continues to count hits.
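
As a rough sketch of why the counts keep growing across periods (assuming the standard Telegraf aggregator interface; this is not the plugin's actual code), the aggregator simply leaves its state untouched in Reset():

```go
package histogram

// histogramAggregator is a simplified stand-in for the real plugin type.
type histogramAggregator struct {
	// cumulative hit counts per field and bucket, kept across flush periods
	counts map[string][]int64
}

// Reset is called by Telegraf after every period. The histogram aggregator
// deliberately leaves its state untouched here, so the cumulative counts
// keep growing until Telegraf itself is restarted.
func (h *histogramAggregator) Reset() {}
```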

## Use cases

You can specify the fields in two ways:

  1. Specify only the metric name. In this case all fields of the metric will be aggregated.
  2. Specify the metric name and the specific fields.

## Some rules

- The settings of each histogram must be in a separate section with the title `aggregators.histogram.config`.

- Each bucket value must be a float.

- Don't include the bucket border +Inf; it will be added automatically.

## Measurements & Fields

The suffix `_bucket` will be added to each field name.

- measurement1
  - field1_bucket
  - field2_bucket

## Tags

All measurements have the tag `le`. This tag holds the right border value of the bucket and means that the metric value is less than or equal to the value of this tag. For example, assume we have a metric value of 10 and the following buckets: [5, 10, 30, 70, 100]. Then the tag `le` will have the value 10, because the metric value falls into the bucket whose right border is 10.
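
A small, illustrative check of that example (plain Go, not part of the plugin) shows which `le` buckets a single value of 10 ends up in once the cumulative counting is applied:

```go
package main

import "fmt"

func main() {
	borders := []float64{5, 10, 30, 70, 100}
	value := 10.0

	// The value is counted in every bucket whose le border is >= value,
	// so the smallest matching bucket is le=10 and every larger one follows.
	for _, le := range borders {
		hit := 0
		if value <= le {
			hit = 1
		}
		fmt.Printf("le=%v bucket=%d\n", le, hit)
	}
	fmt.Println("le=+Inf bucket=1") // the +Inf bucket always counts the hit
}
```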

## Example Output

The following output will be sent to the Prometheus client:

```
cpu,cpu=cpu1,host=localhost,le=0.0 usage_idle_bucket=0i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=10.0 usage_idle_bucket=0i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=20.0 usage_idle_bucket=1i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=30.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=40.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=50.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=60.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=70.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=80.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=90.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=100.0 usage_idle_bucket=2i 1486998330000000000
cpu,cpu=cpu1,host=localhost,le=+Inf usage_idle_bucket=2i 1486998330000000000
```