# Zipkin Plugin

This plugin implements the Zipkin HTTP server to gather the trace and timing data needed to troubleshoot latency problems in microservice architectures.

*Please note: this plugin is experimental. Its data schema may change
based on its main use cases and the evolution of the OpenTracing standard.*

## Configuration:
```toml
[[inputs.zipkin]]
    path = "/api/v1/spans" # URL path for span data
    port = 9411 # Port on which Telegraf listens
```

The plugin accepts spans in `JSON` or `thrift` format when the `Content-Type` is `application/json` or `application/x-thrift`, respectively.
If `Content-Type` is not set, the plugin assumes `JSON` format.
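
As a rough illustration of the content-type handling described above, the following Python sketch builds (but does not send) a POST request carrying JSON spans. The helper name `build_span_request` and the `localhost` host are assumptions made for illustration; the path and port are the values from the configuration above.

```python
import json
import urllib.request


def build_span_request(spans, host="localhost", port=9411, path="/api/v1/spans"):
    """Build a POST request that carries a list of Zipkin spans as JSON."""
    body = json.dumps(spans).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}{path}",
        data=body,
        # Explicit Content-Type so the receiver decodes the body as JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    span = {"traceId": "bd7a977555f6b982", "id": "be2d01e33cc78d97", "name": "query"}
    req = build_span_request([span])
    print(req.get_method(), req.full_url)
    # To send it to a running Telegraf instance: urllib.request.urlopen(req)
```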

## Tracing:

This plugin uses annotation tags and fields to track data from spans.

- __TRACE:__ a set of spans that share a single root span.
Traces are built by collecting all spans that share a `traceId`.

- __SPAN:__ a set of Annotations and BinaryAnnotations that correspond to a particular RPC.

- __Annotations:__ record an occurrence in time at the beginning and end of a request. One metric is output for each annotation and binary annotation of a span.

  Annotations may have the following values:

    - __CS (client start):__ beginning of span; the request is made.
    - __SR (server receive):__ the server receives the request and starts processing it.
      Network latency and clock jitter account for the difference from CS.
    - __SS (server send):__ the server is done processing and sends the response back to the client.
      The time taken to process the request accounts for the difference from SR.
    - __CR (client receive):__ end of span; the client receives the response from the server.
      The RPC is considered complete with this annotation.
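
The relationship between the four annotation values can be sketched in a few lines of Python. This is only an illustration of the arithmetic implied by the list above, not something the plugin computes; the function name `split_rpc_timings` is hypothetical.

```python
def split_rpc_timings(timestamps):
    """Derive coarse timings from the four core annotation timestamps.

    `timestamps` maps annotation values ("cs", "sr", "ss", "cr") to
    timestamps; the result is in the same unit as the input.
    """
    total = timestamps["cr"] - timestamps["cs"]   # whole RPC as seen by the client
    server = timestamps["ss"] - timestamps["sr"]  # time spent processing on the server
    return {
        "total": total,
        "server": server,
        # The remainder is network transit in both directions, plus any clock skew.
        "network": total - server,
    }


if __name__ == "__main__":
    print(split_rpc_timings({"cs": 1000, "sr": 1100, "ss": 1400, "cr": 1550}))
```

Because the client and server clocks are independent, the `network` figure is only as good as their synchronization.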

### Tags
* __"id":__               The 64-bit ID of the span.
* __"parent_id":__        The ID of the span's parent. If the span has no parent, the parent ID is set to the span's own ID.
* __"trace_id":__         The 64- or 128-bit ID of a particular trace. Every span in a trace shares this ID. It is the concatenation of the high and low parts, converted to hexadecimal.
* __"name":__             The name of the span.
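
To make the `trace_id` description concrete, here is a minimal Python sketch of joining the high and low 64-bit halves into one hexadecimal string. The function name `format_trace_id` is hypothetical, and zero-padding the low half to 16 hex digits is an assumption about how the two halves are joined.

```python
def format_trace_id(high, low):
    """Render a trace ID as lowercase hex from its high and low 64-bit halves."""
    if high == 0:
        # 64-bit trace ID: only the low half is present.
        return format(low, "x")
    # 128-bit trace ID: high half first, then the low half.
    # Assumption: the low half is zero-padded to 16 hex digits.
    return format(high, "x") + format(low, "016x")


if __name__ == "__main__":
    print(format_trace_id(0, 0xBD7A977555F6B982))
    print(format_trace_id(0x1, 0x2))
```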

##### Annotations have these additional tags:

* __"service_name":__     Defines a service
* __"annotation":__       The value of an annotation
* __"endpoint_host":__    The IPv4 address concatenated with the listening port; if the port is not present, only the address is used

##### Binary Annotations have these additional tags:

  * __"service_name":__     Defines a service
  * __"annotation":__       The value of an annotation
  * __"endpoint_host":__    The IPv4 address concatenated with the listening port; if the port is not present, only the address is used
  * __"annotation_key":__   Label describing the annotation


### Fields:
  * __"duration_ns":__ The time in nanoseconds between the end and beginning of a span.


### Sample Queries:

__Get All Span Names for Service__ `my_web_server`
```sql
SHOW TAG VALUES FROM "zipkin" WITH KEY = "name" WHERE "service_name" = 'my_web_server'
```
  - __Description:__  returns a list containing the names of the spans which have annotations with the given `service_name` of `my_web_server`.

__Get All Service Names__
```sql
SHOW TAG VALUES FROM "zipkin" WITH KEY = "service_name"
```
  - __Description:__  returns a list of all `distinct` endpoint service names.

__Find spans with longest duration__
```sql
SELECT max("duration_ns") FROM "zipkin" WHERE "service_name" = 'my_service' AND "name" = 'my_span_name' AND time > now() - 20m GROUP BY "trace_id",time(30s) LIMIT 5
```
  - __Description:__  In the last 20 minutes, find the top 5 longest span durations for service `my_service` and span name `my_span_name`.
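
For running such a query programmatically, the Python sketch below builds a URL for the InfluxDB 1.x HTTP `/query` endpoint. The function name `longest_spans_query_url`, the `localhost:8086` address, and the `telegraf` database name are assumptions; the values are interpolated without escaping, so only trusted strings should be passed.

```python
from urllib.parse import urlencode


def longest_spans_query_url(service, span, base="http://localhost:8086", db="telegraf"):
    """Build an InfluxDB 1.x /query URL for the longest-duration query."""
    query = (
        'SELECT max("duration_ns") FROM "zipkin" '
        f"WHERE \"service_name\" = '{service}' AND \"name\" = '{span}' "
        'AND time > now() - 20m GROUP BY "trace_id",time(30s) LIMIT 5'
    )
    # urlencode percent-encodes the query text for use in the URL.
    return f"{base}/query?" + urlencode({"db": db, "q": query})


if __name__ == "__main__":
    print(longest_spans_query_url("my_service", "my_span_name"))
```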


### Recommended InfluxDB setup

This plugin creates high-cardinality data, so we recommend using the [TSI InfluxDB engine](https://www.influxdata.com/path-1-billion-time-series-influxdb-high-cardinality-indexing-ready-testing/).

#### How To Set Up InfluxDB To Work With Zipkin

  ##### Steps
  1. ___Update___ InfluxDB to >= 1.3 in order to use the new TSI engine.

  2. ___Generate___ a config file with the following command:
```sh
influxd config > /path/for/config/file
```
  3. ___Add___ the following to your config file, under the `[data]` section:
```toml
[data]
  index-version = "tsi1"
```

  4. ___Start___ `influxd` with your new config file:
```sh
influxd -config=/path/to/your/config/file
```

  5. ___Update___ your retention policy:
```sql
ALTER RETENTION POLICY "autogen" ON "telegraf" DURATION 1d SHARD DURATION 30m
```

### Example Input Trace:

- [Cli microservice with two services Test](https://github.com/openzipkin/zipkin-go-opentracing/tree/master/examples/cli_with_2_services)
- [Test data from distributed trace repo sample json](https://github.com/mattkanwisher/distributedtrace/blob/master/testclient/sample.json)
#### [Trace Example from Zipkin model](http://zipkin.io/pages/data_model.html)
```json
{
  "traceId": "bd7a977555f6b982",
  "name": "query",
  "id": "be2d01e33cc78d97",
  "parentId": "ebf33e1a81dc6f71",
  "timestamp": 1458702548786000,
  "duration": 13000,
  "annotations": [
    {
      "endpoint": {
        "serviceName": "zipkin-query",
        "ipv4": "192.168.1.2",
        "port": 9411
      },
      "timestamp": 1458702548786000,
      "value": "cs"
    },
    {
      "endpoint": {
        "serviceName": "zipkin-query",
        "ipv4": "192.168.1.2",
        "port": 9411
      },
      "timestamp": 1458702548799000,
      "value": "cr"
    }
  ],
  "binaryAnnotations": [
    {
      "key": "jdbc.query",
      "value": "select distinct `zipkin_spans`.`trace_id` from `zipkin_spans` join `zipkin_annotations` on (`zipkin_spans`.`trace_id` = `zipkin_annotations`.`trace_id` and `zipkin_spans`.`id` = `zipkin_annotations`.`span_id`) where (`zipkin_annotations`.`endpoint_service_name` = ? and `zipkin_spans`.`start_ts` between ? and ?) order by `zipkin_spans`.`start_ts` desc limit ?",
      "endpoint": {
        "serviceName": "zipkin-query",
        "ipv4": "192.168.1.2",
        "port": 9411
      }
    },
    {
      "key": "sa",
      "value": true,
      "endpoint": {
        "serviceName": "spanstore-jdbc",
        "ipv4": "127.0.0.1",
        "port": 3306
      }
    }
  ]
}
```
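
To tie the trace example above back to the tags and fields this plugin documents, the Python sketch below flattens one Zipkin v1 JSON span into one (tags, fields) pair per annotation. It illustrates the mapping described in this README and is not the plugin's implementation; `endpoint_tags` and `span_to_points` are hypothetical names, and the colon separator in `endpoint_host` and the rendering of non-string values are assumptions.

```python
import json


def endpoint_tags(endpoint):
    """Build the service_name and endpoint_host tags from a Zipkin endpoint."""
    host = endpoint.get("ipv4", "")
    port = endpoint.get("port")
    if port:
        # Assumption: address and port are joined with a colon.
        host = f"{host}:{port}"
    return {"service_name": endpoint.get("serviceName", ""), "endpoint_host": host}


def span_to_points(span):
    """Flatten one Zipkin v1 JSON span into (tags, fields) pairs."""
    base = {
        "id": span["id"],
        # A span without a parent reports its own ID as parent_id.
        "parent_id": span.get("parentId", span["id"]),
        "trace_id": span["traceId"],
        "name": span["name"],
    }
    # Zipkin v1 reports duration in microseconds; the field is in nanoseconds.
    fields = {"duration_ns": span.get("duration", 0) * 1000}

    points = []
    for ann in span.get("annotations", []):
        tags = {**base, **endpoint_tags(ann["endpoint"]), "annotation": ann["value"]}
        points.append((tags, dict(fields)))
    for ann in span.get("binaryAnnotations", []):
        value = ann["value"]
        tags = {
            **base,
            **endpoint_tags(ann["endpoint"]),
            # Assumption: non-string values (such as `true`) become JSON text.
            "annotation": value if isinstance(value, str) else json.dumps(value),
            "annotation_key": ann["key"],
        }
        points.append((tags, dict(fields)))
    return points
```

Applied to the trace example above, this sketch would yield four points (two annotations and two binary annotations), each with a `duration_ns` of 13000000.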