telegraf/plugins/inputs/dcos
Daniel Nelson 2ce21bff24 Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00
..
README.md Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00
client.go Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00
client_test.go Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00
creds.go Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00
dcos.go Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00
dcos_test.go Add input plugin for DC/OS (#3519) 2017-11-29 11:50:32 -08:00

README.md

DC/OS Input Plugin

This input plugin gathers metrics from a DC/OS cluster's metrics component.

Series Cardinality Warning

Depending on the work load of your DC/OS cluster, this plugin can quickly create a high number of series which, when unchecked, can cause high load on your database.

  • Use measurement filtering liberally to exclude unneeded metrics as well as the node, container, and app inclue/exclude options.
  • Write to a database with an appropriate retention policy.
  • Limit the number of series allowed in your database using the max-series-per-database and max-values-per-tag settings.
  • Consider enabling the TSI engine.
  • Monitor your series cardinality.

Configuration:

[[inputs.dcos]]
  ## The DC/OS cluster URL.
  cluster_url = "https://dcos-master-1"

  ## The ID of the service account.
  service_account_id = "telegraf"
  ## The private key file for the service account.
  service_account_private_key = "/etc/telegraf/telegraf-sa-key.pem"

  ## Path containing login token.  If set, will read on every gather.
  # token_file = "/home/dcos/.dcos/token"

  ## In all filter options if both include and exclude are empty all items
  ## will be collected.  Arrays may contain glob patterns.
  ##
  ## Node IDs to collect metrics from.  If a node is excluded, no metrics will
  ## be collected for its containers or apps.
  # node_include = []
  # node_exclude = []
  ## Container IDs to collect container metrics from.
  # container_include = []
  # container_exclude = []
  ## Container IDs to collect app metrics from.
  # app_include = []
  # app_exclude = []

  ## Maximum concurrent connections to the cluster.
  # max_connections = 10
  ## Maximum time to receive a response from cluster.
  # response_timeout = "20s"

  ## Optional SSL Config
  # ssl_ca = "/etc/telegraf/ca.pem"
  # ssl_cert = "/etc/telegraf/cert.pem"
  # ssl_key = "/etc/telegraf/key.pem"
  ## If false, skip chain & host verification
  # insecure_skip_verify = true

  ## Recommended filtering to reduce series cardinality.
  # [inputs.dcos.tagdrop]
  #   path = ["/var/lib/mesos/slave/slaves/*"]

Enterprise Authentication

When using Enterprise DC/OS, it is recommended to use a service account to authenticate with the cluster.

The plugin requires the following permissions:

dcos:adminrouter:ops:system-metrics full
dcos:adminrouter:ops:mesos full

Follow the directions to create a service account and assign permissions.

Quick configuration using the Enterprise CLI:

dcos security org service-accounts keypair telegraf-sa-key.pem telegraf-sa-cert.pem
dcos security org service-accounts create -p telegraf-sa-cert.pem -d "Telegraf DC/OS input plugin" telegraf
dcos security org users grant telegraf dcos:adminrouter:ops:system-metrics full
dcos security org users grant telegraf dcos:adminrouter:ops:mesos full

Open Source Authentication

The Open Source DC/OS does not provide service accounts. Instead you can use of the following options:

  1. Disable authentication
  2. Use the token_file parameter to read a authentication token from a file.

Then token_file can be set by using the [dcos cli] to login periodically. The cli can login for at most XXX days, you will need to ensure the cli performs a new login before this time expires.

dcos auth login --username foo --password bar
dcos config show core.dcos_acs_token > ~/.dcos/token

Another option to create a token_file is to generate a token using the cluster secret. This will allow you to set the expiration date manually or even create a never expiring token. However, if the cluster secret or the token is compromised it cannot be revoked and may require a full reinstall of the cluster. For more information on this technique reference this blog post.

Metrics:

Please consult the Metrics Reference for details on interprete field interpretation.

  • dcos_node

    • tags:
      • cluster
      • hostname
      • path (filesystem fields only)
      • interface (network fields only)
    • fields:
      • system_uptime (float)
      • cpu_cores (float)
      • cpu_total (float)
      • cpu_user (float)
      • cpu_system (float)
      • cpu_idle (float)
      • cpu_wait (float)
      • load_1min (float)
      • load_5min (float)
      • load_15min (float)
      • filesystem_capacity_total_bytes (int)
      • filesystem_capacity_used_bytes (int)
      • filesystem_capacity_free_bytes (int)
      • filesystem_inode_total (float)
      • filesystem_inode_used (float)
      • filesystem_inode_free (float)
      • memory_total_bytes (int)
      • memory_free_bytes (int)
      • memory_buffers_bytes (int)
      • memory_cached_bytes (int)
      • swap_total_bytes (int)
      • swap_free_bytes (int)
      • swap_used_bytes (int)
      • network_in_bytes (int)
      • network_out_bytes (int)
      • network_in_packets (float)
      • network_out_packets (float)
      • network_in_dropped (float)
      • network_out_dropped (float)
      • network_in_errors (float)
      • network_out_errors (float)
      • process_count (float)
  • dcos_container

    • tags:
      • cluster
      • hostname
      • container_id
      • task_name
    • fields:
      • cpus_limit (float)
      • cpus_system_time (float)
      • cpus_throttled_time (float)
      • cpus_user_time (float)
      • disk_limit_bytes (int)
      • disk_used_bytes (int)
      • mem_limit_bytes (int)
      • mem_total_bytes (int)
      • net_rx_bytes (int)
      • net_rx_dropped (float)
      • net_rx_errors (float)
      • net_rx_packets (float)
      • net_tx_bytes (int)
      • net_tx_dropped (float)
      • net_tx_errors (float)
      • net_tx_packets (float)
  • dcos_app

    • tags:
      • cluster
      • hostname
      • container_id
      • task_name
    • fields:
      • fields are application specific

Example Output:

dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/boot filesystem_capacity_free_bytes=918188032i,filesystem_capacity_total_bytes=1063256064i,filesystem_capacity_used_bytes=145068032i,filesystem_inode_free=523958,filesystem_inode_total=524288,filesystem_inode_used=330 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=dummy0 network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=docker0 network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18 cpu_cores=2,cpu_idle=81.62,cpu_system=4.19,cpu_total=13.670000000000002,cpu_user=9.48,cpu_wait=0,load_15min=0.7,load_1min=0.22,load_5min=0.6,memory_buffers_bytes=970752i,memory_cached_bytes=1830473728i,memory_free_bytes=1178636288i,memory_total_bytes=3975073792i,process_count=198,swap_free_bytes=859828224i,swap_total_bytes=859828224i,swap_used_bytes=0i,system_uptime=18874 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=lo network_in_bytes=1090992450i,network_in_dropped=0,network_in_errors=0,network_in_packets=1546938,network_out_bytes=1090992450i,network_out_dropped=0,network_out_errors=0,network_out_packets=1546938 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/ filesystem_capacity_free_bytes=1668378624i,filesystem_capacity_total_bytes=6641680384i,filesystem_capacity_used_bytes=4973301760i,filesystem_inode_free=3107856,filesystem_inode_total=3248128,filesystem_inode_used=140272 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=minuteman network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=210i,network_out_dropped=0,network_out_errors=0,network_out_packets=3 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=eth0 network_in_bytes=539886216i,network_in_dropped=1,network_in_errors=0,network_in_packets=979808,network_out_bytes=112395836i,network_out_dropped=0,network_out_errors=0,network_out_packets=891239 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=spartan network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=210i,network_out_dropped=0,network_out_errors=0,network_out_packets=3 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/var/lib/docker/overlay filesystem_capacity_free_bytes=1668378624i,filesystem_capacity_total_bytes=6641680384i,filesystem_capacity_used_bytes=4973301760i,filesystem_inode_free=3107856,filesystem_inode_total=3248128,filesystem_inode_used=140272 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=vtep1024 network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/var/lib/docker/plugins filesystem_capacity_free_bytes=1668378624i,filesystem_capacity_total_bytes=6641680384i,filesystem_capacity_used_bytes=4973301760i,filesystem_inode_free=3107856,filesystem_inode_total=3248128,filesystem_inode_used=140272 1511859222000000000
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=d-dcos network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000
dcos_app,cluster=enterprise,container_id=9a78d34a-3bbf-467e-81cf-a57737f154ee,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000
dcos_container,cluster=enterprise,container_id=cbf19b77-3b8d-4bcf-b81f-824b67279629,hostname=192.168.122.18 cpus_limit=0.3,cpus_system_time=307.31,cpus_throttled_time=102.029930607,cpus_user_time=268.57,disk_limit_bytes=268435456i,disk_used_bytes=30953472i,mem_limit_bytes=570425344i,mem_total_bytes=13316096i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000
dcos_app,cluster=enterprise,container_id=cbf19b77-3b8d-4bcf-b81f-824b67279629,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000
dcos_container,cluster=enterprise,container_id=5725e219-f66e-40a8-b3ab-519d85f4c4dc,hostname=192.168.122.18,task_name=hello-world cpus_limit=0.6,cpus_system_time=25.6,cpus_throttled_time=327.977109217,cpus_user_time=566.54,disk_limit_bytes=0i,disk_used_bytes=0i,mem_limit_bytes=1107296256i,mem_total_bytes=335941632i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000
dcos_app,cluster=enterprise,container_id=5725e219-f66e-40a8-b3ab-519d85f4c4dc,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000
dcos_app,cluster=enterprise,container_id=c76e1488-4fb7-4010-a4cf-25725f8173f9,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000
dcos_container,cluster=enterprise,container_id=cbe0b2f9-061f-44ac-8f15-4844229e8231,hostname=192.168.122.18,task_name=telegraf cpus_limit=0.2,cpus_system_time=8.109999999,cpus_throttled_time=93.183916045,cpus_user_time=17.97,disk_limit_bytes=0i,disk_used_bytes=0i,mem_limit_bytes=167772160i,mem_total_bytes=0i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000
dcos_container,cluster=enterprise,container_id=b64115de-3d2a-431d-a805-76e7c46453f1,hostname=192.168.122.18 cpus_limit=0.2,cpus_system_time=2.69,cpus_throttled_time=20.064861214,cpus_user_time=6.56,disk_limit_bytes=268435456i,disk_used_bytes=29360128i,mem_limit_bytes=297795584i,mem_total_bytes=13733888i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000
dcos_app,cluster=enterprise,container_id=b64115de-3d2a-431d-a805-76e7c46453f1,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000