# ZFS plugin

This ZFS plugin provides metrics from your ZFS filesystems. It supports ZFS on
Linux and FreeBSD. It reads ZFS statistics from `/proc/spl/kstat/zfs` on Linux
and from `sysctl` and `zpool` on FreeBSD.

### Configuration:

```toml
[[inputs.zfs]]
  ## ZFS kstat path. Ignored on FreeBSD.
  ## If not specified, the default is:
  # kstatPath = "/proc/spl/kstat/zfs"

  ## By default, telegraf gathers all zfs stats.
  ## Override the stats list using the kstatMetrics array:
  ## For FreeBSD, the default is:
  # kstatMetrics = ["arcstats", "zfetchstats", "vdev_cache_stats"]
  ## For Linux, the default is:
  # kstatMetrics = ["abdstats", "arcstats", "dnodestats", "dbufcachestats",
  #     "dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"]

  ## By default, don't gather zpool stats
  # poolMetrics = false
```

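To illustrate how a kstat file under `kstatPath` could map to the field names in this document, here is a minimal Python sketch. It is not the plugin's implementation. It assumes that each kstat file starts with two header lines followed by `name type data` rows, and that the field name is the kstat file name joined to the row name with an underscore. The helper names `parse_kstat` and `prefixed` are hypothetical, and the sample header line is made up.

```python
def parse_kstat(text):
    """Parse the text of one kstat file into a {name: int} dict."""
    stats = {}
    # Assumed layout: line 1 is a kstat header, line 2 holds the column
    # titles ("name type data"); every later line is one statistic.
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, _type, value = parts
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # non-numeric values are ignored in this sketch
    return stats


def prefixed(metric, stats):
    """Prefix every statistic with the kstat file name, e.g. arcstats_hits."""
    return {f"{metric}_{name}": value for name, value in stats.items()}


# Illustrative stand-in for the contents of <kstatPath>/arcstats.
SAMPLE = """\
13 1 0x01 96 4608 1000 2000
name                            type data
hits                            4    16515778
misses                          4    3676993
"""

fields = prefixed("arcstats", parse_kstat(SAMPLE))
print(fields)
```

On a real Linux host the text would come from reading a file such as `/proc/spl/kstat/zfs/arcstats` instead of the inline sample.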
### Measurements & Fields:

By default this plugin collects metrics about ZFS internals and pools.
These metrics are either counters or sizes in bytes. They are reported
in the `zfs` measurement with the field names listed below.

If `poolMetrics` is enabled then additional metrics will be gathered for
each pool.

- zfs
    With the fields listed below.

#### ARC Stats (FreeBSD and Linux)

- arcstats_allocated (FreeBSD only)
- arcstats_anon_evict_data (Linux only)
- arcstats_anon_evict_metadata (Linux only)
- arcstats_anon_evictable_data (FreeBSD only)
- arcstats_anon_evictable_metadata (FreeBSD only)
- arcstats_anon_size
- arcstats_arc_loaned_bytes (Linux only)
- arcstats_arc_meta_limit
- arcstats_arc_meta_max
- arcstats_arc_meta_min (FreeBSD only)
- arcstats_arc_meta_used
- arcstats_arc_no_grow (Linux only)
- arcstats_arc_prune (Linux only)
- arcstats_arc_tempreserve (Linux only)
- arcstats_c
- arcstats_c_max
- arcstats_c_min
- arcstats_data_size
- arcstats_deleted
- arcstats_demand_data_hits
- arcstats_demand_data_misses
- arcstats_demand_hit_predictive_prefetch (FreeBSD only)
- arcstats_demand_metadata_hits
- arcstats_demand_metadata_misses
- arcstats_duplicate_buffers
- arcstats_duplicate_buffers_size
- arcstats_duplicate_reads
- arcstats_evict_l2_cached
- arcstats_evict_l2_eligible
- arcstats_evict_l2_ineligible
- arcstats_evict_l2_skip (FreeBSD only)
- arcstats_evict_not_enough (FreeBSD only)
- arcstats_evict_skip
- arcstats_hash_chain_max
- arcstats_hash_chains
- arcstats_hash_collisions
- arcstats_hash_elements
- arcstats_hash_elements_max
- arcstats_hdr_size
- arcstats_hits
- arcstats_l2_abort_lowmem
- arcstats_l2_asize
- arcstats_l2_cdata_free_on_write
- arcstats_l2_cksum_bad
- arcstats_l2_compress_failures
- arcstats_l2_compress_successes
- arcstats_l2_compress_zeros
- arcstats_l2_evict_l1cached (FreeBSD only)
- arcstats_l2_evict_lock_retry
- arcstats_l2_evict_reading
- arcstats_l2_feeds
- arcstats_l2_free_on_write
- arcstats_l2_hdr_size
- arcstats_l2_hits
- arcstats_l2_io_error
- arcstats_l2_misses
- arcstats_l2_read_bytes
- arcstats_l2_rw_clash
- arcstats_l2_size
- arcstats_l2_write_buffer_bytes_scanned (FreeBSD only)
- arcstats_l2_write_buffer_iter (FreeBSD only)
- arcstats_l2_write_buffer_list_iter (FreeBSD only)
- arcstats_l2_write_buffer_list_null_iter (FreeBSD only)
- arcstats_l2_write_bytes
- arcstats_l2_write_full (FreeBSD only)
- arcstats_l2_write_in_l2 (FreeBSD only)
- arcstats_l2_write_io_in_progress (FreeBSD only)
- arcstats_l2_write_not_cacheable (FreeBSD only)
- arcstats_l2_write_passed_headroom (FreeBSD only)
- arcstats_l2_write_pios (FreeBSD only)
- arcstats_l2_write_spa_mismatch (FreeBSD only)
- arcstats_l2_write_trylock_fail (FreeBSD only)
- arcstats_l2_writes_done
- arcstats_l2_writes_error
- arcstats_l2_writes_hdr_miss (Linux only)
- arcstats_l2_writes_lock_retry (FreeBSD only)
- arcstats_l2_writes_sent
- arcstats_memory_direct_count (Linux only)
- arcstats_memory_indirect_count (Linux only)
- arcstats_memory_throttle_count
- arcstats_meta_size (Linux only)
- arcstats_mfu_evict_data (Linux only)
- arcstats_mfu_evict_metadata (Linux only)
- arcstats_mfu_ghost_evict_data (Linux only)
- arcstats_mfu_ghost_evict_metadata (Linux only)
- arcstats_metadata_size (FreeBSD only)
- arcstats_mfu_evictable_data (FreeBSD only)
- arcstats_mfu_evictable_metadata (FreeBSD only)
- arcstats_mfu_ghost_evictable_data (FreeBSD only)
- arcstats_mfu_ghost_evictable_metadata (FreeBSD only)
- arcstats_mfu_ghost_hits
- arcstats_mfu_ghost_size
- arcstats_mfu_hits
- arcstats_mfu_size
- arcstats_misses
- arcstats_mru_evict_data (Linux only)
- arcstats_mru_evict_metadata (Linux only)
- arcstats_mru_ghost_evict_data (Linux only)
- arcstats_mru_ghost_evict_metadata (Linux only)
- arcstats_mru_evictable_data (FreeBSD only)
- arcstats_mru_evictable_metadata (FreeBSD only)
- arcstats_mru_ghost_evictable_data (FreeBSD only)
- arcstats_mru_ghost_evictable_metadata (FreeBSD only)
- arcstats_mru_ghost_hits
- arcstats_mru_ghost_size
- arcstats_mru_hits
- arcstats_mru_size
- arcstats_mutex_miss
- arcstats_other_size
- arcstats_p
- arcstats_prefetch_data_hits
- arcstats_prefetch_data_misses
- arcstats_prefetch_metadata_hits
- arcstats_prefetch_metadata_misses
- arcstats_recycle_miss (Linux only)
- arcstats_size
- arcstats_sync_wait_for_async (FreeBSD only)

#### Zfetch Stats (FreeBSD and Linux)

- zfetchstats_bogus_streams (Linux only)
- zfetchstats_colinear_hits (Linux only)
- zfetchstats_colinear_misses (Linux only)
- zfetchstats_hits
- zfetchstats_max_streams (FreeBSD only)
- zfetchstats_misses
- zfetchstats_reclaim_failures (Linux only)
- zfetchstats_reclaim_successes (Linux only)
- zfetchstats_streams_noresets (Linux only)
- zfetchstats_streams_resets (Linux only)
- zfetchstats_stride_hits (Linux only)
- zfetchstats_stride_misses (Linux only)

#### Vdev Cache Stats (FreeBSD)

- vdev_cache_stats_delegations
- vdev_cache_stats_hits
- vdev_cache_stats_misses

#### Pool Metrics (optional)

On Linux (reference: kstat accumulated time and queue length statistics):

- zfs_pool
    - nread (integer, bytes)
    - nwritten (integer, bytes)
    - reads (integer, count)
    - writes (integer, count)
    - wtime (integer, nanoseconds)
    - wlentime (integer, queuelength * nanoseconds)
    - wupdate (integer, timestamp)
    - rtime (integer, nanoseconds)
    - rlentime (integer, queuelength * nanoseconds)
    - rupdate (integer, timestamp)
    - wcnt (integer, count)
    - rcnt (integer, count)

On FreeBSD:

- zfs_pool
    - allocated (integer, bytes)
    - capacity (integer, percent)
    - dedupratio (float, ratio)
    - free (integer, bytes)
    - size (integer, bytes)
    - fragmentation (integer, percent)

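The Linux pool fields are accumulated values, so they are most useful as differences between two samples. The sketch below shows one way to turn two samples into rates. It assumes the usual kstat I/O conventions: `rtime` accumulates the time with at least one active request, and `rlentime` and `wlentime` accumulate queue length multiplied by time. The function name `pool_io_rates` and the sample values are hypothetical.

```python
def pool_io_rates(prev, curr, elapsed_ns):
    """Derive per-interval rates from two zfs_pool samples.

    prev and curr are dicts of the Linux zfs_pool fields; elapsed_ns is
    the time between the two samples in nanoseconds.
    """
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")

    def delta(key):
        return curr[key] - prev[key]

    seconds = elapsed_ns / 1e9
    return {
        "read_bytes_per_sec": delta("nread") / seconds,
        "write_bytes_per_sec": delta("nwritten") / seconds,
        "reads_per_sec": delta("reads") / seconds,
        "writes_per_sec": delta("writes") / seconds,
        # Fraction of the interval with at least one request in service
        # (assumption: rtime follows the kstat "run time" convention).
        "busy_fraction": delta("rtime") / elapsed_ns,
        # Time-weighted average queue lengths over the interval.
        "avg_run_queue_len": delta("rlentime") / elapsed_ns,
        "avg_wait_queue_len": delta("wlentime") / elapsed_ns,
    }


# Made-up samples taken 10 seconds apart.
PREV = dict(nread=0, nwritten=0, reads=0, writes=0,
            rtime=0, rlentime=0, wlentime=0)
CURR = dict(nread=10_000_000, nwritten=5_000_000, reads=100, writes=50,
            rtime=2_500_000_000, rlentime=5_000_000_000,
            wlentime=1_000_000_000)

rates = pool_io_rates(PREV, CURR, elapsed_ns=10_000_000_000)
print(rates)
```

In a time-series database the same result is usually obtained with a derivative or rate function over the stored counters rather than with client-side code.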
### Tags:

- ZFS stats (`zfs`) will have the following tag:
    - pools - A `::` concatenated list of all ZFS pools on the machine.

- Pool metrics (`zfs_pool`) will have the following tags:
    - pool - the name of the pool the metrics are for.
    - health - the health status of the pool. (FreeBSD only)

### Example Output:

```
$ ./telegraf --config telegraf.conf --input-filter zfs --test
* Plugin: zfs, Collection 1
> zfs_pool,health=ONLINE,pool=zroot allocated=1578590208i,capacity=2i,dedupratio=1,fragmentation=1i,free=64456531968i,size=66035122176i 1464473103625653908
> zfs,pools=zroot arcstats_allocated=4167764i,arcstats_anon_evictable_data=0i,arcstats_anon_evictable_metadata=0i,arcstats_anon_size=16896i,arcstats_arc_meta_limit=10485760i,arcstats_arc_meta_max=115269568i,arcstats_arc_meta_min=8388608i,arcstats_arc_meta_used=51977456i,arcstats_c=16777216i,arcstats_c_max=41943040i,arcstats_c_min=16777216i,arcstats_data_size=0i,arcstats_deleted=1699340i,arcstats_demand_data_hits=14836131i,arcstats_demand_data_misses=2842945i,arcstats_demand_hit_predictive_prefetch=0i,arcstats_demand_metadata_hits=1655006i,arcstats_demand_metadata_misses=830074i,arcstats_duplicate_buffers=0i,arcstats_duplicate_buffers_size=0i,arcstats_duplicate_reads=123i,arcstats_evict_l2_cached=0i,arcstats_evict_l2_eligible=332172623872i,arcstats_evict_l2_ineligible=6168576i,arcstats_evict_l2_skip=0i,arcstats_evict_not_enough=12189444i,arcstats_evict_skip=195190764i,arcstats_hash_chain_max=2i,arcstats_hash_chains=10i,arcstats_hash_collisions=43134i,arcstats_hash_elements=2268i,arcstats_hash_elements_max=6136i,arcstats_hdr_size=565632i,arcstats_hits=16515778i,arcstats_l2_abort_lowmem=0i,arcstats_l2_asize=0i,arcstats_l2_cdata_free_on_write=0i,arcstats_l2_cksum_bad=0i,arcstats_l2_compress_failures=0i,arcstats_l2_compress_successes=0i,arcstats_l2_compress_zeros=0i,arcstats_l2_evict_l1cached=0i,arcstats_l2_evict_lock_retry=0i,arcstats_l2_evict_reading=0i,arcstats_l2_feeds=0i,arcstats_l2_free_on_write=0i,arcstats_l2_hdr_size=0i,arcstats_l2_hits=0i,arcstats_l2_io_error=0i,arcstats_l2_misses=0i,arcstats_l2_read_bytes=0i,arcstats_l2_rw_clash=0i,arcstats_l2_size=0i,arcstats_l2_write_buffer_bytes_scanned=0i,arcstats_l2_write_buffer_iter=0i,arcstats_l2_write_buffer_list_iter=0i,arcstats_l2_write_buffer_list_null_iter=0i,arcstats_l2_write_bytes=0i,arcstats_l2_write_full=0i,arcstats_l2_write_in_l2=0i,arcstats_l2_write_io_in_progress=0i,arcstats_l2_write_not_cacheable=380i,arcstats_l2_write_passed_headroom=0i,arcstats_l2_write_pios=0i,arcstats_l2_write_spa_mismatch=0i,arcstats_l2_write_trylock_fail=0i,arcstats_l2_writes_done=0i,arcstats_l2_writes_error=0i,arcstats_l2_writes_lock_retry=0i,arcstats_l2_writes_sent=0i,arcstats_memory_throttle_count=0i,arcstats_metadata_size=17014784i,arcstats_mfu_evictable_data=0i,arcstats_mfu_evictable_metadata=16384i,arcstats_mfu_ghost_evictable_data=5723648i,arcstats_mfu_ghost_evictable_metadata=10709504i,arcstats_mfu_ghost_hits=1315619i,arcstats_mfu_ghost_size=16433152i,arcstats_mfu_hits=7646611i,arcstats_mfu_size=305152i,arcstats_misses=3676993i,arcstats_mru_evictable_data=0i,arcstats_mru_evictable_metadata=0i,arcstats_mru_ghost_evictable_data=0i,arcstats_mru_ghost_evictable_metadata=80896i,arcstats_mru_ghost_hits=324250i,arcstats_mru_ghost_size=80896i,arcstats_mru_hits=8844526i,arcstats_mru_size=16693248i,arcstats_mutex_miss=354023i,arcstats_other_size=34397040i,arcstats_p=4172800i,arcstats_prefetch_data_hits=0i,arcstats_prefetch_data_misses=0i,arcstats_prefetch_metadata_hits=24641i,arcstats_prefetch_metadata_misses=3974i,arcstats_size=51977456i,arcstats_sync_wait_for_async=0i,vdev_cache_stats_delegations=779i,vdev_cache_stats_hits=323123i,vdev_cache_stats_misses=59929i,zfetchstats_hits=0i,zfetchstats_max_streams=0i,zfetchstats_misses=0i 1464473103634124908
```

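Each `>` line in the example output consists of a measurement name with its tags, a comma-separated field list, and a nanosecond timestamp. As a rough sketch of how such a line can be taken apart, the hypothetical `parse_line` helper below handles only the simple shape seen in this example: it assumes there are no escaped spaces or commas and no quoted string fields, so it is not a general line protocol parser.

```python
def parse_line(line):
    """Split one simple output line into (measurement, tags, fields, timestamp)."""
    if line.startswith("> "):
        line = line[2:]  # drop the marker printed by --test
    head, field_part, timestamp = line.split(" ")

    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)

    fields = {}
    for pair in field_part.split(","):
        key, raw = pair.split("=", 1)
        # A trailing "i" marks an integer; anything else is read as a float.
        fields[key] = int(raw[:-1]) if raw.endswith("i") else float(raw)

    return measurement, tags, fields, int(timestamp)


LINE = ("> zfs_pool,health=ONLINE,pool=zroot "
        "allocated=1578590208i,capacity=2i,dedupratio=1,fragmentation=1i,"
        "free=64456531968i,size=66035122176i 1464473103625653908")

measurement, tags, fields, timestamp = parse_line(LINE)
print(measurement, tags)
print(fields["allocated"] + fields["free"] == fields["size"])

# The `pools` tag of the `zfs` measurement is a "::"-joined list.
pool_names = "zroot".split("::")
```

In this sample, `allocated` plus `free` equals `size`, which is a quick sanity check on the parsed pool fields.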
### Description

A short description of some of the metrics.

#### ARC Stats

`arcstats_hits` Total number of cache hits in the ARC.

`arcstats_misses` Total number of cache misses in the ARC.

`arcstats_demand_data_hits` Number of cache hits for demand data. This is what matters (is good) for your application/share.

`arcstats_demand_data_misses` Number of cache misses for demand data. This is what matters (is bad) for your application/share.

`arcstats_demand_metadata_hits` Number of cache hits for demand metadata. This matters (is good) for getting filesystem data (ls, find, …).

`arcstats_demand_metadata_misses` Number of cache misses for demand metadata. This matters (is bad) for getting filesystem data (ls, find, …).

`arcstats_prefetch_data_hits` The ZFS prefetcher tried to prefetch something, but it was already cached (boring).

`arcstats_prefetch_data_misses` The ZFS prefetcher prefetched something which was not in the cache (good job, could become a demand hit in the future).

`arcstats_prefetch_metadata_hits` Same as `arcstats_prefetch_data_hits`, but for metadata.

`arcstats_prefetch_metadata_misses` Same as `arcstats_prefetch_data_misses`, but for metadata.

`arcstats_mru_hits` Cache hit in the “most recently used cache”; the item is moved to the MFU cache.

`arcstats_mru_ghost_hits` Cache hit in the “most recently used ghost list”: we had this item in the cache but evicted it. Maybe we should increase the MRU cache size.

`arcstats_mfu_hits` Cache hit in the “most frequently used cache”; the item is moved to the beginning of the MFU cache.

`arcstats_mfu_ghost_hits` Cache hit in the “most frequently used ghost list”: we had this item in the cache but evicted it. Maybe we should increase the MFU cache size.

`arcstats_allocated` New data is written to the cache.

`arcstats_deleted` Old data is evicted (deleted) from the cache.

`arcstats_evict_l2_cached` We evicted something from the ARC, but it's still cached in the L2 if we need it.

`arcstats_evict_l2_eligible` We evicted something from the ARC, and it's not in the L2, which is sad (maybe we did not have enough time to store it there).

`arcstats_evict_l2_ineligible` We evicted something which cannot be stored in the L2.
 Reasons could be:
 - We have multiple pools, and we evicted something from a pool without an L2 device.
 - The ZFS property `secondarycache`.

`arcstats_c` ARC target size. This is the size the system thinks the ARC should have.

`arcstats_size` Total size of the ARC.

`arcstats_l2_hits` Hits to the L2 cache. (It was not in the ARC, but in the L2 cache.)

`arcstats_l2_misses` Misses to the L2 cache. (It was not in the ARC, and not in the L2 cache.)

`arcstats_l2_size` Size of the L2 cache.

`arcstats_l2_hdr_size` Size of the metadata in the ARC (RAM) used to manage the L2 cache (to look up whether something is in the L2).

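The hit and miss counters are commonly combined into hit ratios. The following sketch, with the hypothetical helpers `ratio` and `arc_summary`, shows the arithmetic using values copied from the example output in this document. Because these fields are counters, a ratio computed from a single sample covers the whole lifetime of the counters; for a ratio over an interval, apply the same formula to the difference between two samples.

```python
def ratio(hits, misses):
    """Return hits / (hits + misses), or None when there is no traffic."""
    total = hits + misses
    return hits / total if total else None


def arc_summary(fields):
    """Compute a few derived ARC figures from a dict of `zfs` fields."""
    target = fields["arcstats_c"]
    return {
        "hit_ratio": ratio(fields["arcstats_hits"],
                           fields["arcstats_misses"]),
        "demand_data_hit_ratio": ratio(fields["arcstats_demand_data_hits"],
                                       fields["arcstats_demand_data_misses"]),
        # How large the ARC currently is relative to its target size.
        "size_over_target": fields["arcstats_size"] / target if target else None,
    }


# Values taken from the example output.
SAMPLE = {
    "arcstats_hits": 16515778,
    "arcstats_misses": 3676993,
    "arcstats_demand_data_hits": 14836131,
    "arcstats_demand_data_misses": 2842945,
    "arcstats_size": 51977456,
    "arcstats_c": 16777216,
}

summary = arc_summary(SAMPLE)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The same `ratio` helper can be applied to `arcstats_l2_hits` and `arcstats_l2_misses` when an L2 device is present; it returns `None` when both counters are zero, as they are in the example output.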
#### Zfetch Stats

`zfetchstats_hits` Counts the number of cache hits on items which are in the cache because of the prefetcher.

`zfetchstats_misses` Counts the number of prefetch cache misses.

`zfetchstats_colinear_hits` Counts the number of cache hits on items which are in the cache because of the prefetcher (prefetched linear reads).

`zfetchstats_stride_hits` Counts the number of cache hits on items which are in the cache because of the prefetcher (prefetched stride reads).

#### Vdev Cache Stats (FreeBSD only)

Note: the vdev cache is deprecated in some ZFS implementations.

`vdev_cache_stats_hits` Hits to the vdev (device level) cache.

`vdev_cache_stats_misses` Misses to the vdev (device level) cache.

#### ABD Stats (Linux Only)

ABD is a linear/scatter dual-typed buffer for the ARC.

`abdstats_linear_cnt` Number of linear ABDs which are currently allocated.

`abdstats_linear_data_size` Amount of data stored in all linear ABDs.

`abdstats_scatter_cnt` Number of scatter ABDs which are currently allocated.

`abdstats_scatter_data_size` Amount of data stored in all scatter ABDs.

#### DMU Stats (Linux Only)

`dmu_tx_dirty_throttle` Counts when writes are throttled because the amount of dirty data has grown too large.

`dmu_tx_memory_reclaim` Counts when memory is low and activity is being throttled.

`dmu_tx_memory_reserve` Counts when the memory footprint of the txg exceeds the ARC size.

#### Fault Management Ereport errors (Linux Only)

`fm_erpt-dropped` Counts when an error report cannot be created (e.g. available memory is too low).

#### ZIL (Linux Only)

Note: ZIL measurements are system-wide, neither per-pool nor per-dataset.

`zil_commit_count` Counts when ZFS transactions are committed to a ZIL.