Ganglia Configuration Files Explained

Source: Internet | Editor: 程序博客网 | Time: 2024/06/11 17:53

This article describes the configuration files of Ganglia's gmetad and gmond daemons.

Gmetad

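gmetad (Ganglia Meta Daemon) is a daemon that runs on a host and collects and aggregates the metric data gathered by gmond. By default gmetad stores and aggregates metrics in RRD files; it can also be configured to forward metrics to an external system such as Graphite.
gmetad listens on TCP port 8651 for connections from remote gmetad instances and serves an XML description of the grid state to authorized hosts. It also answers interactive queries on TCP port 8652. gweb uses these interactive queries to display information that does not fit in RRD files, such as the OS version.
By default gmetad writes metric data directly to RRD files on the file system; rrdcached can act as a cache between gmetad and the RRD files.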
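
To make the XML state description more concrete, here is a minimal Python sketch that parses a small hand-written sample and lists the metrics of each host. The sample document, the element and attribute names it uses (GANGLIA_XML, GRID, CLUSTER, HOST, METRIC, NAME, VAL) and the function name metrics_by_host are assumptions for illustration; it is not captured gmetad output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real gmetad output; the element and attribute
# names are assumptions about the XML layout.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.7.2" SOURCE="gmetad">
  <GRID NAME="my grid">
    <CLUSTER NAME="my cluster">
      <HOST NAME="node1" IP="10.0.0.1">
        <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
        <METRIC NAME="os_name" VAL="Linux" TYPE="string" UNITS=""/>
      </HOST>
    </CLUSTER>
  </GRID>
</GANGLIA_XML>"""


def metrics_by_host(xml_text):
    """Return {host name: {metric name: value string}} from an XML dump."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL")
            for metric in host.iter("METRIC")
        }
    return result


if __name__ == "__main__":
    print(metrics_by_host(SAMPLE_XML))
```

Values are kept as strings because string metrics such as the OS name are exactly the kind of data that cannot be stored in an RRD file.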

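The gmetad.conf configuration file
gmetad.conf consists of single-line attributes and their values. Attribute names are case-insensitive, but attribute values are case-sensitive.
For example, the following attribute names are equivalent:
name Name NaMe NAME
Apart from a few required attributes, all attributes are optional. Some attributes may appear several times in the file; others may appear only once. The commonly used gmetad.conf attributes are described below.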

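The data_source attribute

This is the core attribute of gmetad. Each data_source line describes one gmond cluster or gmetad grid from which gmetad collects information. When gmetad detects that a data_source refers to a cluster, it keeps a complete set of round-robin databases (RRDs) for it. When it detects that a data_source refers to a grid, it keeps only summary RRDs.
Setting the scalable attribute to off forces gmetad to keep a complete set of RRD files for grid data sources as well; the default is on.
The default configuration contains the following data_source examples: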

data_source "my cluster" 10 localhost my.machine.edu:8649 1.2.3.5:8655
data_source "my grid" 50 1.3.4.7:8655 grid.org:8651 grid-backup.org:8651
data_source "another source" 1.3.4.7:8655 1.3.4.8
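Each data_source consists of three fields. The first is a string that uniquely identifies the data source. The second is the polling interval in seconds; it is optional, as the third example shows, and gmetad then falls back to its default interval of 15 seconds. The third is a space-separated list of hosts to poll, given as IP addresses or host names, each optionally followed by a port number. If no port is given, gmetad connects to tcp/8649.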
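
The three-field layout can be sketched as a small Python parser. This is only an illustration of the rules described above, not gmetad's own parser: the function name parse_data_source is made up for this example, the 15-second fallback reflects my understanding of the default, and IPv6 addresses are not handled.

```python
import shlex

DEFAULT_PORT = 8649      # used when a host has no explicit port
DEFAULT_INTERVAL = 15    # seconds, assumed when the interval is omitted


def parse_data_source(line):
    """Split one data_source line into (name, interval, [(host, port), ...])."""
    tokens = shlex.split(line)       # honours the quoted source name
    if len(tokens) < 3 or tokens[0].lower() != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    name, rest = tokens[1], tokens[2:]
    interval = DEFAULT_INTERVAL
    if rest[0].isdigit():            # optional polling interval
        interval = int(rest[0])
        rest = rest[1:]
    if not rest:
        raise ValueError("data_source %r lists no hosts" % name)
    hosts = []
    for item in rest:
        host, sep, port = item.partition(":")
        hosts.append((host, int(port) if sep else DEFAULT_PORT))
    return name, interval, hosts


if __name__ == "__main__":
    line = 'data_source "my cluster" 10 localhost my.machine.edu:8649 1.2.3.5:8655'
    print(parse_data_source(line))
```

The attribute name is compared in lower case because attribute names are case-insensitive, while the source name is returned unchanged because values are case-sensitive.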

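Attributes of the gmetad daemon:

gridname (string)
A string that uniquely identifies the grid.
authority (URL)
The authority URL of the grid, used by other gmetad instances to locate the graphs for this gmetad's data sources. Default: http://hostname/ganglia/
trusted_hosts (list of hosts)
Space-separated list of hosts with which this gmetad is allowed to share data. localhost is always trusted.
all_trusted (on|off)
When set to on, overrides trusted_hosts and allows data to be shared with any host.
setuid (on|off)
When set to off, gmetad does not change its UID.
setuid_username (user name)
Name of the user whose UID gmetad switches to. Default: nobody
xml_port (number)
Port on which gmetad listens. Default: 8651
interactive_port (number)
Port on which gmetad listens for interactive queries. Default: 8652
server_threads (number)
Number of connections that may be served simultaneously on the listening port (8651). Default: 4
case_sensitive_hostnames (0|1)
In earlier versions of gmetad, RRD file names preserved the case of host names. To keep using RRDs created by Ganglia versions before 3.2, set this value to 1. Since version 3.2 the default is 0.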
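
How trusted_hosts and all_trusted interact can be sketched in a few lines of Python. This is a simplified model of the rules stated above, not gmetad's implementation: the function name is_trusted is hypothetical, and peers are compared as plain IP strings, whereas a real configuration may also list host names.

```python
LOOPBACK = {"127.0.0.1", "::1"}


def is_trusted(peer_ip, trusted_hosts=(), all_trusted=False):
    """Decide whether a connecting peer may receive data (simplified)."""
    if all_trusted:                 # all_trusted on: trusted_hosts is ignored
        return True
    if peer_ip in LOOPBACK:         # localhost is always trusted
        return True
    return peer_ip in set(trusted_hosts)


if __name__ == "__main__":
    print(is_trusted("10.0.0.5", trusted_hosts=["10.0.0.5", "10.0.0.6"]))
    print(is_trusted("10.0.0.9", trusted_hosts=["10.0.0.5"]))
    print(is_trusted("10.0.0.9", all_trusted=True))
```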

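RRDTool-related attributes:

RRAs (text)
Customizes the round-robin archives; the default step is 15 seconds. Example: RRAs "RRA:AVERAGE:0.5:1:5856" "RRA:AVERAGE:0.5:4:20160" "RRA:AVERAGE:0.5:40:52704"
umask (number)
umask applied to the RRD files and grid directories. The default is 0, that is, the RRD files are public and accessible to any user.
rrd_rootdir (path)
Directory on the local file system where the RRD files are stored. Default: /var/lib/ganglia/rrds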
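
The RRA definitions are easier to read once their retention period is worked out. In rrdtool's RRA:CF:xff:steps:rows notation, each row consolidates `steps` primary data points, so one archive covers step × steps × rows seconds. The Python sketch below applies that arithmetic to the three example archives with the 15-second step; the function name rra_retention_seconds is made up for this illustration.

```python
def rra_retention_seconds(rra, step=15):
    """Return the time span covered by one RRA definition, in seconds."""
    kind, _cf, _xff, steps, rows = rra.split(":")
    if kind != "RRA":
        raise ValueError("not an RRA definition: %r" % rra)
    # Each row holds `steps` primary data points of `step` seconds each.
    return step * int(steps) * int(rows)


RRAS = [
    "RRA:AVERAGE:0.5:1:5856",
    "RRA:AVERAGE:0.5:4:20160",
    "RRA:AVERAGE:0.5:40:52704",
]

if __name__ == "__main__":
    for rra in RRAS:
        seconds = rra_retention_seconds(rra)
        print(rra, "covers", seconds, "s =", round(seconds / 86400, 2), "days")
```

With these values the archives cover 87,840 s (about one day at 15-second resolution), 1,209,600 s (14 days at one-minute resolution) and 31,622,400 s (366 days at ten-minute resolution).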

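Graphite-related attributes:

The following attributes make gmetad export the metrics it collects to Graphite.
Graphite is an external open-source tool for storing and visualizing metric data.

carbon_server
Host name or IP address of the remote carbon daemon.
carbon_port
Port of the carbon daemon. Default: 2003
carbon_protocol
Transport protocol. Default: tcp
graphite_prefix (superseded by graphite_path)
graphite_path
User-defined Graphite path. Metrics are organized and referenced by dot-separated paths.
carbon_timeout
Number of milliseconds gmetad waits for a response from the Graphite server.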
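
For readers unfamiliar with carbon, the sketch below shows what a dot-separated metric path and a line in carbon's plaintext format ("path value timestamp") look like. It is an illustration only: the helper names metric_path and carbon_line are hypothetical, and replacing unusual characters with underscores is a choice made for this sketch, not a statement about how gmetad builds its paths.

```python
import re
import time


def metric_path(*parts):
    """Join components into a dot-separated path.

    Characters other than letters, digits, '_' and '-' are replaced by '_'
    so that every component stays a single path element (sketch-only rule).
    """
    return ".".join(re.sub(r"[^A-Za-z0-9_-]", "_", p) for p in parts)


def carbon_line(path, value, timestamp=None):
    """Format one sample in carbon's plaintext format."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


if __name__ == "__main__":
    path = metric_path("ganglia", "my cluster", "node1.example.org", "load_one")
    print(carbon_line(path, 0.42), end="")
```

Because dots separate path levels, a host name such as node1.example.org would otherwise be split into three levels, which is why the sketch flattens it.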

Gmond

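The gmond.conf configuration file
gmond.conf consists of several sections, each enclosed in curly braces. The sections fall roughly into two logical groups: the first deals with host and cluster configuration, the second with the specifics of metric collection and scheduling. Section names and attribute names are case-insensitive.
Some sections are optional and others are required. Some may appear several times in the file, others only once; some sections may also contain subsections.
The sections are globals, cluster, host, udp_send_channel, udp_recv_channel, tcp_accept_channel, sflow, modules, and collection_group; each is described below.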

globals

This section configures general properties of the daemon itself and appears only once in the configuration file. The default globals section in Ganglia 3.7.2 is:

globals {
  daemonize = yes
  setuid = yes
  user = root
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 86400 /*secs. Expires (removes from web interface) hosts in 1 day */
  host_tmax = 20 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  # By default gmond will use reverse DNS resolution when displaying your hostname
  # Uncommenting the following value will override that value.
  # override_hostname = "mywebserver.domain.com"
  # If you are not using multicast this value should be set to something other than 0.
  # Otherwise if you restart aggregator gmond you will get empty graphs. 60 seconds is reasonable
  send_metadata_interval = 15 /*secs */
}

cluster

Each gmond daemon uses the attributes defined in the cluster section to report information about the cluster it belongs to. The default configuration is:

/*
 * The cluster attributes specified will be used as part of the <CLUSTER>
 * tag that will wrap all hosts collected by this instance.
 */
cluster {
  name = "unspecified"
  owner = "nobody"
  latlong = "unspecified"
  url = "unspecified"
}

host

The host section provides information about the host on which gmond runs.

/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}

location: identifies where the host is located.

udp channels

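By default every node in a gmond cluster multicasts its own metric data over UDP to the other nodes and listens for the same multicasts from them.
UDP channels are created with the udp_send_channel and udp_recv_channel sections. A udp_send_channel configuration looks like this (here mcast_join is commented out and the data is sent to a single unicast host instead):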

/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel */
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname. Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  # mcast_join = 239.2.11.71
  host = 192.168.22.100
  port = 8649
  ttl = 1
}

A udp_recv_channel configuration looks like this:

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
  retry_bind = true
  # Size of the UDP buffer. If you are handling lots of metrics you really
  # should bump it up to e.g. 10MB or even higher.
  # buffer = 10485760
}

tcp_accept_channel

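A TCP accept channel is the channel through which a gmond node reports the cluster state to gmetad or other external pollers. Any number of them may be configured. The default configuration is: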

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
  # If you want to gzip XML output
  gzip_output = no
}
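
An external poller can read the cluster state from such a channel with a plain TCP connection. The Python sketch below assumes that gmond writes its XML description as soon as a client connects and then closes the connection, and that gzip_output is left at no; the function name fetch_cluster_xml and the host name in the usage part are placeholders.

```python
import socket


def fetch_cluster_xml(host="localhost", port=8649, timeout=5.0):
    """Connect, read until the peer closes the connection, return the bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:            # empty read: the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Placeholder target; adjust to a host that runs gmond.
    xml_bytes = fetch_cluster_xml("localhost", 8649)
    print(len(xml_bytes), "bytes received")
```

Reading until end of stream avoids having to know the size of the document in advance, which grows with the number of hosts and metrics.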

sflow

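sFlow is an industry-standard technology for monitoring high-speed switched and routed networks. The sflow section is optional. Originally aimed at embedded network hardware, sFlow is now also available for general-purpose operating systems and popular applications such as Tomcat, memcached, and the Apache HTTP Server. gmond can be configured to act as a collector for the sFlow agents on the network, gathering their data and passing it on to gmetad transparently. The default configuration is: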

/* Optional sFlow settings */
#sflow {
# udp_port = 6343
# accept_vm_metrics = yes
# accept_jvm_metrics = yes
# multiple_jvm_instances = no
# accept_http_metrics = yes
# multiple_http_instances = no
# accept_memcache_metrics = yes
# multiple_memcache_instances = no
#}

udp_port (number)

The port on which gmond receives sFlow data.

modules

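This section contains the parameters needed to load modules. Metric modules are dynamically loadable shared object files that extend the set of metrics gmond can collect.
Every modules section must contain at least one module subsection. A module subsection is made up of five attributes: name, language, enabled, path, and params. The default configuration already lists every module available in a default installation, so this section needs no changes unless new modules are added.
The default configuration is: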

modules {
  module {
    name = "core_metrics"
  }
  module {
    name = "cpu_module"
    path = "modcpu.so"
  }
  module {
    name = "disk_module"
    path = "moddisk.so"
  }
  module {
    name = "load_module"
    path = "modload.so"
  }
  module {
    name = "mem_module"
    path = "modmem.so"
  }
  module {
    name = "net_module"
    path = "modnet.so"
  }
  module {
    name = "proc_module"
    path = "modproc.so"
  }
  module {
    name = "sys_module"
    path = "modsys.so"
  }
}

An example configuration:

modules {
  module {
    name = "example_module"
    language = "C/C++"
    enabled = yes
    path = "modexample.so"
    params = "An extra raw parameter"
    param RandomMax {
      value = 75
    }
    param ConstantValue {
      value = 25
    }
  }
}

collection_group

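A collection_group specifies which metrics gmond handles and how often it collects and broadcasts them. Any number of collection_group sections may be defined, and each must contain at least one metric section.
Part of the default configuration is shown below: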

/* This collection group will send general info about this host*/
collection_group {
  collect_every = 60
  time_threshold = 60
  metric {
    name = "cpu_num"
    title = "CPU Count"
  }
  metric {
    name = "cpu_speed"
    title = "CPU Speed"
  }
  metric {
    name = "mem_total"
    title = "Memory Total"
  }
  metric {
    name = "swap_total"
    title = "Swap Space Total"
  }
  metric {
    name = "boottime"
    title = "Last Boot Time"
  }
  metric {
    name = "machine_type"
    title = "Machine Type"
  }
  metric {
    name = "os_name"
    title = "Operating System"
  }
  metric {
    name = "os_release"
    title = "Operating System Release"
  }
  metric {
    name = "location"
    title = "Location"
  }
}

/* This collection group will send the status of gexecd for this host
   every 300 secs.*/
/* Unlike 2.5.x the default behavior is to report gexecd OFF. */
collection_group {
  collect_once = yes
  time_threshold = 300
  metric {
    name = "gexec"
    title = "Gexec Status"
  }
}
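
The interplay between collecting and sending can be modelled with a short Python sketch. It is a deliberately simplified model under stated assumptions, not gmond's scheduler: a metric is sampled every collect_every seconds and sent when time_threshold seconds have passed since the last send, or earlier when the value changes by at least value_threshold (an attribute that does not appear in the excerpt above). The function names are hypothetical, and any jitter gmond may add to its timers is ignored.

```python
def should_send(now, last_sent, time_threshold,
                old_value=None, new_value=None, value_threshold=None):
    """Return True when a freshly collected sample should be sent."""
    if last_sent is None:                       # nothing sent yet
        return True
    if now - last_sent >= time_threshold:       # too long since the last send
        return True
    if (value_threshold is not None and old_value is not None
            and abs(new_value - old_value) >= value_threshold):
        return True                             # value changed noticeably
    return False


def send_times(duration, collect_every, time_threshold):
    """Simulate a constant metric and list the seconds at which it is sent."""
    sent, last_sent = [], None
    for now in range(0, duration + 1, collect_every):
        if should_send(now, last_sent, time_threshold):
            sent.append(now)
            last_sent = now
    return sent


if __name__ == "__main__":
    # Hypothetical group: collect every 20 s, send at least every 90 s.
    print(send_times(200, collect_every=20, time_threshold=90))
```

In this model a value that never changes is still sent periodically, so receivers can tell a quiet metric from a dead host.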

References:

http://www.cnblogs.com/ixdba/p/3981840.html

Book: 《ganglia系统监控》, Chapter 2, Installing and Configuring Ganglia
