Ceph (Luminous) Zabbix monitoring


Goal

Ceph (Luminous) ships with built-in Zabbix monitoring support. This article configures the corresponding Zabbix monitoring.

Notes

Environment: Ceph Luminous, ceph-12.2.0-0.el7.x86_64
The current Zabbix support requires enabling the zabbix mgr module
Monitoring data is produced by Ceph itself and pushed to the Zabbix server in trapper mode
The monitoring targets the overall health of the whole Ceph cluster
It only needs to be enabled on one machine that can reach the ceph mgr service

References

Manual deployment of ceph mgr (Luminous)
Trapper mode: see the zabbix sender configuration guide
Official Zabbix documentation
Official Ceph documentation

ceph zabbix plugin

Reminder: this only needs to be run on one machine in the Ceph cluster that has access to the mgr.

Enable the module

[root@hh-ceph-128040 ~]# ceph mgr module enable zabbix

Configuration

Set the Zabbix server

[root@hh-ceph-128040 ~]# ceph zabbix config-set zabbix_host gx-yun-084044.vclound.com
Configuration option zabbix_host updated

Set the identifier of the monitored host

[root@hh-ceph-128040 ~]# ceph zabbix config-set identifier hh-ceph-128040.vclound.com
Configuration option identifier updated

Set the zabbix_sender path

[root@hh-ceph-128040 ~]# ceph zabbix config-set zabbix_sender /etc/apps/svr/zabbix/bin/zabbix_sender
Configuration option zabbix_sender updated

Set the Zabbix server port

[root@hh-ceph-128040 ~]# ceph zabbix config-set zabbix_port 10051
Configuration option zabbix_port updated

Set the item interval

[root@hh-ceph-128040 ~]# ceph zabbix config-set interval 60
Configuration option interval updated

Show the configuration

[root@hh-ceph-128040 ~]# ceph zabbix config-show
{"zabbix_host": "gx-yun-084044.vclound.com", "identifier": "hh-ceph-128040.vclound.com", "zabbix_sender": "/etc/apps/svr/zabbix/bin/zabbix_sender", "interval": 60, "zabbix_port": 10051}
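The config-show output is compact single-line JSON. A quick way to read it is to pipe it through python3 -m json.tool (a sketch; the JSON literal below is the output captured above, since on a live cluster you would pipe `ceph zabbix config-show` directly):

```shell
# Pretty-print the plugin configuration for readability.
echo '{"zabbix_host": "gx-yun-084044.vclound.com", "identifier": "hh-ceph-128040.vclound.com", "zabbix_sender": "/etc/apps/svr/zabbix/bin/zabbix_sender", "interval": 60, "zabbix_port": 10051}' \
  | python3 -m json.tool
```

This prints each key on its own line, which makes it easy to spot a wrong port or hostname.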

Zabbix server configuration

Template

Location of zabbix_template.xml

[root@hh-ceph-128040 ~]# rpm -ql ceph-mgr | grep xml
/usr/lib64/ceph/mgr/zabbix/zabbix_template.xml

It can also be downloaded directly from GitHub.

Import the template

Note: the template targets zabbix 3.x by default. To import it into zabbix 2.x, edit zabbix_template.xml first:

<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
    <version>2.0</version>     <- change to 2.0 to allow the import
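The version bump can be scripted with sed. A minimal sketch operating on a tiny stand-in file (on a real system you would run the same sed against a copy of /usr/lib64/ceph/mgr/zabbix/zabbix_template.xml; the original version string is assumed to be 3.0 here and may differ):

```shell
# Create a minimal stand-in template so the edit can be demonstrated end to end.
cat > /tmp/zabbix_template.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
    <version>3.0</version>
</zabbix_export>
EOF

# Downgrade the export version so a zabbix 2.x server accepts the import.
sed -i 's|<version>3.0</version>|<version>2.0</version>|' /tmp/zabbix_template.xml
grep '<version>' /tmp/zabbix_template.xml
```

After the sed run, the grep shows the version element now reads 2.0.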

How to import the template
template

Browse to the local template file and click Import.

Add a host

new_host

Assign the template to the host

add_template

Set the allowed hosts in the database

To make sure every trapper item from the template has its allowed hosts set, the most direct approach is to update the database.

See the following example.

Get the host id of the template host

MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [zabbix]> select hostid from hosts where name='ceph-mgr Zabbix module';
+--------+
| hostid |
+--------+
|  10395 |
+--------+
1 row in set (0.00 sec)

Get the corresponding items

MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts from items where hostid=10395;
+--------+-----------------------------------------------+-----------------------------+------+---------------+
| itemid | name                                          | key_                        | type | trapper_hosts |
+--------+-----------------------------------------------+-----------------------------+------+---------------+
|  35793 | Number of Monitors                            | ceph.num_mon                |    2 |               |
|  35794 | Number of OSDs                                | ceph.num_osd                |    2 |               |
|  35795 | Number of OSDs in state: IN                   | ceph.num_osd_in             |    2 |               |
|  35796 | Number of OSDs in state: UP                   | ceph.num_osd_up             |    2 |               |
|  35797 | Number of Placement Groups                    | ceph.num_pg                 |    2 |               |
|  35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp            |    2 |               |
|  35799 | Number of Pools                               | ceph.num_pools              |    2 |               |
|  35800 | Ceph OSD avg fill                             | ceph.osd_avg_fill           |    2 |               |
|  35801 | Ceph backfill full ratio                      | ceph.osd_backfillfull_ratio |    2 |               |
|  35802 | Ceph full ratio                               | ceph.osd_full_ratio         |    2 |               |
|  35803 | Ceph OSD Apply latency Avg                    | ceph.osd_latency_apply_avg  |    2 |               |
|  35804 | Ceph OSD Apply latency Max                    | ceph.osd_latency_apply_max  |    2 |               |
|  35805 | Ceph OSD Apply latency Min                    | ceph.osd_latency_apply_min  |    2 |               |
|  35806 | Ceph OSD Commit latency Avg                   | ceph.osd_latency_commit_avg |    2 |               |
|  35807 | Ceph OSD Commit latency Max                   | ceph.osd_latency_commit_max |    2 |               |
|  35808 | Ceph OSD Commit latency Min                   | ceph.osd_latency_commit_min |    2 |               |
|  35809 | Ceph OSD max fill                             | ceph.osd_max_fill           |    2 |               |
|  35810 | Ceph OSD min fill                             | ceph.osd_min_fill           |    2 |               |
|  35811 | Ceph nearfull ratio                           | ceph.osd_nearfull_ratio     |    2 |               |
|  35812 | Overall Ceph status                           | ceph.overall_status         |    2 |               |
|  35813 | Overal Ceph status (numeric)                  | ceph.overall_status_int     |    2 |               |
|  35814 | Ceph Read bandwidth                           | ceph.rd_bytes               |    2 |               |
|  35815 | Ceph Read operations                          | ceph.rd_ops                 |    2 |               |
|  35816 | Total bytes available                         | ceph.total_avail_bytes      |    2 |               |
|  35817 | Total bytes                                   | ceph.total_bytes            |    2 |               |
|  35818 | Total number of objects                       | ceph.total_objects          |    2 |               |
|  35819 | Total bytes used                              | ceph.total_used_bytes       |    2 |               |
|  35820 | Ceph Write bandwidth                          | ceph.wr_bytes               |    2 |               |
|  35821 | Ceph Write operations                         | ceph.wr_ops                 |    2 |               |
+--------+-----------------------------------------------+-----------------------------+------+---------------+
29 rows in set (0.00 sec)

Set the allowed hosts

Update the table with the IP address of the server where the ceph zabbix module was enabled.

MariaDB [zabbix]> update items set trapper_hosts='10.199.128.40,10.199.128.214,10.199.128.215' where hostid=10395;
Query OK, 29 rows affected (0.00 sec)
Rows matched: 29  Changed: 29  Warnings: 0

MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts from items where hostid=10395;
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
| itemid | name                                          | key_                        | type | trapper_hosts                               |
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
|  35793 | Number of Monitors                            | ceph.num_mon                |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35794 | Number of OSDs                                | ceph.num_osd                |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35795 | Number of OSDs in state: IN                   | ceph.num_osd_in             |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35796 | Number of OSDs in state: UP                   | ceph.num_osd_up             |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35797 | Number of Placement Groups                    | ceph.num_pg                 |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp            |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35799 | Number of Pools                               | ceph.num_pools              |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35800 | Ceph OSD avg fill                             | ceph.osd_avg_fill           |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35801 | Ceph backfill full ratio                      | ceph.osd_backfillfull_ratio |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35802 | Ceph full ratio                               | ceph.osd_full_ratio         |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35803 | Ceph OSD Apply latency Avg                    | ceph.osd_latency_apply_avg  |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35804 | Ceph OSD Apply latency Max                    | ceph.osd_latency_apply_max  |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35805 | Ceph OSD Apply latency Min                    | ceph.osd_latency_apply_min  |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35806 | Ceph OSD Commit latency Avg                   | ceph.osd_latency_commit_avg |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35807 | Ceph OSD Commit latency Max                   | ceph.osd_latency_commit_max |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35808 | Ceph OSD Commit latency Min                   | ceph.osd_latency_commit_min |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35809 | Ceph OSD max fill                             | ceph.osd_max_fill           |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35810 | Ceph OSD min fill                             | ceph.osd_min_fill           |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35811 | Ceph nearfull ratio                           | ceph.osd_nearfull_ratio     |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35812 | Overall Ceph status                           | ceph.overall_status         |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35813 | Overal Ceph status (numeric)                  | ceph.overall_status_int     |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35814 | Ceph Read bandwidth                           | ceph.rd_bytes               |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35815 | Ceph Read operations                          | ceph.rd_ops                 |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35816 | Total bytes available                         | ceph.total_avail_bytes      |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35817 | Total bytes                                   | ceph.total_bytes            |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35818 | Total number of objects                       | ceph.total_objects          |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35819 | Total bytes used                              | ceph.total_used_bytes       |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35820 | Ceph Write bandwidth                          | ceph.wr_bytes               |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
|  35821 | Ceph Write operations                         | ceph.wr_ops                 |    2 | 10.199.128.40,10.199.128.214,10.199.128.215 |
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
29 rows in set (0.00 sec)

Note: the example above merely shows that multiple trapper allowed hosts can be listed; in practice only the one server's IP address is needed.

Verify the trapper

See the screenshot below.
Open the newly added host in Zabbix, open one of the ceph items, and confirm that Type = Zabbix trapper and Allowed hosts = the IP address you set in the database.
trapper

Ceph cron job

Use a cron job to push the Ceph monitoring data automatically once per minute.

[root@hh-ceph-128040 ~]# cat /etc/cron.d/ceph
*/1 * * * * root ceph zabbix send
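Files under /etc/cron.d use the system crontab format: five schedule fields, then a user, then the command. The entry above can be sanity-checked with a regular expression before deploying (a sketch; it only validates the overall shape, not the field values):

```shell
# Validate the system-crontab entry shape: five schedule fields, a user, a command.
line='*/1 * * * * root ceph zabbix send'
field='(\*|\*/[0-9]+|[0-9][0-9,/-]*)'
if echo "$line" | grep -Eq "^$field $field $field $field $field [A-Za-z_][A-Za-z0-9_-]* .+$"; then
    echo "cron entry OK"
else
    echo "cron entry malformed"
fi
```

A common mistake is reusing a user crontab line here; without the `root` user field, cron would try to run a command named `ceph` as the literal user `*/1`'s command string and the job silently fails.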

Monitoring screenshots

Ceph pool free space
storage

Ceph IO
io

Ceph bandwidth
bandwidth

Ceph OSD latency
lantency
