Installing Calamari on CentOS 7

Source: Internet  Editor: 程序博客网  Time: 2024/05/22 07:51

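Ceph is an open-source software-defined storage (SDS) system. For open-source software, getting it installed is only the first step; monitoring and operations afterward are what really matter. To see a cluster's running state at a glance you need monitoring software, and the tools most commonly used to monitor Ceph include Zabbix, inkScope, and Calamari. This article walks through installing Calamari on CentOS 7 in detail.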

Get the Calamari source packages

#> git clone https://github.com/ceph/calamari.git
#> git clone https://github.com/ceph/calamari-clients.git
#> git clone https://github.com/ceph/Diamond

Build the calamari server RPM package

#> cd calamari
#> yum remove prelink    // avoids the cpio mismatch error during installation
#> ./build-rpm.sh

When the build finishes, the RPM packages are generated under the rpmbuild directory in the parent directory.

Install the calamari server

#> cd ..    // leave the calamari directory and return to its parent
#> yum localinstall rpmbuild/RPMS/x86_64/calamari-server-1.3.1.1-101_g945d16a.el6.x86_64.rpm
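
The RPM file name embeds a version and a git hash (1.3.1.1-101_g945d16a here), so it will likely differ from build to build. A minimal sketch, assuming the rpmbuild/RPMS/x86_64 layout shown above, that picks up the newest calamari-server package instead of hardcoding its name. The helper name find_calamari_rpm is hypothetical, not part of Calamari:

```shell
# find_calamari_rpm DIR: print the most recently modified calamari-server RPM in DIR.
# Returns non-zero if nothing matches.
find_calamari_rpm() {
    rpm_dir=$1
    # ls -t sorts by modification time, newest first
    newest=$(ls -t "$rpm_dir"/calamari-server-*.rpm 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Example (run from the parent directory of calamari):
#   yum localinstall "$(find_calamari_rpm rpmbuild/RPMS/x86_64)"
```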

Build and install the calamari client

Install dependencies

#> yum install npm ruby rubygems ruby-devel
#> npm install -g grunt grunt-cli bower grunt-contrib-compass
#> gem update --system && gem install compass

If updating from the gem source fails because of network problems, switch sources as follows:

#> gem sources
#> gem sources -r https://rubygems.org/
#> gem sources -a https://ruby.taobao.org/
#> gem sources -u

Compile and install the calamari client

#> cd calamari-clients
#> make build-real
#> make dist     // generates the calamari-clients_1.2.2.tar.gz tarball in the parent directory
#> cd ..         // return to the parent directory of calamari-clients
#> tar -zxvf calamari-clients_1.2.2.tar.gz     // extract
#> mkdir -p /opt/calamari/webapp/content       // create the directory
#> cd calamari-clients-1.2.2  // copy the contents into the directory created above
#> for dir in manage admin login dashboard
   do
       mkdir -p /opt/calamari/webapp/content/"$dir"
       cp -pr "$dir"/dist/* /opt/calamari/webapp/content/"$dir"/
   done
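
If one of the four modules failed to build, the copy loop above stops partway and leaves an incomplete web root. A minimal sketch of the same copy step that first checks every dist directory exists; the function name deploy_webapp is hypothetical, and the module names and target path are the ones used above:

```shell
# deploy_webapp SRC DEST: copy each module's dist/ output from SRC into DEST/<module>/.
# Stops with an error if a module was not built, instead of copying a partial tree.
deploy_webapp() {
    src=$1
    dest=$2
    for dir in manage admin login dashboard; do
        if [ ! -d "$src/$dir/dist" ]; then
            echo "missing build output: $src/$dir/dist" >&2
            return 1
        fi
    done
    for dir in manage admin login dashboard; do
        mkdir -p "$dest/$dir"
        cp -pr "$src/$dir"/dist/* "$dest/$dir"/
    done
}

# Example:
#   deploy_webapp calamari-clients-1.2.2 /opt/calamari/webapp/content
```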

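If make build-real fails to download a dependency because of network problems, as shown below, replace the download address in the corresponding file with a reachable URL. For example: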

phantomjs@1.9.18 install /datapool/calamari-clients/manage/node_modules/karma-phantomjs-launcher/node_modules/phantomjs   // directory containing install.js
> node install.js
Downloading https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.8-linux-x86_64.tar.bz2
Saving to /datapool/calamari-clients/manage/node_modules/karma-phantomjs-launcher/node_modules/phantomjs/phantomjs/phantomjs-1.9.8-linux-x86_64.tar.bz2
Receiving...
Error making request.
Error: connect ETIMEDOUT    // download timed out because of the GFW
    at errnoException (net.js:905:11)
    at Object.afterConnect [as oncomplete] (net.js:896:19)

The error above says that downloading the dependency phantomjs-1.9.8-linux-x86_64.tar.bz2 failed. Replace the download address in install.js as follows (the Taobao mirror is used here):

var cdnUrl = process.env.npm_config_phantomjs_cdnurl || process.env.PHANTOMJS_CDNURL || 'http://npm.taobao.org/mirrors/phantomjs'
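
The line above checks PHANTOMJS_CDNURL (and the npm_config_phantomjs_cdnurl setting) before falling back to the hardcoded URL. Assuming the install.js on your machine already contains that lookup, the mirror can probably be supplied through the environment instead of editing the file. A sketch:

```shell
# Point the phantomjs installer at a mirror through the environment.
# The URL is the Taobao mirror used in this article; substitute any reachable mirror.
PHANTOMJS_CDNURL='http://npm.taobao.org/mirrors/phantomjs'
export PHANTOMJS_CDNURL

# Then rerun the build in the same shell:
#   make build-real
```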

Initialize calamari

After the steps above, the calamari server and calamari client are installed. Before using calamari for the first time, initialize it as follows:

#> calamari-ctl initialize

If a service restart hangs during initialization, upgrade supervisor to 3.0 or later:

#> git clone https://github.com/Supervisor/supervisor.git
#> cd supervisor && python setup.py install
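
Before upgrading, it can help to confirm whether the installed supervisor is actually older than 3.0. A minimal sketch of a version comparison, assuming GNU sort (for the -V option) is available; version_ge is a hypothetical helper name:

```shell
# version_ge A B: succeed if version A >= version B (uses GNU sort -V).
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example, assuming supervisord is on PATH:
#   version_ge "$(supervisord --version)" 3.0 || echo "supervisor is older than 3.0"
```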

Configure the calamari server

Configure the firewall

### for salt-master
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 4505 -j ACCEPT
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 4506 -j ACCEPT
### for carbon
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 2003 -j ACCEPT
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 2004 -j ACCEPT
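
The four rules differ only in the port number, so they can be generated from a list. A small sketch that prints the commands for review rather than running them directly; print_calamari_rules is a hypothetical helper name:

```shell
# print_calamari_rules: emit one iptables command per port that Calamari needs open.
# 4505/4506 are for salt-master, 2003/2004 for carbon (as listed above).
print_calamari_rules() {
    for port in 4505 4506 2003 2004; do
        echo "iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport $port -j ACCEPT"
    done
}

print_calamari_rules          # review the output first
# print_calamari_rules | sh   # then apply as root
```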

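Configure saltstack authentication

Once the salt-minion service starts on a ceph node, it automatically requests authentication from salt-master. On the Calamari server, list the salt-minion keys with the following command: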

#> salt-key -L

Ceph nodes whose salt-minion service has just started appear under the Unaccepted Keys list. For Calamari to manage the ceph nodes through saltstack, these keys must be accepted:

#> salt-key -A
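
salt-key -A accepts every pending key at once. To see exactly which minions are waiting before accepting them, the output of salt-key -L can be filtered. A sketch, assuming the output consists of section headers ending in "Keys:" followed by one minion ID per line; unaccepted_keys is a hypothetical helper and ceph-osd1 is an example ID:

```shell
# unaccepted_keys: read `salt-key -L` output on stdin and print the minion IDs
# listed under "Unaccepted Keys:" (assumes each section starts with a "... Keys:" header).
unaccepted_keys() {
    awk '
        /Keys:$/ { in_section = ($0 == "Unaccepted Keys:"); next }
        in_section && NF { print $1 }
    '
}

# Example: list pending minions, then accept one by name
#   salt-key -L | unaccepted_keys
#   salt-key -a ceph-osd1
```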

Install diamond and salt-minion

Build the diamond RPM package

#> cd Diamond
#> git checkout origin/calamari
#> make rpm      // generates diamond-3.4.67-0.noarch.rpm in the dist directory

Install salt-minion and diamond on all ceph nodes

First copy the diamond RPM package you just built to every ceph node, then run the following commands to install the packages:

#> yum localinstall diamond-3.4.67-0.noarch.rpm
#> yum install -y salt-minion

Configure and start salt-minion on all ceph nodes

#> touch /etc/salt/minion.d/calamari.conf
### calamari-server-name is the address (IP or domain name) of the calamari server
#> echo "master: {calamari-server-name}" > /etc/salt/minion.d/calamari.conf    ### note the space between ":" and the address
#> echo "master: {calamari-server-name}" >> /etc/salt/minion
#> service salt-minion restart
#> service diamond start
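
Because the second echo above appends with >>, running the steps again adds another master line to /etc/salt/minion. A minimal sketch of a rerunnable variant that leaves exactly one master line in a file; set_salt_master is a hypothetical helper and calamari.example.com is a placeholder address:

```shell
# set_salt_master FILE HOST: make FILE contain exactly one "master: HOST" line.
# Any existing "master:" line is dropped first, so rerunning does not pile up duplicates.
set_salt_master() {
    file=$1
    host=$2
    touch "$file"
    grep -v '^master:' "$file" > "$file.tmp" || true
    echo "master: $host" >> "$file.tmp"   # note the space after the colon
    mv "$file.tmp" "$file"
}

# Example:
#   set_salt_master /etc/salt/minion.d/calamari.conf calamari.example.com
```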

If diamond fails to start and its log shows the following error:

#> tail -f /var/log/diamond/diamond.log
[2015-11-03 19:06:35,044] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load

run the diamond service as root by making the following change:

#> echo "user=root,group=root" >> /etc/diamond/diamond.conf

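At this point calamari monitoring is installed and configured, and you can view the state of the ceph cluster in the web interface. If your luck happens to run out, read on to the troubleshooting notes below!

Pitfalls I ran into

Q: The following error appears in the diamond log: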

-- Unit diamond.service has begun starting up.
Nov 03 18:46:27 ceph-osd2 diamond[3790]: Failed to acquire lockfile: /var/lock/subsys/diamond.
Nov 03 18:46:27 ceph-osd2 diamond[3790]: Held by 14377
Nov 03 18:46:27 ceph-osd2 diamond[3790]: [FAILED]
Nov 03 18:46:27 ceph-osd2 systemd[1]: diamond.service: control process exited, code=exited status=1
Nov 03 18:46:27 ceph-osd2 systemd[1]: Failed to start LSB: System statistics collector for Graphite.

Deleting the files under the /var/lock/subsys directory fixes it:

#> rm -f /var/lock/subsys/*
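
The error message names the PID holding the lock (14377 in the log above). A cautious sketch that removes only the diamond lockfile, and only when that PID is no longer running, rather than clearing every lockfile in the directory. It assumes it is run as root; clear_stale_lock is a hypothetical helper name:

```shell
# clear_stale_lock LOCKFILE PID: remove LOCKFILE only if process PID is not running.
clear_stale_lock() {
    lockfile=$1
    pid=$2
    # kill -0 sends no signal; it only tests whether the process exists
    if kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; leaving $lockfile in place" >&2
        return 1
    fi
    rm -f "$lockfile"
}

# Example, using the PID from the log above:
#   clear_stale_lock /var/lock/subsys/diamond 14377
```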

Q: Opening the web page returns a 500 error
1) The cthulhu-manager process may not be running; restarting supervisor fixes it

#> /usr/bin/python /usr/bin/supervisord -c /etc/supervisord.conf

2) The process may lack permission to write its logs under /var/log/calamari/*

#> chmod -R 766 /var/log/calamari

Q: The /dashboard/ page always reports internal server error (5), with the following errors in the logs:

#> vi /var/log/calamari/calamari.log
2015-11-05 20:25:39,252 - ERROR - django.request Internal Server Error: /api/v1/cluster/4a4dd60f-c8bb-4982-a1b4-9b891f78c30b/osd
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.6/site-packages/django/core/handlers/base.py", line 117, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/rest_framework/viewsets.py", line 78, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/rpc_view.py", line 94, in dispatch
    return super(RPCViewSet, self).dispatch(request, *args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/rest_framework/views.py", line 399, in dispatch
    response = self.handle_exception(exc)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/rpc_view.py", line 111, in handle_exception
    return super(RPCViewSet, self).handle_exception(exc)
  File "/opt/calamari/venv/lib/python2.6/site-packages/rest_framework/views.py", line 396, in dispatch
    response = handler(request, *args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/v1.py", line 423, in get
    osds, osds_by_pg_state = self.generate(pg_summary, osd_map, server_info, servers)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/v1.py", line 371, in generate
    for osd_id, osd_pg_summary in pg_summary['by_osd'].items():
TypeError: 'NoneType' object is unsubscriptable

#> vi /var/log/calamari/cthulhu.log
2015-11-04 17:38:59,278 - ERROR - cthulhu Exception handling message with tag ceph/cluster/4a4dd60f-c8bb-4982-a1b4-9b891f78c30b
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 244, in _run
    self.on_heartbeat(data['id'], data['data'])
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/gevent_util.py", line 35, in wrapped
    return func(*args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 346, in on_heartbeat
    cluster_data['versions'][sync_type.str])
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 99, in on_version
    self.fetch(reported_by, sync_type)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 109, in fetch
    client = LocalClient(config.get('cthulhu', 'salt_config_path'))
  File "/usr/lib/python2.6/site-packages/salt/client/__init__.py", line 136, in __init__
    listen=not self.opts.get('__worker', False))
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 114, in get_event
    return MasterEvent(sock_dir, opts)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 559, in __init__
    super(MasterEvent, self).__init__('master', sock_dir, opts)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 181, in __init__
    self.get_event(wait=1)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 410, in get_event
    ret = self._get_event(wait, tag, tags_regex)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 351, in _get_event
    socks = dict(self.poller.poll(wait * 1000))
  File "/opt/calamari/venv/lib/python2.6/site-packages/zmq/green/poll.py", line 81, in poll
    select.select(rlist, wlist, xlist)
  File "/opt/calamari/venv/lib/python2.6/site-packages/gevent/select.py", line 68, in select
    result.event.wait(timeout=timeout)
  File "/opt/calamari/venv/lib/python2.6/site-packages/gevent/event.py", line 77, in wait
    result = self.hub.switch()
  File "/opt/calamari/venv/lib/python2.6/site-packages/gevent/hub.py", line 337, in switch
    switch_out()
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/gevent_util.py", line 15, in asserter
    raise ForbiddenYield("Context switch during `nosleep` region!")

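This happens because saltstack (salt, salt-master, salt-minion) is incompatible with calamari. CentOS 7 installs salt-2015.5.5 by default; to fix the problem, uninstall that version and install a salt-2014.1.x release instead. I installed salt-2014.1.13-1 and it works correctly. The salt packages can be downloaded from: http://rpmfind.net/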
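
To check quickly whether a node is running a salt release from the 2014.1.x series, the version string can be matched against that prefix. A minimal sketch; salt_version_ok is a hypothetical helper, and the rpm query in the example assumes salt was installed from an RPM package named salt:

```shell
# salt_version_ok VERSION: succeed only for the 2014.1.x series reported to work above.
salt_version_ok() {
    case $1 in
        2014.1.*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example:
#   salt_version_ok "$(rpm -q --qf '%{VERSION}' salt)" || echo "incompatible salt version"
```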

Q: The following error appears in the diamond log:

#> tail -f /var/log/diamond/diamond.log
GraphiteHandler: Failed to connect to 10.168.122.165:2003. timed out.

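1) It may be a firewall problem; check the firewall settings
2) It may be a routing problem; check the routing settings
3) The cthulhu-manager process on the server may not be running; start it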
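
To narrow down which of the three causes applies, first test from the ceph node whether the carbon port on the server accepts TCP connections at all. A sketch assuming bash and the coreutils timeout command are available; port_open is a hypothetical helper name:

```shell
# port_open HOST PORT: succeed if a TCP connection can be opened within 3 seconds.
# Uses bash's /dev/tcp pseudo-device and coreutils timeout.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, run from a ceph node against the address in the log above:
#   port_open 10.168.122.165 2003 && echo reachable || echo "blocked or not listening"
```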

Q: Installing salt-minion reports the following error:

Error: Package: python-msgpack-0.4.6-1.el6.x86_64 (epel)
           Requires: python(abi) = 2.6
           Installed: python-2.7.5-16.el7.x86_64 (@anaconda)
               python(abi) = 2.7
               python(abi) = 2.7

Install python2.6 alongside the system python, then specify python 2.6 when installing salt-minion:

configure --with-python2.6=/usr/local/python2.6