OpenStack High Availability: Base Environment Configuration

Source: Internet | Editor: 程序博客网 | Date: 2024/05/18 03:21

Author: Wu Yeliang, cloud computing engineer
Blog: http://blog.csdn.net/wylfengyujiancheng

I. Operating system configuration
1.1. Preparation
Install CentOS 7 on both nodes, ha-node1 and ha-node2:

192.168.8.51 ha-node1
192.168.8.52 ha-node2

Set the hostname on each node.
Node 1:

# hostnamectl set-hostname ha-node1
# su -l

Node 2:

# hostnamectl set-hostname ha-node2
# su -l

1.2. Disable SELinux and the firewall (run on every node)

# setenforce 0
# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
# systemctl disable firewalld.service
# systemctl stop firewalld.service
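The effect of the SELinux sed edit can be previewed safely on a throwaway copy of the config file before touching the real one; a minimal sketch (the temporary file and its contents are illustrative):

```shell
# Work on a throwaway copy instead of the real /etc/selinux/config.
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"

# The same substitution the step above applies, keeping a .bak backup.
sed -i.bak 's/SELINUX=enforcing/SELINUX=permissive/g' "$cfg"

grep '^SELINUX=' "$cfg"    # prints: SELINUX=permissive
rm -f "$cfg" "$cfg.bak"
```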

1.3. Configure the hosts file

# echo '192.168.8.51 ha-node1' >> /etc/hosts
# echo '192.168.8.52 ha-node2' >> /etc/hosts
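If more nodes are added later, the hosts entries can be generated in a loop rather than repeated by hand; this sketch writes to a temporary file instead of the real /etc/hosts:

```shell
# Node list as "IP hostname" pairs; extend this list as nodes are added.
# Writes to a temporary file here instead of the real /etc/hosts.
hosts=$(mktemp)
for entry in '192.168.8.51 ha-node1' '192.168.8.52 ha-node2'; do
    echo "$entry" >> "$hosts"
done
cat "$hosts"
rm -f "$hosts"
```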

1.4. Configure NTP (run on every node; 10.239.41.128 is the NTP server)

# chkconfig chronyd off
# chkconfig ntpd on
# sed -i "/^server\ 3.centos.pool/a server\ 10.239.41.128" /etc/ntp.conf
# service ntpd start
# ntpq -p
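The sed call above appends the local NTP server right after the matching pool entry. Its effect can be previewed on a small stand-in for ntp.conf (the file contents are illustrative; GNU sed syntax is assumed):

```shell
# A minimal stand-in for /etc/ntp.conf.
conf=$(mktemp)
printf 'server 2.centos.pool.ntp.org iburst\nserver 3.centos.pool.ntp.org iburst\n' > "$conf"

# Append the local NTP server right after the "server 3.centos.pool" line,
# as the sed call above does.
sed -i.bak '/^server 3.centos.pool/a server 10.239.41.128' "$conf"

cat "$conf"
rm -f "$conf" "$conf.bak"
```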

II. Install the cluster software
2.1. Install the packages

# yum install -y pacemaker pcs psmisc policycoreutils-python

Start the pcsd service and enable it at boot:

# systemctl start pcsd.service
# systemctl enable pcsd.service

2.2. Set the password for the hacluster user

# ssh ha-node2 'echo redhat1 | passwd --stdin hacluster'
# echo redhat1 | passwd --stdin hacluster

Note: redhat1 is the password chosen for the hacluster user.
III. Start the cluster software
3.1. Authenticate the nodes and create the cluster (unset any proxy environment variables first)

# pcs cluster auth ha-node1 ha-node2
# pcs cluster setup --name mycluster ha-node1 ha-node2

3.2. Start the cluster

[root@ha-node1 ~]# pcs cluster start --all
ha-node1: Starting Cluster...
ha-node2: Starting Cluster...

3.3. Verify the corosync installation

[root@ha-node1 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.8.51
        status  = ring 0 active with no faults

3.4. Check the cluster members

# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.8.51)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.8.52)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 2
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
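For scripting, the member list can be reduced to just the node addresses with a little awk; this sketch runs against a captured sample of the output rather than a live cluster:

```shell
# Sample of the corosync-cmapctl output shown above (two joined members).
sample='runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.8.51)
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.8.52)'

# Split on parentheses; the member address is the sixth field.
echo "$sample" | awk -F'[()]' '/\.ip /{print $6}'
# prints:
# 192.168.8.51
# 192.168.8.52
```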

3.5. Check corosync status

# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 ha-node1 (local)
         2          1 ha-node2

3.6. Check the pacemaker installation

# ps axf
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
...lots of processes...
 1362 ?        Ssl    0:35 corosync
 1379 ?        Ss     0:00 /usr/sbin/pacemakerd -f
 1380 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
 1381 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
 1382 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
 1383 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
 1384 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
 1385 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd

3.7. Check pcs status

[root@ha-node1 ~]# pcs status
Cluster name: mycluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Tue Dec 16 16:15:29 2014
Last change: Tue Dec 16 15:49:47 2014
Stack: corosync
Current DC: ha-node2 (2) - partition with quorum
Version: 1.1.12-a14efad
2 Nodes configured
0 Resources configured

Online: [ ha-node1 ha-node2 ]

Full list of resources:

PCSD Status:
  ha-node1: Online
  ha-node2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
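Scripts that gate on cluster health often only need the "Online:" line; it can be extracted with sed, shown here against a captured sample of the output rather than a live cluster:

```shell
# Captured sample of pcs status output (abridged).
status='Cluster name: mycluster
Current DC: ha-node2 (2) - partition with quorum
Online: [ ha-node1 ha-node2 ]'

# Strip the "Online: [ ... ]" wrapper to leave a bare node list.
echo "$status" | sed -n 's/^Online: \[ \(.*\) \]$/\1/p'
# prints: ha-node1 ha-node2
```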

3.8. Check the system logs for errors (stonith errors can be ignored)

# journalctl | grep -i error
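Since stonith is disabled in this setup, fencing errors are expected noise; a second grep can drop them. The log lines below are illustrative samples, not real cluster output:

```shell
# Illustrative sample log lines (not real cluster output).
log='Dec 16 crmd: error: te_fence_node: fencing of ha-node2 failed
Dec 16 sshd: info: session opened for user root
Dec 16 cib: error: cib_perform_op: bad update'

# Keep error lines but drop stonith/fencing noise, the same filter idea
# as "journalctl | grep -i error" with stonith excluded.
echo "$log" | grep -i error | grep -viE 'stonith|fenc'
# prints: Dec 16 cib: error: cib_perform_op: bad update
```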

IV. Configure the cluster (on any one node)
4.1. Cluster properties
Quorum policy (a two-node cluster cannot keep quorum after one node fails, so ignore quorum loss):

# pcs property set no-quorum-policy=ignore

Migrate services off a node after a failure:

# pcs resource defaults migration-threshold=1

Since the two nodes have no STONITH device, disable fencing:

# pcs property set stonith-enabled=false

After node1 recovers, keep resources from migrating back from node2 (bouncing resources back and forth disrupts the service):

# pcs resource defaults resource-stickiness=100
# pcs resource defaults

Set the default operation timeout and policy-engine history limits:

# pcs resource op defaults timeout=90s
# pcs resource op defaults
# pcs property set pe-warn-series-max=1000 \
    pe-input-series-max=1000 \
    pe-error-series-max=1000 \
    cluster-recheck-interval=5min

Verify the configuration; no output means it is valid:

# crm_verify -L -V

4.2. Configure the floating IP

# pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.8.53 cidr_netmask=24 op monitor interval=30s

vip is a user-chosen name for the cluster IP resource, and the resource is monitored every 30 seconds.
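As a quick sanity check on the cidr_netmask=24 value, the network the VIP falls into can be computed in plain shell arithmetic (an illustrative sketch only; pcs does not require this step):

```shell
# Split the VIP into octets (POSIX shell, no external tools).
ip='192.168.8.53'; prefix=24
oldIFS=$IFS; IFS=.; set -- $ip; IFS=$oldIFS
a=$1; b=$2; c=$3; d=$4

# Pack the octets, mask off the host bits, then unpack the network address.
addr=$(( (a << 24) + (b << 16) + (c << 8) + d ))
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
net=$(( addr & mask ))
echo "$(( net >> 24 & 255 )).$(( net >> 16 & 255 )).$(( net >> 8 & 255 )).$(( net & 255 ))"
# prints: 192.168.8.0
```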
V. Cluster operation commands
5.1. Verify the cluster installation

# pacemakerd -F                       ## show pacemaker components (or: ps axf | grep pacemaker)
# corosync-cfgtool -s                 ## show the corosync ring status
# corosync-cmapctl | grep members     ## corosync 2.3.x
# corosync-objctl | grep members      ## corosync 1.4.x

5.2. Inspect cluster resources

# pcs resource standards      ## list supported resource standards
# pcs resource providers      ## list resource providers
# pcs resource agents         ## list all resource agents
# pcs resource list           ## list supported resources
# pcs stonith list            ## list supported fence agents
# pcs property list --all     ## show all cluster properties, including defaults
# crm_simulate -sL            ## show resource score values

5.3. Stage changes in a configuration file

# pcs cluster cib ra_cfg          ## save the cluster resource configuration to a file
# pcs -f ra_cfg resource create   ## create a resource in that file (not in the running configuration)
# pcs -f ra_cfg resource show     ## review the file's configuration; once it looks right:
# pcs cluster cib-push ra_cfg     ## push the file into the running configuration

5.4. STONITH device operations

# stonith_admin -I                  ## list available fence agents
# stonith_admin -M -a agent_name    ## show a fence agent's metadata, e.g. stonith_admin -M -a fence_vmware_soap
# stonith_admin --reboot nodename   ## test the STONITH device

5.5. View the cluster configuration

# crm_verify -L -V    ## check the configuration for errors
# pcs property        ## show cluster properties
# pcs stonith         ## show stonith status
# pcs constraint      ## show resource constraints
# pcs config          ## show the full cluster configuration
# pcs cluster cib     ## dump the cluster configuration as XML

5.6. Manage the cluster

# pcs status                          ## show cluster status
# pcs status cluster
# pcs status corosync
# pcs cluster stop [node11]           ## stop the cluster (optionally on one node)
# pcs cluster start --all             ## start the cluster on all nodes
# pcs cluster standby node11          ## put a node in standby; pcs cluster unstandby node11 reverses it
# pcs cluster destroy [--all]         ## destroy the cluster; --all also removes corosync.conf everywhere
# pcs resource cleanup ClusterIP      ## clear a resource's status and failure counts
# pcs stonith cleanup vmware-fencing  ## clear a fence resource's status and failure counts

References:
https://docs.openstack.org/ha-guide/controller-ha-pacemaker.html

http://clusterlabs.org/
