postgres pgpool PIRT online recovery steps

来源：互联网发布：新开淘宝店如何进货编辑：程序博客网时间：2024/06/06 00:51

postgres+pgpool实现在线恢复，实际上说白了就是手动恢复变成自动恢复。通过调用命令和脚本，pgpool控制多个node实现备份，恢复，重启，加载。

background

postgres 8.4

pgpool-II 3.4.2

复制模式+load balance

基于PIRT 实现在线恢复

安装好c语言函数

postgres实例都以archmod=on 模式运行

两台机器要实现postgres 用户ssh 无密码互访问

配置好pcp

配置好脚本，命令（脚本和命令参见pgpool手册，不过要小心调试）

from 192.168.0.10

to 192.168.0.11

1,准备，查看状态

#su - postgres

-bash-4.1$ psql -p 9999 -h192.168.0.10
Password:
psql (8.4.20)
Type "help" for help.

postgres=# show pool_nodes;
node_id |    hostname    | port | status | lb_weight | role
---------+----------------+------+--------+-----------+--------
0       | 192.168.0.10   | 5433 | 2      | 0.500000 | master
1       | 192.168.0.11 | 5433 | 2      | 0.500000 | slave
(2 rows)

2，打开日志

[root@iZ2864nri9hZ ~]# tail -f /var/log/pgpool/pgpool.log

[root@iZ2864nri9hZ ~]# cd /data/
[root@iZ2864nri9hZ data]# tail -f archive/recovery.log

3，停掉192.168.0.10 的postgres，并删除实例

[root@iZ28bh1trh9Z data]# ./postgrestop.sh
[root@iZ28bh1trh9Z data]# ps -ef |grep postgres
root 26503 26432 0 09:23 pts/1 00:00:00 grep postgres

postgres=# show pool_nodes;
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# show pool_nodes;
node_id |    hostname    | port | status | lb_weight | role
---------+----------------+------+--------+-----------+--------
0       | 192.168.0.10   | 5433 | 2      | 0.500000 | master
1       | 192.168.0.11 | 5433 | 3      | 0.500000 | slave
(2 rows)

这里看到 pgpool连接会断掉，并且 detatch 1 node 为 status=3

同时，如果有web程序连接pgpool,连接也会断，不过web程序会自动重连，重连需要几秒钟，应该是可以配置的？？

删除 pgdata_backup, pgdata.tar.gz

备份pgdata

root@iZ28bh1trh9Z data]# rm -rf pgdata_backup
[root@iZ28bh1trh9Z data]# mv pgdata pgdata_backup
[root@iZ28bh1trh9Z data]# rm pgdata.tar.gz

4,开始 online recovery

退出 pgpool的登录

postgres=# \q

运行命令

[root@iZ2864nri9hZ data]# pcp_recovery_node -d 50 localhost 9898 postgres postgres 1
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="c", len=20, data=CommandComplete
DEBUG: send: tos="X", len=4

运行过程中要查看

/var/log/pgpool/pgpool.log

/data/archive/recovery.log

5，结束检查

[root@iZ28bh1trh9Z data]# pwd
/alidata/data
[root@iZ28bh1trh9Z data]# ls
checkpgpool.sh pgbackup.sh pgdata pgdata_backup pgdata.tar.gz pgpoolstart.sh pgstartup.log postgrestart.sh postgrestop.sh

[root@iZ28bh1trh9Z data]# ps -ef |grep postgres
root     26550 1020 0 09:38 ?        00:00:00 sshd: postgres [priv]
postgres 26552 26550 0 09:38 ?        00:00:00 sshd: postgres@notty
postgres 26556     1 0 09:38 ?        00:00:00 /usr/bin/postgres -D /data/pgdata
postgres 26561 26556 0 09:38 ?        00:00:00 postgres: logger process
postgres 26577 26556 0 09:38 ?        00:00:00 postgres: writer process
postgres 26578 26556 0 09:38 ?        00:00:00 postgres: wal writer process
postgres 26579 26556 0 09:38 ?        00:00:00 postgres: autovacuum launcher process
postgres 26580 26556 0 09:38 ?        00:00:00 postgres: archiver process   last was 00000002.history
postgres 26581 26556 0 09:38 ?        00:00:00 postgres: stats collector process
postgres 26594 26556 0 09:40 ?        00:00:00 postgres: postgres postgres 192.168.0.10 (35555) idle
root     26609 26432 0 09:48 pts/1    00:00:00 grep postgres

-bash-4.1$ psql -p 9999 -h 192.168.0.10
Password:
psql (8.4.20)
Type "help" for help.

postgres=# show pool_nodes;
node_id |    hostname    | port | status | lb_weight | role
---------+----------------+------+--------+-----------+--------
0       | 192.168.0.10   | 5433 | 2      | 0.500000 | master
1       | 192.168.0.11 | 5433 | 2      | 0.500000 | slave
(2 rows)

可以看到两个node都已经启动

6，清理现场

0 0