PostgreSQL 10 流式物理、逻辑主从 最佳实践
来源:互联网 发布:辐射4男主角捏脸数据 编辑:程序博客网 时间:2024/06/07 10:27
PostgreSQL 10 流式物理、逻辑主从 最佳实践
作者
digoal
日期
2017-07-11
标签
PostgreSQL , 流复制 , 主从 , 逻辑订阅
背景
流复制起源
PostgreSQL 自从2010年推出的9.0版本开始,支持流式物理复制,用户可以通过流式复制,构建只读备库(主备物理复制,块级别一致)。流式物理复制可以做到极低的延迟(通常在1毫秒以内)。
同步流复制
2011年推出的9.1版本,支持同步复制,当时只支持一个同步流复制备节点(例如配置了3个备,只有一个是同步模式的,其他都是异步模式)。
在同步复制模式下,当用户提交事务时,需要等待这笔事务的WAL日志复制到同步流复制备节点,才会返回提交成功的ACK给客户端。
同步模式下,可以确保数据的0丢失。(只要客户端收到了事务提交成功的ACK,这笔事务的WAL就有两份。)
级联流复制
2012年推出的9.2版本,支持级联流复制。意思是备库还可以再连备库。
级联复制特别适合跨机房的使用,例如主库在A机房,备库在B机房,但是B机房需要建立多个备库时,那么B机房只需要建立一个直连主库的备库,其他的备库可以通过B机房的这个备库级联产生。从而减少网络开销。
流式虚拟备库
2012年推出的9.2版本,除了支持级联复制,还支持虚拟备库,什么是虚拟备库呢?就是只有WAL,没有数据文件的备库。
通过虚拟备库,可以流式的接收WAL,进行实时的流式WAL归档。提高备份或归档的实时性。
逻辑复制的基础
2014年推出的9.4版本,在WAL中增加了逻辑复制需要的基础信息,通过插件,可以实现逻辑复制。
逻辑复制可以做到对主库的部分复制,例如表级复制,而不是整个集群的块级一致复制。
逻辑复制的备库不仅仅是只读的,也可以执行写操作。
增加几种同步级别
2016年推出的9.6版本,PG的流式复制,通过复制WAL达到同步的目的,因此同步级别也和WAL有关。通过synchronous_commit参数,可以配置事务的同步级别。
1、on, 表示本地WAL fsync,同步standby WAL fsync。即两份持久化的WAL。
2、remote_apply, 表示本地WAL fsync,同步standby WAL 已恢复。这个带来的RT最高。
3、remote_write, 表示本地WAL fsync,同步standby WAL 异步write完成。一份持久化,备库的WAL可能还在OS CACHE中。
4、local, 表示本地WAL fsync。
5、off, 表示本地WAL写到wal buffer中即返回客户端事务提交成功的ACK,为异步提交(数据库CRASH可能导致事务丢失,但不会导致数据库不一致)。
RT影响,从低到高如下:
off, local, remote_write, on, remote_apply。
流式备份压缩
2017年推出的10版本,pg_basebackup, pg_receivewal支持流式压缩备份WAL。
quorum based 同步流复制
2017年推出的10版本,支持quorum based的同步流复制,例如有3个备节点,你可以告诉主库,这个事务需要3份WAL副本,那么主库需要等待至少2个备节点已将WAL同步过去的反馈,才会将事务提交成功的ACK返回给客户端。
quorum based同步流复制,结合raft协议,可以实现零数据丢失的高可用、高可靠架构。
内置逻辑订阅、复制
2017年推出的10版本,内置了逻辑订阅的功能。
多master
2017年推出的10版本,通过逻辑订阅的功能,可以实现多Master架构。
一、流式 物理主从 最佳实践
以一主两从,quorum based 同步(一副本)为例,介绍PG 10的主从最佳实践。
环境
三台机器(假设主机有64G内存),同一局域网,相互网络互通(至少数据库监听端口应该互通)。
Linux CentOS 7.x x64
HOSTA : 监听端口1921
HOSTB : 监听端口1921
HOSTC : 监听端口1921
(本文使用一台物理机来模拟,IP为127.0.0.1,端口分别为2921,2922,2923,读者请根据实际环境修改)
软件安装略
请参考
《PostgreSQL on Linux 最佳部署手册》
初始化主库
mkdir /disk1/digoal/pgdata2921 chown digoal /disk1/digoal/pgdata2921 initdb -D /disk1/digoal/pgdata2921 -E UTF8 --locale=C -U postgres
配置postgresql.conf
cd /disk1/digoal/pgdata2921 vi postgresql.conf listen_addresses = '0.0.0.0' port = 2921 max_connections = 1000 unix_socket_directories = '.' tcp_keepalives_idle = 60 tcp_keepalives_interval = 10 tcp_keepalives_count = 10 shared_buffers = 8GB maintenance_work_mem = 1GB dynamic_shared_memory_type = posix vacuum_cost_delay = 0 bgwriter_delay = 10ms bgwriter_lru_maxpages = 500 bgwriter_lru_multiplier = 5.0 bgwriter_flush_after = 0 effective_io_concurrency = 0 max_worker_processes = 16 backend_flush_after = 0 wal_level = replica fsync = on synchronous_commit = remote_write full_page_writes = on wal_buffers = 128MB wal_writer_delay = 10ms wal_writer_flush_after = 0 checkpoint_timeout = 30min max_wal_size = 16GB min_wal_size = 8GB checkpoint_completion_target = 0.5 checkpoint_flush_after = 0 max_wal_senders = 10 wal_keep_segments = 1024 synchronous_standby_names = 'ANY 1 (*)' hot_standby = on max_standby_archive_delay = 300s max_standby_streaming_delay = 300s wal_receiver_status_interval = 1s hot_standby_feedback = off log_destination = 'csvlog' logging_collector = on log_checkpoints = on log_connections = on log_disconnections = on log_error_verbosity = verbose log_timezone = 'PRC' log_autovacuum_min_duration = 0 autovacuum_vacuum_scale_factor = 0.1 autovacuum_analyze_scale_factor = 0.1 autovacuum_freeze_max_age = 1000000000 autovacuum_multixact_freeze_max_age = 1200000000 autovacuum_vacuum_cost_delay = 0 autovacuum_vacuum_cost_limit = 0 vacuum_freeze_table_age = 800000000 vacuum_multixact_freeze_table_age = 800000000 datestyle = 'iso, mdy' timezone = 'PRC' lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' default_text_search_config = 'pg_catalog.english'
配置pg_hba.conf
cd /disk1/digoal/pgdata2921 vi pg_hba.conf # "local" is for Unix domain socket connections only local all all trust # IPv4 local connections: host all all 127.0.0.1/32 trust # IPv6 local connections: host all all ::1/128 trust # Allow replication connections from localhost, by a user with the # replication privilege. local replication all trust host replication all 127.0.0.1/32 trust host replication all ::1/128 trust host replication all 0.0.0.0/0 md5
配置recovery.done
cd /disk1/digoal/pgdata2921 vi recovery.done recovery_target_timeline = 'latest' standby_mode = on primary_conninfo = 'host=127.0.0.1 port=2921 user=rep password=pwd' # recovery_min_apply_delay = 0 #延迟多少分钟应用,用户可以配置延迟的备库,例如给一点误操作的缓冲时间。在备库不会这么早被应用。
启动主库
pg_ctl start -D /disk1/digoal/pgdata2921
创建流复制角色
psql -h 127.0.0.1 -p 2921 psql (10beta1) Type "help" for help. postgres=# set synchronous_commit =off; SET postgres=# create role rep login replication encrypted password 'pwd'; CREATE ROLE
生成备库1
mkdir /disk1/digoal/pgdata2922 chown digoal /disk1/digoal/pgdata2922 chmod 700 /disk1/digoal/pgdata2922 export PGPASSWORD="pwd" pg_basebackup -D /disk1/digoal/pgdata2922 -F p -X stream -h 127.0.0.1 -p 2921 -U rep
配置备库1 postgresql.conf
cd /disk1/digoal/pgdata2922 vi postgresql.conf port = 2922
配置备库1 recovery.conf
cd /disk1/digoal/pgdata2922 mv recovery.done recovery.conf
启动备库1
pg_ctl start -D /disk1/digoal/pgdata2922
生成备库2
mkdir /disk1/digoal/pgdata2923 chown digoal /disk1/digoal/pgdata2923 chmod 700 /disk1/digoal/pgdata2923 export PGPASSWORD="pwd" pg_basebackup -D /disk1/digoal/pgdata2923 -F p -X stream -h 127.0.0.1 -p 2921 -U rep
配置备库2 postgresql.conf
cd /disk1/digoal/pgdata2923 vi postgresql.conf port = 2923
配置备库2 recovery.conf
cd /disk1/digoal/pgdata2923 mv recovery.done recovery.conf
启动备库2
pg_ctl start -D /disk1/digoal/pgdata2923
流复制节点的状态监控
主库查询
psql -h 127.0.0.1 -p 2921 psql (10beta1) Type "help" for help. postgres=# \x Expanded display is on. postgres=# select pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)) as sent_delay, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), write_lsn)) as write_delay, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn)) as flush_delay, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) as replay_delay, * from pg_stat_replication; -[ RECORD 1 ]----+------------------------------ sent_delay | 0 bytes write_delay | 0 bytes flush_delay | 0 bytes replay_delay | 0 bytes pid | 11962 usesysid | 16384 usename | rep application_name | walreceiver client_addr | 127.0.0.1 client_hostname | client_port | 63083 backend_start | 2017-07-11 17:15:31.231492+08 backend_xmin | state | streaming sent_lsn | 1/C0000060 write_lsn | 1/C0000060 flush_lsn | 1/C0000060 replay_lsn | 1/C0000060 write_lag | flush_lag | replay_lag | sync_priority | 1 sync_state | quorum -[ RECORD 2 ]----+------------------------------ sent_delay | 0 bytes write_delay | 0 bytes flush_delay | 0 bytes replay_delay | 0 bytes pid | 11350 usesysid | 16384 usename | rep application_name | walreceiver client_addr | 127.0.0.1 client_hostname | client_port | 63077 backend_start | 2017-07-11 17:15:13.818043+08 backend_xmin | state | streaming sent_lsn | 1/C0000060 write_lsn | 1/C0000060 flush_lsn | 1/C0000060 replay_lsn | 1/C0000060 write_lag | flush_lag | replay_lag | sync_priority | 1 sync_state | quorum
备库查询
psql -h 127.0.0.1 -p 2922 -- 查看当前WAL应用是否暂停了 postgres=# select pg_is_wal_replay_paused(); pg_is_wal_replay_paused ------------------------- f (1 row) -- 查看WAL接收到的位点 postgres=# select pg_last_wal_receive_lsn(); pg_last_wal_receive_lsn ------------------------- 1/C0000060 (1 row) -- 查看WAL的应用位点 postgres=# select pg_last_wal_replay_lsn(); pg_last_wal_replay_lsn ------------------------ 1/C0000060 (1 row) -- 查看wal receiver的统计信息 postgres=# \x Expanded display is on. postgres=# select * from pg_stat_get_wal_receiver(); -[ RECORD 1 ]---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------- pid | 11349 status | streaming receive_start_lsn | 1/C0000000 receive_start_tli | 1 received_lsn | 1/C0000060 received_tli | 1 last_msg_send_time | 2017-07-11 17:23:14.372327+08 last_msg_receipt_time | 2017-07-11 17:23:14.372361+08 latest_end_lsn | 1/C0000060 latest_end_time | 2017-07-11 17:15:13.819553+08 slot_name | conninfo | user=rep password=******** dbname=replication host=127.0.0.1 port=2921 fallback_application_name=walreceiver sslmode=disable sslcompression=1 target_session_attrs=any postgres=# select pg_wal_replay_pause(); -[ RECORD 1 ]-------+- pg_wal_replay_pause | -- 暂停WAL的应用,例如要做一些排错时 postgres=# select pg_is_wal_replay_paused(); -[ RECORD 1 ]-----------+-- pg_is_wal_replay_paused | t postgres=# select pg_wal_replay_resume(); -[ RECORD 1 ]--------+- pg_wal_replay_resume | -- 继续应用WAL postgres=# select pg_is_wal_replay_paused(); -[ RECORD 1 ]-----------+-- pg_is_wal_replay_paused | f
注意事项
1、如果要防止主库删除备库还没有接收的WAL文件,有两种方法。
使用slot,或者配置足够大的wal keep。
但是这两种方法都有一定的风险或问题,例如当备库挂了,或者备库不再使用了,而用户忘记删除对应的SLOT时。可能导致主库WAL无限膨胀。
而wal keep则会导致主库的WAL预留足够的个数,占用一定空间。
相关参数
主 postgresql.conf # max_replication_slots = 10 # wal_keep_segments = 1024 备 recovery.conf # primary_slot_name = ''
2、如果不想通过以上方法预防备库需要的WAL已被删除,那么可以配置主库的归档,同时备库需要能获取到已归档的WAL文件。
相关参数
主 postgresql.conf #archive_mode = off # enables archiving; off, on, or always # (change requires restart) #archive_command = '' # command to use to archive a logfile segment # placeholders: %p = path of file to archive # %f = file name only # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f' 备 recovery.conf # restore_command = '' # e.g. 'cp /mnt/server/archivedir/%f %p'
3、保护好recovery.conf文件中的密码,因为配置的是明文。
4、主备之间的带宽请足够大,否则可能导致主备延迟。
压测
连接主库进行TPC-B的压测
pgbench -i -s 100 -h 127.0.0.1 -p 2921 -U postgres pgbench -n -r -P 1 -h 127.0.0.1 -p 2921 -U postgres -c 32 -j 32 -T 120 transaction type: <builtin: TPC-B (sort of)>scaling factor: 100query mode: simplenumber of clients: 32number of threads: 32duration: 120 snumber of transactions actually processed: 1326066latency average = 2.896 mslatency stddev = 2.030 mstps = 11050.199659 (including connections establishing)tps = 11051.140876 (excluding connections establishing)script statistics: - statement latencies in milliseconds: 0.002 \set aid random(1, 100000 * :scale) 0.001 \set bid random(1, 1 * :scale) 0.001 \set tid random(1, 10 * :scale) 0.000 \set delta random(-5000, 5000) 0.043 BEGIN; 0.154 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; 0.112 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; 0.159 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; 0.423 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; 0.092 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); 1.910 END;
观察主备的延迟
psql -h 127.0.0.1 -p 2921 psql (10beta1) Type "help" for help. postgres=# \x Expanded display is on. postgres=# select pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)) as sent_delay, postgres-# pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), write_lsn)) as write_delay, postgres-# pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn)) as flush_delay, postgres-# pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) as replay_delay, postgres-# * postgres-# from pg_stat_replication; -[ RECORD 1 ]----+------------------------------ sent_delay | 4024 bytes write_delay | 4024 bytes flush_delay | 9080 bytes replay_delay | 13 kB pid | 11962 usesysid | 16384 usename | rep application_name | walreceiver client_addr | 127.0.0.1 client_hostname | client_port | 63083 backend_start | 2017-07-11 17:15:31.231492+08 backend_xmin | state | streaming sent_lsn | 2/1C61E398 write_lsn | 2/1C61E398 flush_lsn | 2/1C61CFD8 replay_lsn | 2/1C61BEF8 write_lag | 00:00:00.000129 flush_lag | 00:00:00.001106 replay_lag | 00:00:00.001626 sync_priority | 1 sync_state | quorum -[ RECORD 2 ]----+------------------------------ sent_delay | 0 bytes write_delay | 4024 bytes flush_delay | 9080 bytes replay_delay | 12 kB pid | 11350 usesysid | 16384 usename | rep application_name | walreceiver client_addr | 127.0.0.1 client_hostname | client_port | 63077 backend_start | 2017-07-11 17:15:13.818043+08 backend_xmin | state | streaming sent_lsn | 2/1C61F350 write_lsn | 2/1C61E398 flush_lsn | 2/1C61CFD8 replay_lsn | 2/1C61C388 write_lag | 00:00:00.000542 flush_lag | 00:00:00.001582 replay_lag | 00:00:00.001952 sync_priority | 1 sync_state | quorum postgres=# \watch 1
- PostgreSQL 10 流式物理、逻辑主从 最佳实践
- 物理系统最佳实践
- 【入门教程】SequoiaDB+Postgresql数据实时检索最佳实践
- postgresql实践
- 隐式转换最佳实践
- Perl 最佳实践(节选) --- 10
- postgresql主从同步
- postgresql部署[主从搭建]
- 最佳实践
- 最佳实践
- 最佳实践
- 全文检索 不包含 优化- 阿里云RDS PostgreSQL最佳实践
- oracle物理、逻辑结构
- 物理CPU 逻辑CPU
- 逻辑与物理分页
- Postgresql逻辑结构整理
- PostgreSQL 逻辑订阅
- PostgreSQL物理备份与恢复
- python.snippet
- 字符串字符数组的赋值
- ios中提示框的使用
- 【数据处理】R语言--data.table介绍以及例子
- 选择排序
- PostgreSQL 10 流式物理、逻辑主从 最佳实践
- GC的算法和种类
- 负载均衡的几种算法Java实现代码
- JDBC 关闭顺序问题
- 我所理解的设计模式(C++实现)——命令模式(Command Pattern)
- 【如何快速的开发一个完整的iOS直播app】(原理篇)
- Dubbo项目启动后,没有提供者。项目启动日志正常,DUBBO服务启动没有注册到zookeeper。
- JAVA中System.getProperty()方法使用
- 深度优先搜索求最大流