A Study of Ceph Recovery Parameters


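Goal

Ceph recovery can consume a large amount of bandwidth. This article looks at how to control it, mainly how to reduce the speed of recovery and the I/O capacity it uses.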

Checking the current maximum read/write capacity of a single OSD

[root@hh-ceph-128214 ~]# ceph tell osd.12 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "bytes_per_sec": 122277678
}
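The bench command is a write test: in the output above it wrote 1 GiB (1073741824 bytes) in 4 MiB (4194304-byte) blocks, and bytes_per_sec is the resulting write throughput. The raw figure is easier to compare with the MB/s numbers later in this article after conversion. A minimal sketch; the helper name bytes_to_mib is hypothetical and not part of Ceph:

```shell
# Convert a bytes-per-second figure (for example the bytes_per_sec field
# reported by `ceph tell osd.N bench`) to MiB/s with one decimal place.
# bytes_to_mib is an illustrative helper, not a Ceph command.
bytes_to_mib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# Value taken from the osd.12 bench output above.
bytes_to_mib 122277678   # prints 116.6
```

So osd.12 sustained roughly 117 MiB/s of writes in this run.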

Common recovery checks

How to adjust parameters

Adjusting a parameter on a single OSD

[root@hh-ceph-128214 ~]# ceph daemon osd.12 config set debug_osd 10
{
    "success": ""
}
[root@hh-ceph-128214 ~]# ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok config show | grep debug_osd
    "debug_osd": "10/10",

Adjusting a parameter on all OSDs

[root@hh-ceph-128040 dizzy]# ceph tell osd.\* injectargs '--osd_max_backfills=1'
osd.0: osd_max_backfills = '1'
osd.1: osd_max_backfills = '1'
osd.2: osd_max_backfills = '1'
osd.3: osd_max_backfills = '1'
osd.4: osd_max_backfills = '1'
osd.5: osd_max_backfills = '1'
osd.6: osd_max_backfills = '1'
osd.7: osd_max_backfills = '1'
osd.8: osd_max_backfills = '1'
osd.9: osd_max_backfills = '1'
osd.10: osd_max_backfills = '1'
osd.11: osd_max_backfills = '1'
osd.12: osd_max_backfills = '1'
osd.13: osd_max_backfills = '1'
osd.14: osd_max_backfills = '1'
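Between one OSD and all OSDs there is a middle case: applying a setting only to the OSDs of one host or one failure domain. The following is a rough sketch of that, not something from the original test. The function name inject_on_osds and the CEPH variable are hypothetical; CEPH exists only so the command can be swapped for echo as a dry run. Values set with injectargs apply to the running daemons only and are not kept across an OSD restart.

```shell
# Command used to talk to the cluster; override with CEPH=echo for a dry run.
CEPH="${CEPH:-ceph}"

# Apply one injectargs string to a chosen list of OSD ids.
# $1 = injectargs string, remaining arguments = OSD ids.
# inject_on_osds is an illustrative helper, not a Ceph command.
inject_on_osds() {
    args="$1"
    shift
    for id in "$@"; do
        $CEPH tell "osd.$id" injectargs "$args"
    done
}
```

For example, `inject_on_osds '--osd_max_backfills=1' 12 13 14` would touch only osd.12, osd.13 and osd.14.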

How to query the current parameter values

[root@hh-ceph-128040 dizzy]# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show  | grep backfill
    "mon_osd_backfillfull_ratio": "0.900000",
    "osd_backfill_retry_interval": "30.000000",
    "osd_backfill_scan_max": "512",
    "osd_backfill_scan_min": "64",
    "osd_debug_reject_backfill_probability": "0.000000",
    "osd_debug_skip_full_check_in_backfill_reservation": "false",
    "osd_kill_backfill_at": "0",
    "osd_max_backfills": "1",
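When a script needs only the value of one option, grep is awkward because it also matches similarly named options. A small sketch that pulls out the value for an exact option name from config show output; the helper name cfg_value is hypothetical, and it assumes the one-option-per-line layout shown above:

```shell
# Print the value of exactly one option from `config show` output on stdin.
# $1 = option name. Assumes lines of the form:   "name": "value",
# cfg_value is an illustrative helper, not a Ceph command.
cfg_value() {
    awk -F'"' -v key="$1" '$2 == key { print $4 }'
}
```

Typical use would be `ceph daemon osd.0 config show | cfg_value osd_max_backfills`.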

Tuning I/O capacity (for recovery)

Recovery information

The following command shows which PGs are recovering

[root@hh-ceph-128214 ~]# ceph pg dump|grep recovering
dumped all
3.20        332                  0      650         0       0 1391972352  767      767                        active+recovering+degraded 2017-12-01 17:32:16.343398  392'767  511:1495  [30,13,3]         30  [30,13,3]             30     392'31 2017-11-29 13:09:00.918493          330'12 2017-11-24 16:50:10.749723
3.365       357                  0      721         0       0 1492619264  803      803                        active+recovering+degraded 2017-12-01 17:32:16.603756  392'803  511:1425  [4,15,30]          4  [4,15,30]              4     392'24 2017-11-29 15:07:21.700341          392'24 2017-11-29 15:07:21.700341
3.428       335                485      485         0       0 1400328438 1666     1666                        active+recovering+degraded 2017-12-01 16:54:11.036479 392'1666    510:49  [16,6,31]         16  [16,6,31]             16    392'407 2017-11-30 06:03:19.380038             0'0 2017-11-23 16:34:44.666269
3.524       383                144      144         0       0 1592238097  919      919                        active+recovering+degraded 2017-12-01 16:52:35.565517  392'919   510:817 [15,10,32]         15 [15,10,32]             15    392'397 2017-11-29 23:30:37.556655          391'34 2017-11-26 01:52:22.370744
3.640       315                  0      639         0       0 1321177088  685      685                        active+recovering+degraded 2017-12-01 17:32:18.653057  392'685  511:1340  [2,19,34]          2  [2,19,34]              2    392'311 2017-11-30 08:50:52.272975          392'34 2017-11-28 22:58:28.172339

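The important fields are extracted below. The printed header labels are shifted by one relative to the data, because the STATE_STAMP timestamp contains a space and counts as two fields in data rows. The data columns are really: PG, objects, degraded, state, up set, up primary, acting set, acting primary.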

[root@hh-ceph-128214 ~]# ceph pg dump|grep recovering|awk '{print $1,$2,$4,$10,$15,$16,$17,$18}'
PG_STAT OBJECTS DEGRADED STATE UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB
3.7d0 339 429 active+recovering+degraded [3,29,19] 3 [3,29,19] 3
3.713 320 456 active+recovering+degraded [0,30,15] 0 [0,30,15] 0
3.198 313 419 active+recovering+degraded [6,13,33] 6 [6,13,33] 6
3.428 335 485 active+recovering+degraded [16,6,31] 16 [16,6,31] 16
3.524 383 144 active+recovering+degraded [15,10,32] 15 [15,10,32] 15
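The later sections count how many PGs are recovering at the same time, so it helps to see how those PGs are spread across primaries. A possible sketch that tallies recovering PGs per acting primary; the helper name count_by_primary is hypothetical, and it assumes the eight-field layout produced by the awk command above, with the acting primary in the last field:

```shell
# Count recovering PGs per acting primary OSD.
# Input on stdin: the 8-field lines printed by the awk command above
# (PG, objects, degraded, state, up set, up primary, acting set, acting primary).
# The header line is skipped because its 4th field does not contain "recovering".
# count_by_primary is an illustrative helper, not a Ceph command.
count_by_primary() {
    awk '$4 ~ /recovering/ { n[$8]++ }
         END { for (osd in n) print "osd." osd, n[osd] }' |
        sort -t. -k2,2n
}
```

It can be appended to the pipeline above, for example `ceph pg dump | grep recovering | awk '{print $1,$2,$4,$10,$15,$16,$17,$18}' | count_by_primary`.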

watch script

watch -n 1 -d "ceph pg dump|grep recovering|awk '{print \$1,\$2,\$4,\$10,\$15,\$16,\$17,\$18}'"

Current disk read/write speed

Install the tool

yum install -y dstat

Monitor the disk

[root@hh-ceph-128214 ~]# dstat -td -D /dev/sdb
     time     | read  writ
01-12 17:37:26| 758B 2382k
01-12 17:37:27|   0  8228k
01-12 17:37:28|   0    16M
01-12 17:37:29|   0    24M

Controlling recovery

Default parameters

osd_max_backfills = 1
osd_recovery_max_active = 3
osd_recovery_sleep = 0

"osd_disk_thread_ioprio_priority": "-1",
"osd_disk_threads": "1",
"osd_backfill_scan_max": "512",
"osd_backfill_scan_min": "64",
"osd_recovery_op_priority": "3",
"osd_recovery_max_active": "3",

Behavior with the default settings

max recovery     journal SSD      single OSD (SATA)    recovering PGs
80 ~ 150 MB/s    50 ~ 180 MB/s    20 ~ 60 MB/s         5 ~ 6

Adjusting the parameters

osd_max_backfills

osd_max_backfills = 2

max recovery     journal SSD      single OSD (SATA)    recovering PGs
120 ~ 350 MB/s   180 ~ 340 MB/s   12 ~ 70 MB/s         20 ~ 22

osd_recovery_max_active

osd_recovery_max_active = 10

max recovery     journal SSD      single OSD (SATA)    recovering PGs
80 ~ 250 MB/s    50 ~ 260 MB/s    20 ~ 60 MB/s         5 ~ 10

Adjusting this parameter alone has little effect; it has to be adjusted together with osd_max_backfills.

osd_recovery_sleep

osd_recovery_sleep = 0.5

This parameter is mainly used to slow recovery down.

max recovery     journal SSD      single OSD (SATA)    recovering PGs
30 ~ 50 MB/s     20 ~ 55 MB/s     12 ~ 40 MB/s         5 ~ 10
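The measurements above suggest a simple operating pattern: throttle recovery with osd_recovery_sleep while client traffic matters, then return to the defaults. A minimal sketch of that pattern using the injectargs mechanism shown earlier. The function names and the CEPH variable are hypothetical, and the values are the ones tested in this article (defaults 1 / 3 / 0, throttled sleep 0.5); they are not a general recommendation. Settings applied this way affect running OSDs only and do not survive a restart.

```shell
# Command used to talk to the cluster; override with CEPH=echo for a dry run.
CEPH="${CEPH:-ceph}"

# Apply one set of recovery settings to every OSD at runtime.
# $1 = osd_max_backfills, $2 = osd_recovery_max_active, $3 = osd_recovery_sleep
# apply_recovery_profile is an illustrative helper, not a Ceph command.
apply_recovery_profile() {
    $CEPH tell 'osd.*' injectargs \
        "--osd_max_backfills=$1 --osd_recovery_max_active=$2 --osd_recovery_sleep=$3"
}

# Slow recovery down: same limits as the defaults, plus the 0.5 s sleep tested above.
recovery_throttle() { apply_recovery_profile 1 3 0.5; }

# Return to the default values listed in this article.
recovery_default() { apply_recovery_profile 1 3 0; }
```

Running `CEPH=echo` before calling `recovery_throttle` prints the command that would be sent instead of executing it, which is a cheap way to check the arguments first.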