Greenplum 列存表(AO表)的膨胀、垃圾检查与空间收缩

来源：互联网发布：淘宝美工一般要做什么编辑：程序博客网时间：2024/06/06 00:59

转自digoal：https://github.com/digoal

背景

Greenplum支持行存储（堆存储）与AO存储，堆存储的垃圾回收和膨胀检测方法请参考：

《如何检测、清理Greenplum膨胀、垃圾 - 阿里云HybridDB for PG最佳实践》

对于AO存储，虽然是appendonly，但实际上GP是支持DELETE和UPDATE的，被删除或更新的行，通过BITMAP来标记。

AO存储是块级组织，当一个块内的数据大部分都被删除或更新掉时，扫描它浪费的成本实际上是很高的。

如何检查AO表的膨胀，收缩AO表呢？

如何查看表的存储结构

pg_class.relstorage表示这个对象是什么存储：

postgres=# select distinct relstorage from pg_class ;   relstorage   ------------   a  -- 行存储AO表   h  -- heap堆表、索引   x  -- 外部表(external table)   v  -- 视图   c  -- 列存储AO表  (5 rows)

查询当前数据库有哪些AO表：

postgres=# select t2.nspname, t1.relname from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in ('c', 'a');   nspname  |      relname        ----------+-------------------   postgres | tbl_tag   postgres | tbl_pos_1_prt_p1   postgres | tbl_pos_1_prt_p2   postgres | tbl_pos_1_prt_p3   postgres | tbl_pos_1_prt_p4   postgres | tbl_pos_1_prt_p5   postgres | tbl_pos_1_prt_p6   postgres | tbl_pos_1_prt_p7   postgres | tbl_pos_1_prt_p8   postgres | tbl_pos_1_prt_p9   postgres | tbl_pos_1_prt_p10   postgres | tbl_pos   postgres | xx_czrk_qm_col   postgres | ao1  (14 rows)

查询当前数据库有哪些堆表：

select t2.nspname, t1.relname from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in ('h') and relkind='r';

查看AO表的膨胀率(有多少垃圾)

使用这个函数可以查看AO表的膨胀率

postgres=# \df+ gp_toolkit.__gp_aovisimap_compaction_info  List of functions  -[ RECORD 1 ]-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------  Schema              | gp_toolkit  Name                | __gp_aovisimap_compaction_info  Result data type    | SETOF record  Argument data types | ao_oid oid, OUT content integer, OUT datafile integer, OUT compaction_possible boolean, OUT hidden_tupcount bigint, OUT total_tupcount bigint, OUT percent_hidden numeric  Type                | normal  Data access         | no sql  Volatility          | volatile  Owner               | dege.zzz  Language            | plpgsql  Source code         |                       | DECLARE                      |     hinfo_row RECORD;                      |     threshold float;                      | BEGIN                      |     EXECUTE 'show gp_appendonly_compaction_threshold' INTO threshold;                      |     FOR hinfo_row IN SELECT gp_segment_id,                      |     gp_toolkit.__gp_aovisimap_hidden_typed(ao_oid)::gp_toolkit.__gp_aovisimap_hidden_t                      |     FROM gp_dist_random('gp_id') LOOP                      |         content := hinfo_row.gp_segment_id;                      |         datafile := (hinfo_row.__gp_aovisimap_hidden_typed).seg;                      |         hidden_tupcount := (hinfo_row.__gp_aovisimap_hidden_typed).hidden;                      |         total_tupcount := (hinfo_row.__gp_aovisimap_hidden_typed).total;                      |         compaction_possible := false;                      |         IF total_tupcount > 0 THEN                      |             percent_hidden := (100 * hidden_tupcount / total_tupcount::numeric)::numeric(5,2);                      |         ELSE                      |             percent_hidden := 0::numeric(5,2);                      |         END IF;                      |         IF percent_hidden > threshold THEN                      |             compaction_possible := true;                      |         END IF;                      |         RETURN NEXT;                      |     END LOOP;                      |     RAISE NOTICE 'gp_appendonly_compaction_threshold = %', threshold;                      |     RETURN;                      | END;                      |   Description         |

例子

postgres=# select * from gp_toolkit.__gp_aovisimap_compaction_info('postgres.ao1'::regclass);  NOTICE:  gp_appendonly_compaction_threshold = 10       content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden   ---------+----------+---------------------+-----------------+----------------+----------------         2 |        1 | f                   |              21 |         208283 |           0.01         9 |        1 | f                   |              25 |         208303 |           0.01        16 |        1 | f                   |              16 |         208352 |           0.01         4 |        1 | f                   |              23 |         208356 |           0.01         6 |        1 | f                   |              23 |         208386 |           0.01         3 |        1 | f                   |              14 |         208333 |           0.01         1 |        1 | f                   |              14 |         208329 |           0.01        14 |        1 | f                   |              12 |         208350 |           0.01        15 |        1 | f                   |              24 |         208346 |           0.01         7 |        1 | f                   |              22 |         208329 |           0.01         8 |        1 | f                   |              18 |         208334 |           0.01         0 |        1 | f                   |              21 |         208314 |           0.01        18 |        1 | f                   |              16 |         208417 |           0.01        11 |        1 | f                   |              24 |         208337 |           0.01        17 |        1 | f                   |              31 |         208380 |           0.01        12 |        1 | f                   |              12 |         208367 |           0.01        13 |        1 | f                   |              22 |         208365 |           0.01         5 |        1 | f                   |              22 |         208367 |           0.01        10 |        1 | f                   |              18 |         208347 |           0.01        20 |        1 | f                   |              17 |         208384 |           0.01        27 |        1 | f                   |              22 |         208348 |           0.01        19 |        1 | f                   |              31 |         208425 |           0.01        23 |        1 | f                   |              28 |         208344 |           0.01        26 |        1 | f                   |              14 |         208339 |           0.01        25 |        1 | f                   |              21 |         208386 |           0.01        24 |        1 | f                   |              14 |         208332 |           0.01        21 |        1 | f                   |              28 |         208360 |           0.01        33 |        1 | f                   |              18 |         208354 |           0.01        31 |        1 | f                   |              23 |         208335 |           0.01        22 |        1 | f                   |              17 |         208309 |           0.01        28 |        1 | f                   |              21 |         208314 |           0.01        29 |        1 | f                   |              23 |         208329 |           0.01        30 |        1 | f                   |              22 |         208342 |           0.01        35 |        1 | f                   |              18 |         208305 |           0.01        34 |        1 | f                   |              26 |         208363 |           0.01        32 |        1 | f                   |              25 |         208396 |           0.01        36 |        1 | f                   |              23 |         208323 |           0.01        38 |        1 | f                   |              22 |         208367 |           0.01        37 |        1 | f                   |              12 |         208334 |           0.01        39 |        1 | f                   |              12 |         208389 |           0.01        41 |        1 | f                   |              16 |         208373 |           0.01        42 |        1 | f                   |              30 |         208376 |           0.01        40 |        1 | f                   |              31 |         208397 |           0.01        43 |        1 | f                   |              16 |         208378 |           0.01        46 |        1 | f                   |              29 |         208371 |           0.01        47 |        1 | f                   |              16 |         208382 |           0.01        45 |        1 | f                   |              17 |         208368 |           0.01        44 |        1 | f                   |              29 |         208381 |           0.01  (48 rows)

解读：

1、content：对应gp_configuration.content，表示greenplum每个节点的唯一编号。

2、datafile：这条记录对应的这个表的其中一个数据文件的编号，每个数据文件假设1GB。

3、compaction_possible：这个数据文件是否可以被收缩。（通过gp_appendonly_compaction_threshold参数和percent_hidden值判断）。

4、hidden_tupcount：有多少条记录已更新或删除（不可见）。

5、total_tupcount：总共有多少条记录（包括已更新或删除的记录）。

6、percent_hidden：不可见记录的占比。如果这个占比大于gp_appendonly_compaction_threshold参数，那么执行vacuum时，会收缩这个数据文件。

检查系统中膨胀率超过N的AO表

膨胀率超过千分之2的AO表：

select * from (    select t2.nspname, t1.relname, (gp_toolkit.__gp_aovisimap_compaction_info(t1.oid)).*     from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in ('c', 'a')   ) t   where t.percent_hidden > 0.2;

返回结果

 nspname  |      relname      | content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden   ----------+-------------------+---------+----------+---------------------+-----------------+----------------+----------------   postgres | tbl_pos_1_prt_p1  |       1 |        1 | t                   |        20833382 |       20833382 |         100.00   postgres | tbl_pos_1_prt_p1  |       7 |        1 | t                   |        20833495 |       20833628 |         100.00   postgres | tbl_pos_1_prt_p1  |       5 |        1 | t                   |        20833628 |       20833495 |         100.00   postgres | tbl_pos_1_prt_p1  |       3 |        1 | t                   |        20833469 |       20833469 |         100.00  ....

显然膨胀了100%，验证如下（1条记录有效，占用了15GB空间）：

postgres=# select pg_size_pretty(pg_relation_size('tbl_pos_1_prt_p1'));   pg_size_pretty   ----------------   15 GB  (1 row)    postgres=# select count(*) from tbl_pos_1_prt_p1;   count   -------       1  (1 row)

vacuum可以直接收缩（因为膨胀率大于gp_appendonly_compaction_threshold参数）：

postgres=# vacuum tbl_pos_1_prt_p1;  VACUUM  postgres=# select count(*) from tbl_pos_1_prt_p1;   count   -------       1  (1 row)    postgres=# select pg_size_pretty(pg_relation_size('tbl_pos_1_prt_p1'));   pg_size_pretty   ----------------   40 bytes  (1 row)

VACUUM后，只占用40字节。

收缩已膨胀的AO表

方法有三：

1、执行VACUUM。（当膨胀率大于gp_appendonly_compaction_threshold参数时），为共享锁。

2、执行VACUUM FULL。（不管gp_appendonly_compaction_threshold参数的设置，都会回收垃圾空间。），为DDL锁。

3、执行重分布。（不管gp_appendonly_compaction_threshold参数，都会回收垃圾空间。），为DDL锁。

alter table <table_name> set with (reorganize=true) distributed randomly;        或        alter table <table_name> set with (reorganize=true) distributed by (<column_names1>,<column_names2>....)

set distribute可以回收索引的膨胀空间。set distribute 加载的锁与DDL锁类似，是排它锁。建议在没有业务的时候执行，不要堵塞业务。

同时set distribute只要分布条件不变，就是在节点内完成的，不会涉及数据的重分布。

建议的操作流程：

1 记录下表的分布列        2 执行set distribute (REORGANIZE=true)        如果是随机分布，则设置为随机分布

阅读全文

0 0