PostgreSQL partition table's arithmetic tuning example

来源:互联网 发布:数据分析师待遇怎么样 编辑:程序博客网 时间:2024/06/05 16:48

在PostgreSQL中,针对数据量较大的表,建议采用分区表的模式。
分区算法可以通过触发器来实现,一个合适的算法可以降低对数据库CPU的开销。尤其在对分区表并发量大的操作时效果明显。
分区环境示例:
parent table :
                      Table "digoal.tbl_user_info"
  Column   |         Type          | Modifiers | Storage  | Description
-----------+-----------------------+-----------+----------+-------------
 id        | bigint                | not null  | plain    |
 firstname | character varying(32) |           | extended |
 lastname  | character varying(32) |           | extended |
 corp      | character varying(64) |           | extended |
 age       | integer               |           | plain    |
Indexes:
    "tbl_user_info_pkey" PRIMARY KEY, btree (id)
Child tables: tbl_user_info_0,
              tbl_user_info_1,
              tbl_user_info_10,
              tbl_user_info_11,
              tbl_user_info_12,
              tbl_user_info_13,
              tbl_user_info_14,
              tbl_user_info_15,
              tbl_user_info_2,
              tbl_user_info_3,
              tbl_user_info_4,
              tbl_user_info_5,
              tbl_user_info_6,
              tbl_user_info_7,
              tbl_user_info_8,
              tbl_user_info_9
Has OIDs: no

建表函数 :
CREATE OR REPLACE FUNCTION digoal.f_create_table(i_min integer, i_max integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare
begin
for i in i_min..i_max loop
execute 'create table tbl_user_info_'||i||' (like tbl_user_info including constraints including defaults including indexes) inherits (tbl_user_info);';
end loop;
return;
end;
$function$

添加CHECK函数:
CREATE OR REPLACE FUNCTION digoal.f_ck_table(i_min integer, i_max integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare
begin
for i in i_min..i_max loop
execute 'alter table tbl_user_info_'||i||' add constraint ck_user_info_'||i||' check (mod(id,16)='||i||');';
end loop;
return;
end;
$function$

触发器函数使用4种组合,分别测试性能差别。

示例图 :

PostgreSQL partition tables arithmetic tuning example - 德哥@Digoal - The Heart,The World.
 
 
PostgreSQL partition tables arithmetic tuning example - 德哥@Digoal - The Heart,The World.
 
 
PostgreSQL partition tables arithmetic tuning example - 德哥@Digoal - The Heart,The World.
 
 
PostgreSQL partition tables arithmetic tuning example - 德哥@Digoal - The Heart,The World.
 
 
PostgreSQL partition tables arithmetic tuning example - 德哥@Digoal - The Heart,The World.
 

 
一、不分组的范例 :
新建触发器函数和触发器
CREATE OR REPLACE FUNCTION tbl_user_info_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF    ( mod(NEW.id,16) = 0 ) THEN
        INSERT INTO tbl_user_info_0 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 1 ) THEN
        INSERT INTO tbl_user_info_1 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 2 ) THEN
        INSERT INTO tbl_user_info_2 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 3 ) THEN
        INSERT INTO tbl_user_info_3 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 4 ) THEN
        INSERT INTO tbl_user_info_4 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 5 ) THEN
        INSERT INTO tbl_user_info_5 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 6 ) THEN
        INSERT INTO tbl_user_info_6 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 7 ) THEN
        INSERT INTO tbl_user_info_7 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 8 ) THEN
        INSERT INTO tbl_user_info_8 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 9 ) THEN
        INSERT INTO tbl_user_info_9 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 10 ) THEN
        INSERT INTO tbl_user_info_10 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 11 ) THEN
        INSERT INTO tbl_user_info_11 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 12 ) THEN
        INSERT INTO tbl_user_info_12 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 13 ) THEN
        INSERT INTO tbl_user_info_13 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 14 ) THEN
        INSERT INTO tbl_user_info_14 VALUES (NEW.*);
    ELSIF ( mod(NEW.id,16) = 15 ) THEN
        INSERT INTO tbl_user_info_15 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'ID out of range.  Please fix the tbl_user_info_insert_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION tbl_user_info_delete_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF    ( mod(OLD.id,16) = 0 ) THEN
        DELETE FROM tbl_user_info_0 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 1 ) THEN
        DELETE FROM tbl_user_info_1 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 2 ) THEN
        DELETE FROM tbl_user_info_2 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 3 ) THEN
        DELETE FROM tbl_user_info_3 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 4 ) THEN
        DELETE FROM tbl_user_info_4 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 5 ) THEN
        DELETE FROM tbl_user_info_5 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 6 ) THEN
        DELETE FROM tbl_user_info_6 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 7 ) THEN
        DELETE FROM tbl_user_info_7 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 8 ) THEN
        DELETE FROM tbl_user_info_8 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 9 ) THEN
        DELETE FROM tbl_user_info_9 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 10 ) THEN
        DELETE FROM tbl_user_info_10 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 11 ) THEN
        DELETE FROM tbl_user_info_11 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 12 ) THEN
        DELETE FROM tbl_user_info_12 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 13 ) THEN
        DELETE FROM tbl_user_info_13 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 14 ) THEN
        DELETE FROM tbl_user_info_14 where id=OLD.id;
    ELSIF ( mod(OLD.id,16) = 15 ) THEN
        DELETE FROM tbl_user_info_15 where id=OLD.id;
    ELSE
        RAISE EXCEPTION 'ID out of range.  Please fix the tbl_user_info_delete_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;



CREATE TRIGGER insert_tbl_user_info_trigger
    BEFORE INSERT ON tbl_user_info
    FOR EACH ROW EXECUTE PROCEDURE tbl_user_info_insert_trigger();

CREATE TRIGGER delete_tbl_user_info_trigger
    BEFORE DELETE ON tbl_user_info
    FOR EACH ROW EXECUTE PROCEDURE tbl_user_info_delete_trigger();

极端性能测试:
digoal=> insert into tbl_user_info select generate_series(0,16000000,16),'zhou','digoal','sky-mobi',27;
INSERT 0 0
Time: 34749.758 ms
digoal=> truncate table tbl_user_info ;
TRUNCATE TABLE
Time: 495.907 ms
digoal=> insert into tbl_user_info select generate_series(15,16000000,16),'zhou','digoal','sky-mobi',27;
INSERT 0 0
Time: 43517.893 ms
digoal=> select count(*) from tbl_user_info;
  count 
---------
 1000000
(准确抵达目标表)
digoal=> select count(*) from tbl_user_info_15;
  count 
---------
 1000000
单个操作下第一个判断和最后一个判断时间相差 (43517.893-34749.758)/100w = 0.008768135 ms

不使用触发器的情况下,
digoal=> insert into tbl_user_info_single select generate_series(0,16000000,16),'zhou','digoal','sky-mobi',27;
INSERT 0 1000001
Time: 3959.861 ms

单个判断至少消耗 = (34749.758-3959.861)/1000000 = 0.030789897 ms
至多消耗 = 0.039558032 ms

二、分两个组的情况
更新触发器函数:
CREATE OR REPLACE FUNCTION tbl_user_info_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF    ( mod(NEW.id,2) = 0 ) THEN
       IF    ( mod(NEW.id,16) = 0 ) THEN
        INSERT INTO tbl_user_info_0 VALUES (NEW.*);
       ELSIF ( mod(NEW.id,16) = 2 ) THEN
        INSERT INTO tbl_user_info_2 VALUES (NEW.*);
       ELSIF ( mod(NEW.id,16) = 4 ) THEN
        INSERT INTO tbl_user_info_4 VALUES (NEW.*);
       ELSIF ( mod(NEW.id,16) = 6 ) THEN
        INSERT INTO tbl_user_info_6 VALUES (NEW.*);
       ELSIF ( mod(NEW.id,16) = 8 ) THEN
        INSERT INTO tbl_user_info_8 VALUES (NEW.*);
       ELSIF ( mod(NEW.id,16) = 10 ) THEN
        INSERT INTO tbl_user_info_10 VALUES (NEW.*);
       ELSIF ( mod(NEW.id,16) = 12 ) THEN
        INSERT INTO tbl_user_info_12 VALUES (NEW.*);
       ELSE
        INSERT INTO tbl_user_info_14 VALUES (NEW.*);
       END IF;
    ELSIF ( mod(NEW.id,2) = 1 ) THEN
       IF    ( mod(NEW.id,16) = 1 ) THEN
        INSERT INTO tbl_user_info_1 VALUES (NEW.*);
       ELSIF    ( mod(NEW.id,16) = 3 ) THEN
        INSERT INTO tbl_user_info_3 VALUES (NEW.*);
       ELSIF    ( mod(NEW.id,16) = 5 ) THEN
        INSERT INTO tbl_user_info_5 VALUES (NEW.*);
       ELSIF    ( mod(NEW.id,16) = 7 ) THEN
        INSERT INTO tbl_user_info_7 VALUES (NEW.*);
       ELSIF    ( mod(NEW.id,16) = 9 ) THEN
        INSERT INTO tbl_user_info_9 VALUES (NEW.*);
       ELSIF    ( mod(NEW.id,16) = 11 ) THEN
        INSERT INTO tbl_user_info_11 VALUES (NEW.*);
       ELSIF    ( mod(NEW.id,16) = 13 ) THEN
        INSERT INTO tbl_user_info_13 VALUES (NEW.*);
       ELSE
        INSERT INTO tbl_user_info_15 VALUES (NEW.*);
       END IF;
    ELSE
        RAISE EXCEPTION 'ID out of range.  Please fix the tbl_user_info_insert_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;




CREATE OR REPLACE FUNCTION tbl_user_info_delete_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF    ( mod(OLD.id,2) = 0 ) THEN
       IF    ( mod(OLD.id,16) = 0 ) THEN
        DELETE FROM tbl_user_info_0 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 2 ) THEN
        DELETE FROM tbl_user_info_2 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 4 ) THEN
        DELETE FROM tbl_user_info_4 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 6 ) THEN
        DELETE FROM tbl_user_info_6 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 8 ) THEN
        DELETE FROM tbl_user_info_8 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 10 ) THEN
        DELETE FROM tbl_user_info_10 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 12 ) THEN
        DELETE FROM tbl_user_info_12 where id=OLD.id;
       ELSE
        DELETE FROM tbl_user_info_14 where id=OLD.id;
       END IF;
    ELSIF ( mod(OLD.id,2) = 1 ) THEN
       IF    ( mod(OLD.id,16) = 1 ) THEN
        DELETE FROM tbl_user_info_1 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 3 ) THEN
        DELETE FROM tbl_user_info_3 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 5 ) THEN
        DELETE FROM tbl_user_info_5 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 7 ) THEN
        DELETE FROM tbl_user_info_7 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 9 ) THEN
        DELETE FROM tbl_user_info_9 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 11 ) THEN
        DELETE FROM tbl_user_info_11 where id=OLD.id;
       ELSIF ( mod(OLD.id,16) = 13 ) THEN
        DELETE FROM tbl_user_info_13 where id=OLD.id;
       ELSE
        DELETE FROM tbl_user_info_15 where id=OLD.id;
       END IF;
    ELSE
        RAISE EXCEPTION 'ID out of range.  Please fix the tbl_user_info_delete_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

极端测试:
digoal=> insert into tbl_user_info select generate_series(0,16000000,16),'zhou','digoal','sky-mobi',27;
INSERT 0 0
Time: 35437.456 ms
digoal=> select count(*) from tbl_user_info_0;
  count 
---------
 1000001
(1 row)

Time: 105.315 ms
digoal=> truncate table tbl_user_info;
TRUNCATE TABLE
Time: 489.776 ms
digoal=> insert into tbl_user_info select generate_series(15,16000000,16),'zhou','digoal','sky-mobi',27;
INSERT 0 0
Time: 39364.435 ms
digoal=> select count(*) from tbl_user_info_15;
  count 
---------
 1000000
(1 row)

Time: 105.203 ms

最低消耗 = 35437.456 ms
最高消耗 = 39364.435 ms

很明显最高消耗下降了,最低消耗略有上升。符合算法的特性。

当然如果不使用触发器,将会降低10倍左右的开销。

所以一般不推荐在数据库端作分区的判断 , 对CPU的开销是比较大的。

后面两种算法就不再测试了,大家知道一下就好了。