Greenplum中多级分区表详解--3层分区

来源：互联网发布：菲律宾做网络推广经历编辑：程序博客网时间：2024/06/03 17:39

本文主要说明Greenplum中多级分区(此处为3层分区)的相关操作，包括创建分区表、数据的加载和查询、加载数据错误的方式、交换分区、添加和重新划分分区。

分区的层次并不是越多越好，应根据需要而选择适当的分区策略，此文主要说明相关的操作。

1，创建分区表

以下创建的一个3级分区表是在官方示例的基础上修改而来的。

testDB=# create table sales(id int, year int,season int, month int,region text)--顶级表名和字段属性

testDB-# with (appendonly=true, compresstype=quicklz,compresslevel=1, orientation=column)--可选的定义，表属性，压缩方式，压缩率，列存

testDB-# distributed by (id) --分布键

testDB-# partition by range(year) -- 主范围分区的字段为年

testDB-# subpartition by range(season) --1级范围子分区的字段为季节

testDB-# subpartition template( --1级范围子分区的策略

testDB(# start (1) end (5) every (1)) -- START值总是被包含而END值总是被排除

testDB-# subpartition by list(region) --2级列表子分区的字段为地区

testDB-# subpartition template( --2级列表子分区的策略

testDB(# subpartition usa values('usa'),

testDB(# subpartition europe values('europe'),

testDB(# subpartition asia values('asia'),

testDB(# default subpartition other_regions) --default默认分区

testDB-# (partition year2014 start (2014) inclusive, -- 主分区的分区策略必须放在最后，且分区名称必须以字母开头，此处就不能直接用2014.

testDB(# partition year2015 start(2015) inclusive end (2016)exclusive,

testDB(# default partition outlying_years);

NOTICE: CREATE TABLE will create partition"sales_1_prt_outlying_years" for table "sales"

...... --大量的建表信息

NOTICE: CREATE TABLE will create partition"sales_1_prt_year2015_2_prt_4_3_prt_other_regions" for table"sales_1_prt_year2015_2_prt_4"

CREATE TABLE

testDB=# \d+ sales

Append-OnlyColumnar Table "test.sales"

--------+---------+-----------+----------+------------------+-------------------+------------+-------

id |integer | | plain | quicklz | 1 | 32768 |

Checksum: t

Childtables: sales_1_prt_outlying_years,

sales_1_prt_year2014,

sales_1_prt_year2015

Has OIDs: no

Options:appendonly=true, compresstype=quicklz, compresslevel=1, orientation=column

Distributedby: (id)

testDB=# \d+ sales_1_prt_year2014 --可逐级查看相关信息

Append-Only ColumnarTable "test.sales_1_prt_year2014"

--------+---------+-----------+----------+------------------+-------------------+------------+-------------

id |integer | | plain | quicklz | 1 | 32768 |

Checksum: t

Checkconstraints:

"sales_1_prt_year2014_check"CHECK (year >= 2014 AND year < 2015)

Inherits:sales

Childtables: sales_1_prt_year2014_2_prt_1,

sales_1_prt_year2014_2_prt_2,

sales_1_prt_year2014_2_prt_3,

sales_1_prt_year2014_2_prt_4

Has OIDs: no

Options:appendonly=true, compresstype=quicklz, compresslevel=1, orientation=column

Distributedby: (id)

testDB=# select partitiontablename, partitionname, partitionlevel,partitionrank from pg_partitions where tablename='sales'; --查看表的分区,以下有3个主分区表，3*4=12个1级子分区表，12*4=48个2级子分区表，另加上顶级表'sales',此次共生成了64个表。

partitiontablename | partitionname | partitionlevel | partitionrank

--------------------------------------------------------+----------------+----------------+---------------

sales_1_prt_outlying_years | outlying_years| 0 |

sales_1_prt_year2014 |year2014 | 0 | 1

sales_1_prt_year2015 |year2015 | 0 | 2

sales_1_prt_outlying_years_2_prt_1 | | 1 | 1

sales_1_prt_outlying_years_2_prt_2 | | 1 | 2

sales_1_prt_outlying_years_2_prt_3 | | 1 | 3

sales_1_prt_outlying_years_2_prt_4 | | 1 | 4

sales_1_prt_year2014_2_prt_1 | | 1 | 1

sales_1_prt_year2014_2_prt_2 | | 1 | 2

sales_1_prt_year2014_2_prt_3 | | 1 | 3

sales_1_prt_year2014_2_prt_4 | | 1 | 4

sales_1_prt_year2015_2_prt_1 | | 1 | 1

sales_1_prt_year2015_2_prt_2 | | 1 | 2

sales_1_prt_year2015_2_prt_3 | | 1 | 3

sales_1_prt_year2015_2_prt_4 | | 1 | 4

sales_1_prt_outlying_years_2_prt_1_3_prt_usa | usa | 2 |

sales_1_prt_outlying_years_2_prt_1_3_prt_europe | europe | 2 |

sales_1_prt_outlying_years_2_prt_1_3_prt_asia | asia | 2 |

sales_1_prt_outlying_years_2_prt_1_3_prt_other_regions| other_regions | 2 |

sales_1_prt_outlying_years_2_prt_2_3_prt_usa | usa | 2 |

sales_1_prt_outlying_years_2_prt_2_3_prt_europe | europe | 2 |

sales_1_prt_outlying_years_2_prt_2_3_prt_asia | asia | 2 |

sales_1_prt_outlying_years_2_prt_2_3_prt_other_regions| other_regions | 2 |

sales_1_prt_outlying_years_2_prt_3_3_prt_usa | usa | 2 |

sales_1_prt_outlying_years_2_prt_3_3_prt_europe | europe | 2 |

sales_1_prt_outlying_years_2_prt_3_3_prt_asia | asia | 2 |

sales_1_prt_outlying_years_2_prt_3_3_prt_other_regions| other_regions | 2 |

sales_1_prt_outlying_years_2_prt_4_3_prt_usa | usa | 2 |

sales_1_prt_outlying_years_2_prt_4_3_prt_europe | europe | 2 |

sales_1_prt_outlying_years_2_prt_4_3_prt_asia | asia | 2 |

sales_1_prt_outlying_years_2_prt_4_3_prt_other_regions| other_regions | 2 |

sales_1_prt_year2014_2_prt_1_3_prt_usa | usa | 2 |

sales_1_prt_year2014_2_prt_1_3_prt_europe | europe | 2 |

sales_1_prt_year2014_2_prt_1_3_prt_asia | asia | 2 |

sales_1_prt_year2014_2_prt_1_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2014_2_prt_2_3_prt_usa | usa | 2 |

sales_1_prt_year2014_2_prt_2_3_prt_europe | europe | 2 |

sales_1_prt_year2014_2_prt_2_3_prt_asia | asia | 2 |

sales_1_prt_year2014_2_prt_2_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2014_2_prt_3_3_prt_usa | usa | 2 |

sales_1_prt_year2014_2_prt_3_3_prt_europe | europe | 2 |

sales_1_prt_year2014_2_prt_3_3_prt_asia | asia | 2 |

sales_1_prt_year2014_2_prt_3_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2014_2_prt_4_3_prt_usa | usa | 2 |

sales_1_prt_year2014_2_prt_4_3_prt_europe | europe | 2 |

sales_1_prt_year2014_2_prt_4_3_prt_asia | asia | 2 |

sales_1_prt_year2014_2_prt_4_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2015_2_prt_1_3_prt_usa | usa | 2 |

sales_1_prt_year2015_2_prt_1_3_prt_europe | europe | 2 |

sales_1_prt_year2015_2_prt_1_3_prt_asia | asia | 2 |

sales_1_prt_year2015_2_prt_1_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2015_2_prt_2_3_prt_usa | usa | 2 |

sales_1_prt_year2015_2_prt_2_3_prt_europe | europe | 2 |

sales_1_prt_year2015_2_prt_2_3_prt_asia | asia | 2 |

sales_1_prt_year2015_2_prt_2_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2015_2_prt_3_3_prt_usa | usa | 2 |

sales_1_prt_year2015_2_prt_3_3_prt_europe | europe | 2 |

sales_1_prt_year2015_2_prt_3_3_prt_asia | asia | 2 |

sales_1_prt_year2015_2_prt_3_3_prt_other_regions | other_regions | 2 |

sales_1_prt_year2015_2_prt_4_3_prt_usa | usa | 2 |

sales_1_prt_year2015_2_prt_4_3_prt_europe | europe | 2 |

sales_1_prt_year2015_2_prt_4_3_prt_asia | asia | 2 |

sales_1_prt_year2015_2_prt_4_3_prt_other_regions | other_regions | 2 |

(63 rows)

2，多级分区表的数据的载入和查询

多级分区表的数据都是放在最底层的表中的，此处即为2级子分区表。

当向分区表中载入数据时，可指定载入到2级子分区表，或直接载入到顶级表，其主分区表或1级子分区表，让其按分区字段自动分配到最底层的2级子分区表。

--尝试将下表data_sales数据载入到'sales'中

1|2014|1|3|usa

2|2015|2|6|usa

3|2016|3|9|europe

4|2014|4|12|asia

5|2015|1|2|africa

testDB=# copy sales from'/home/gpadmin/test/data_sales' with delimiter '|';

COPY 5 --数据自动按分区字段分到各个子分区

testDB=# select * from sales;

id | year | season| month | region

----+------+--------+-------+--------

4 | 2014 | 4 | 12 | asia

2 | 2015 | 2 | 6 | usa

3 | 2016 | 3 | 9 | europe

1 | 2014 | 1 | 3 | usa

5 | 2015 | 1 | 2 | africa

(5 rows)

testDB=# select * from sales whereyear=2014 and season=1 and region='usa';

id | year | season| month | region

----+------+--------+-------+--------

1 | 2014 | 1 | 3 | usa

(1 row)

testDB=# explain select * fromsales where year=2014 and season=1 and region='usa';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------

Gather Motion2:1 (slice1; segments: 2) (cost=0.00..9.35 rows=4 width=42)

-> Append (cost=0.00..9.35 rows=2 width=42)

-> Append-only Columnar Scan onsales_1_prt_outlying_years_2_prt_1_3_prt_usa sales (cost=0.00..0.00 rows=1 width=48)

Filter: year = 2014 AND season = 1 AND region = 'usa'::text

-> Append-only Columnar Scan on sales_1_prt_outlying_years_2_prt_1_3_prt_other_regionssales (cost=0.00..0.00 rows=1 width=48)

Filter: year = 2014 AND season = 1 AND region = 'usa'::text

-> Append-only Columnar Scan onsales_1_prt_year2014_2_prt_1_3_prt_other_regions sales (cost=0.00..8.33 rows=1 width=48)

Filter: year = 2014 AND season = 1 AND region = 'usa'::text

-> Seq Scan onsales_1_prt_year2014_2_prt_1_3_prt_usa sales (cost=0.00..1.02 rows=1 width=21)

Filter: year = 2014 AND season = 1 AND region = 'usa'::text

(10 rows)

--以上可以看出该查询扫描了4个2级子分区，除了sales_1_prt_year2014_2_prt_1_3_prt_usa该数据所在分区外，还扫描了组合出来的3个默认子分区，可见在有限制条件查询时，默认分区越多，查询消耗越多。

3，载入数据的错误方式

testDB=# copysales_1_prt_year2014_2_prt_1 from '/home/gpadmin/test/data_sales' withdelimiter '|';

ERROR: new row forrelation "sales_1_prt_year2014_2_prt_1" violates check constraint"sales_1_prt_year2014_2_prt_1_check" (seg0 sdw1:40000 pid=5976)

CONTEXT: COPYsales_1_prt_year2014_2_prt_1, line 3: "3|2016|3|9|europe"

--当要载入的数据不满足所要插入的分区的约束时，插入会失败。

testDB=# insert into sales_1_prt_year2014_2_prt_1values(1,2014,1,1,'usa');

ERROR: Directlymodifying a part of a partitioned table is disallowed

HINT: Modify thepartitioned table instead

--与copy不同的是insert只能插入到顶级表或最底层表，而不能插入到中间层表。

4，交换分区

testDB=# create table tmp1 (likesales); --建个中间表

NOTICE: Tabledoesn't have 'distributed by' clause, defaulting to distribution columns fromLIKE table

CREATE TABLE

准备数据data_sales2，

6|2014|1|1|usa

testDB=# copy tmp1 from'/home/gpadmin/test/data_sales2' with delimiter '|';

testDB=# alter table sales alterpartition for (rank(1)) alter partition for (rank(1)) exchange partition for('usa') with table tmp1;

NOTICE: exchangedpartition "usa" of partition for rank 1 of partition"year2014" of relation "sales" with relation "tmp1"

ALTER TABLE

--交换分区只能交换最底层的表，不管上面有多少层都用alter partition for (rank(partitionrank))，底层用exchange partition for (values)。

--alter partition for (rank(partitionrank))中的rank(partitionrank)改为values也是可以的.

testDB=# select * from sales;

id| year | season | month | region

----+------+--------+-------+--------

3 | 2016 | 3 | 9 | europe

5 | 2015 | 1 | 2 | africa

6 | 2014 | 1 | 1 | usa

4 | 2014 | 4 | 12 | asia

2 | 2015 | 2 | 6 | usa

(5 rows)

testDB=# select * from tmp1;

id| year | season | month | region

----+------+--------+-------+--------

1 | 2014 | 1 | 3 | usa

(1 row)

--由上可见交换分区将tmp1中id为6的数据加入到了sales表中，并换出了id为1的数据。

--交换所用的中间表中的数据必须满足交换分区的约束条件，从下面交换前后中间表的变化可看出，交换赋予了中间表一定的表属性和约束。

交换前：

testDB=# \d tmp1

Table"test.tmp1"

Column | Type | Modifiers

--------+---------+-----------

id | integer |

year | integer |

season | integer |

month | integer |

region | text |

Distributed by: (id)

交换后：

testDB=# \d tmp1;

Append-Only Columnar Table "test.tmp1"

Column | Type | Modifiers

--------+---------+-----------

id | integer |

year | integer |

season | integer |

month | integer |

region | text |

Checksum: t

Check constraints:

"sales_1_prt_year2014_2_prt_1_3_prt_usa_check" CHECK (region ='usa'::text)

"sales_1_prt_year2014_2_prt_1_check" CHECK (season >= 1 ANDseason < 2)

"sales_1_prt_year2014_check" CHECK (year >= 2014 AND year< 2015)

Distributed by: (id)

https://support.pivotal.io/hc/en-us/articles/203330727-HowTo-Exchange-Partitions-and-Subpartitions-with-example --该示例详细说明了有subpartition时交换分区的方法

5，添加和重新划分分区

默认分区是没有partitionrank的，在主分区出现默认分区时，会导致添加和重划分区不可行。

主分区是可以添加新分区的，但只有在单层分区时，拆分分区才有效。

对于子分区只有重新划分分区，因为对子分区的改变不会影响现有的子分区。

现删除主分区的默认分区，有需要的话，可以之后再添加默认分区。

testDB=# alter table sales dropdefault partition;

NOTICE: droppedpartition "outlying_years" for relation "sales" and itschildren

ALTER TABLE

--删掉了21张子表，现在sales中共有43张表。

主分区添加新分区：

testDB=# alter table sales addpartition year2016

start (2016) inclusive

end (2017) exclusive;

--共新建了21张子表，现在sales中共有64张表。

1级子分区重划分区，加1个分区：

testDB=# alter table sales setsubpartition template

testDB-# (start (1) end (6) every(1));

NOTICE: replacinglevel 1 subpartition template specification for relation "sales"

ALTER TABLE

--对1级子分区及以下的分区重划分区并不会对原有的表产生影响，它只会影响以后添加的分区。

testDB=# alter table sales addpartition year2017

start (2017) inclusive

end (2018) exclusive;

--共新建了26张子表，现在sales中共有90张表。

2级子分区重划分区，加1 个分区：

testDB=# alter table sales

testDB-# alter partition for(rank(1)) --这里的rank值，只要主分区partitionrank中有的都可以

testDB-# set subpartition template

testDB-# (subpartition usavalues('usa'),

testDB(# subpartition europevalues('europe'),

testDB(# subpartition asiavalues('asia'),

testDB(# subpartition africa values('africa'),

testDB(# default subpartitionother_regions)；

NOTICE: replacing level2 subpartition template specification for relation "sales"

ALTER TABLE

testDB=# alter table sales add partition year2018

start (2018) inclusive

end (2019) exclusive;

--共新建31张子表，现在sales中共有121张表。

其它关于多级分区的重命名、删除、清空的操作都比较简单，这里不再一一阐述。

--转载请注明出处blog.csdn.net/aabc012

0 0