PostgreSQL's statistics target and histogram_bounds

来源:互联网 发布:湖州市公务员网络学堂 编辑:程序博客网 时间:2024/05/16 06:02

PostgreSQL 优化器对于数据库执行计划的产生是好是坏依赖统计信息的准确性。很多关系数据库都有这么一套体系来保障优化器选择的执行计划是最优的,如ORACLE也有统计信息表,mongoDB的话是所有的执行计划都执行一遍,第一个执行完后其他的终止执行,以后就是要执行最快的那个执行计划作为执行计划.有兴趣的朋友可以参看我在MONGODB板块写的博客。

    统计信息的生成实际上是取样的过程,那么哪些参数影响了PostgreSQL的统计信息呢。
    1.  default_statistics_target参数的值 ,决定了数据库默认的取样粒度。
    2.  COLUMN上面设置的 statistics 值 , 如果设置了这个值,将覆盖上面的默认值。决定了该列的取样粒度。

下面来举例说明:
# 查看默认值
postgres=# select * from pg_settings where name = 'default_statistics_target';
-[ RECORD 1 ]------------------------------------------------------------------------------------------------------------
name       | default_statistics_target
setting    | 100
unit       |
category   | Query Tuning / Other Planner Options
short_desc | Sets the default statistics target.
extra_desc | This applies to table columns that have not had a column-specific target set via ALTER TABLE SET STATISTICS.
context    | user
vartype    | integer
source     | default
min_val    | 1
max_val    | 10000
enumvals   |
boot_val   | 100
reset_val  | 100
sourcefile |
sourceline |
#值为100,也就是说取样粒度=100

# 生成一个测试表:
digoal=> /d tbl_user
                                  Table "digoal.tbl_user"
  Column   |         Type          |                       Modifiers                      
-----------+-----------------------+-------------------------------------------------------
 id        | bigint                | not null default nextval('tbl_user_id_seq'::regclass)
 firstname | character varying(32) |
 lastname  | character varying(32) |
 corp      | character varying(32) |
 age       | smallint              |
Indexes:
    "tbl_user_pkey" PRIMARY KEY, btree (id)

# 插入1000条测试记录
digoal=> truncate table tbl_user ;
TRUNCATE TABLE
digoal=> insert into tbl_user (id,firstname,lastname,corp,age) select generate_series(1,1000),'zhou','digoal','sky-mobi',27;
INSERT 0 1000

# 分析表,获得取样信息
digoal=> analyze tbl_user;
ANALYZE

# 查看统计信息
digoal=> select * from pg_stats where tablename='tbl_user';
-[ RECORD 1 ]-----+-----------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
-------------------
schemaname        | digoal
tablename         | tbl_user
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 8
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {1,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,290,30
0,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,63
0,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,96
0,970,980,990,1000}
correlation       | 1
-[ RECORD 2 ]-----+-----------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
-------------------
schemaname        | digoal
tablename         | tbl_user
attname           | lastname
inherited         | f
null_frac         | 0
avg_width         | 7
n_distinct        | 1
most_common_vals  | {digoal}
most_common_freqs | {1}
histogram_bounds  |
correlation       | 1
-[ RECORD 3 ]-----+-----------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
-------------------
schemaname        | digoal
tablename         | tbl_user
attname           | corp
inherited         | f
null_frac         | 0
avg_width         | 9
n_distinct        | 1
most_common_vals  | {sky-mobi}
most_common_freqs | {1}
histogram_bounds  |
correlation       | 1
-[ RECORD 4 ]-----+-----------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
-------------------
schemaname        | digoal
tablename         | tbl_user
attname           | age
inherited         | f
null_frac         | 0
avg_width         | 2
n_distinct        | 1
most_common_vals  | {27}
most_common_freqs | {1}
histogram_bounds  |
correlation       | 1
-[ RECORD 5 ]-----+-----------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
-------------------
schemaname        | digoal
tablename         | tbl_user
attname           | firstname
inherited         | f
null_frac         | 0
avg_width         | 5
n_distinct        | 1
most_common_vals  | {zhou}
most_common_freqs | {1}
histogram_bounds  |
correlation       | 1

# tbl_user 表有几列,这里就应该有几条记录,我们看attname=id的记录的histogram_bounds信息如下,由于这个表的ID列目前的STATISTICS继承了系统的默认值100,因此可以看到这个array值有100个。
histogram_bounds  | {1,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,290,30
0,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,63
0,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,96
0,970,980,990,1000}
# n_distinct        | -1表示该列的值没有重复,是唯一的

# 再看一个LASTNAME列的统计信息
most_common_vals  | {digoal}
most_common_freqs | {1}
最常出现的值为digoal,频率为100%

其他的统计维度就不一一介绍了,可以参考我以前的博客.

# 下面把column=id的statistics改为1000看看是不是对应的histogram_bounds统计信息将变更。
digoal=> alter table tbl_user alter column id set STATISTICS 1000;
ALTER TABLE
digoal=> analyze tbl_user;
ANALYZE
select histogram_bounds from pg_stats where tablename='tbl_user' and attname='id';
# 结果如预期,已经修改为1000个取样间隔.
histogram_bounds  | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,
41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,
85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121
,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154
,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187
,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220
,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253
,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286
,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319
,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352
,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385
,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418
,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451
,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484
,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517
,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550
,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583
,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616
,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649
,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682
,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715
,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748
,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781
,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814
,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847
,848,849,850,851,852,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880
,881,882,883,884,885,886,887,888,889,890,891,892,893,894,895,896,897,898,899,900,901,902,903,904,905,906,907,908,909,910,911,912,913
,914,915,916,917,918,919,920,921,922,923,924,925,926,927,928,929,930,931,932,933,934,935,936,937,938,939,940,941,942,943,944,945,946
,947,948,949,950,951,952,953,954,955,956,957,958,959,960,961,962,963,964,965,966,967,968,969,970,971,972,973,974,975,976,977,978,979
,980,981,982,983,984,985,986,987,988,989,990,991,992,993,994,995,996,997,998,999,1000}

# 最后,结合从pg_class取到的tuples值可以推测出每个范围内的值的个数,如下:
digoal=> select reltuples from pg_class where relname='tbl_user';
-[ RECORD 1 ]---
reltuples | 1000

这个表有1000个值,根据statistics=100时的柱状统计信息
histogram_bounds  | {1,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,290,30
0,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,63
0,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,96
0,970,980,990,1000}
可以得出每个段之间的记录数=1000/100 =10条记录, 如,1-10为10条记录.


如何取出当前列的STATISTICS值:
mpc_db_account=> select oid from pg_class where relname='tbl_test';
  oid 
-------
 18659
(1 row)
mpc_db_account=> select * from pg_attribute where attname='id' and attrelid=18659;
 attrelid | attname | atttypid | attstattarget | attlen | attnum | attndims | attcacheoff | atttypmod | attbyval | attstorage | atta
lign | attnotnull | atthasdef | attisdropped | attislocal | attinhcount | attacl | attoptions
----------+---------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+-----
-----+------------+-----------+--------------+------------+-------------+--------+------------
    18659 | id      |       23 |            -1 |      4 |      1 |        0 |          -1 |        -1 | t        | p          | i  
     | f          | f         | f            | t          |           0 |        |
(1 row)
attstattarget = -1表示默认值
mpc_db_account=> alter table tbl_test alter column id set statistics 10;
ALTER TABLE
mpc_db_account=> select * from pg_attribute where attname='id' and attrelid=18659;
 attrelid | attname | atttypid | attstattarget | attlen | attnum | attndims | attcacheoff | atttypmod | attbyval | attstorage | atta
lign | attnotnull | atthasdef | attisdropped | attislocal | attinhcount | attacl | attoptions
----------+---------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+-----
-----+------------+-----------+--------------+------------+-------------+--------+------------
    18659 | id      |       23 |            10 |      4 |      1 |        0 |          -1 |        -1 | t        | p          | i  
     | f          | f         | f            | t          |           0 |        |
(1 row)
修改后
attstattarget = 10

原创粉丝点击