PostgreSQL 1000亿数据量 正则匹配 速度与激情

来源:互联网 发布:上海恺英网络多少人 编辑:程序博客网 时间:2024/04/30 01:13

摘要: 承接上一篇https://yq.aliyun.com/articles/7444 测试环境为 8台主机(16 core/主机)的PostgreSQL集群,一共240个数据节点,单表数据量1008亿。性能图表 : 如果要获得更快的响应速度,可以通过增加主机和节点数(或者通过增加CPU和节点数),缩

承接上一篇
https://yq.aliyun.com/articles/7444

测试环境为 8台主机(16c/host)的 PostgreSQL集群,一共240个数据节点,测试数据量1008亿。
性能图表 :
_
如果要获得更快的响应速度,可以通过增加主机和节点数(或者通过增加CPU和节点数),缩短recheck的处理时间。

数据生成方法:

#!/bin/bash  #      截取通过random()计算得到的MD5 128bit hex的前48bit, 转成字符串,得到[0-9]和[a-f]组成的12个随机字符串。  psql digoal digoal -c "create table t_regexp_100billion distributed randomly"  for ((i=1;i<=1008;i++))  do    psql digoal digoal -c "copy (select substring(md5(random()::text),1,12) from generate_series(1,100000000)) to stdout" | psql digoal digoal -c "copy t_regexp_100billion from stdin"  done  psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_1 on t_regexp_100billion(info)"  psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_2 on t_regexp_100billion(reverse(info))"  psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_gin on t_regexp_100billion using gin (info gin_trgm_ops)"  

数据概貌

digoal=> select count(*) from t_regexp_100billion ;      count       --------------   100800000000  (1 row)  Time: 228721.386 ms  

表大小

digoal=> \dt+ t_regexp_100billion                              List of relations   Schema |        Name         | Type  | Owner  |  Size   | Description   --------+---------------------+-------+--------+---------+-------------   public | t_regexp_100billion | table | digoal | 4158 GB |   (1 row)  

索引大小

idx_t_regexp_100billion_1     2961 GB  idx_t_regexp_100billion_1     2961 GB  idx_t_regexp_100billion_gin   2300 GB  

测试数据展示:

digoal=> select * from t_regexp_100billion offset 1000000 limit 10;       info       --------------   bca0fb45367e   3051ca8a9a38   fadc91a3a4de   710b9c60417e   279dd9832cc3   f4743fe2e83b   9ce9e42d4039   65e64742fd3f   db3d0e0edc52   7cfb00bb38ec  (10 rows)  

重复度取样, 计算random() md5得到的字符串,可以确保非常低的重复度:

digoal=> select count(distinct info) from (select * from t_regexp_100billion offset 1299422811 limit 1000000) t;   count    --------   999750  (1 row)  

统计信息展示:

digoal=> alter table t_regexp_100billion alter column info set statistics 10000;  ALTER TABLE  digoal=> analyze t_regexp_100billion ;  ANALYZE  schemaname             | public  tablename              | t_regexp_100billion  attname                | info  inherited              | f  null_frac              | 0  avg_width              | 13  n_distinct             | -0.836834             # 采样统计信息,约83.6834%的唯一值  most_common_vals       | (pg_catalog.text){7f68d12d2205,00083380706d,00154b6d79e8,...    most_common_freqs      | {1e-06,6.66667e-07,6.66667e-07,6.66667e-07,.....        单个最高频值的占比为1e-06, 也就是说1000亿记录中出现10万次。  histogram_bounds       | (pg_catalog.text){0000008123b7,00066c71c9bb,000d672de234,...  correlation            | 0.000237291  most_common_elems      |   most_common_elem_freqs |   elem_count_histogram   |   

7f68d12d2205 实际的出现次数,可能是采样时7f68d12d2205被采样到的块较多,所以数据库认为它的占比较多:

digoal=> select count(*) from t_regexp_100billion where info='7f68d12d2205';  -[ RECORD 1 ]  count | 54  digoal=> select ctid from t_regexp_100billion where info='7f68d12d2205' order by 1;       ctid        ---------------   (15343,114)   (62134,39)   (96808,112)   (116492,176)   (194615,143)   (328074,116)   (364037,115)   (375240,158)   (376187,152)   (602144,81)   (664026,6)   (689501,136)   (695345,130)   (697374,126)   (714719,148)   (743169,20)   (802326,139)   (833830,41)   (839417,185)   (892417,78)   (892493,149)   (907979,52)   (967078,163)   (990313,159)   (1007998,27)   (1106961,57)   (1142731,165)   (1148427,67)   (1156654,156)   (1205854,137)   (1243429,68)   (1277287,165)   (1328836,98)   (1331727,150)   (1337534,3)   (1360947,104)   (1438970,97)   (1476941,22)   (1482022,82)   (1486307,69)   (1548445,155)   (1557209,82)   (1564980,158)   (1646685,76)   (1663018,99)   (1678604,77)   (1755845,177)   (1981937,153)   (1984723,98)   (2071955,59)   (2093147,149)   (2199794,102)   (2204957,44)   (2234820,142)  (54 rows)  

性能测试:
前缀匹配查询速度:

digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ '^80ebcdd47';       ctid      | tableoid |     info       ---------------+----------+--------------   (124741,60)   |    16677 | 80ebcdd47006   (896121,64)   |    16659 | 80ebcdd47006   (1124495,97)  |    16659 | 80ebcdd47006   (1126474,141) |    16659 | 80ebcdd47006   (1059471,62)  |    16659 | 80ebcdd47006   (1296562,115) |    16659 | 80ebcdd47006   (1190941,122) |    16659 | 80ebcdd47006   (680853,129)  |    16659 | 80ebcdd47006   (1010667,15)  |    16659 | 80ebcdd47006   (1386348,25)  |    16659 | 80ebcdd47006   (1522827,90)  |    16659 | 80ebcdd47006   (2204071,129) |    16659 | 80ebcdd47006   (1570431,114) |    16659 | 80ebcdd47006   (888185,38)   |    16659 | 80ebcdd47006   (605886,160)  |    16659 | 80ebcdd47006   (1306061,123) |    16659 | 80ebcdd47006   (757157,47)   |    16659 | 80ebcdd47006   (1166290,83)  |    16659 | 80ebcdd47006   (419730,1)    |    16659 | 80ebcdd47006   (1833853,131) |    16659 | 80ebcdd47006   (964866,120)  |    16659 | 80ebcdd47006   (904961,175)  |    16659 | 80ebcdd47006   (984373,32)   |    16659 | 80ebcdd47006   (891018,145)  |    16659 | 80ebcdd47006   (1520483,121) |    16659 | 80ebcdd47006   (571001,124)  |    16659 | 80ebcdd47006   (802093,55)   |    16659 | 80ebcdd47006   (6831,172)    |    16659 | 80ebcdd47006   (1169137,84)  |    16659 | 80ebcdd47006   (77398,164)   |    16659 | 80ebcdd47006   (24132,98)    |    16659 | 80ebcdd47006   (564322,152)  |    16659 | 80ebcdd47006   (357087,172)  |    16659 | 80ebcdd47006   (1823628,60)  |    16659 | 80ebcdd47006   (2153609,52)  |    16659 | 80ebcdd47006   (816401,140)  |    16659 | 80ebcdd47006   (542383,53)   |    16662 | 80ebcdd47006   (1340971,64)  |    16662 | 80ebcdd47006   (1239166,108) |    16662 | 80ebcdd47006   (2033648,39)  |    16662 | 80ebcdd47006   (1890808,93)  |    16662 | 80ebcdd47006   (1213124,4)   |    16662 | 80ebcdd47006   (1025184,106) |    16662 | 80ebcdd47006   (620238,131)  |    16662 | 80ebcdd47006   (583064,74)   |    16662 | 80ebcdd47006   (1454680,42)  |    16671 | 80ebcdd47006   (417385,74)   |    16671 | 80ebcdd47006   (323669,61)   |    16671 | 80ebcdd47006   (1759181,138) |    16671 | 80ebcdd47006   (2112157,146) |    16671 | 80ebcdd47006   (431326,92)   |    16671 | 80ebcdd47006   (2097356,110) |    16671 | 80ebcdd47006  (52 rows)  Time: 3226.393 ms  digoal=> explain (analyze,verbose,buffers,costs,timing) select ctid,tableoid,info from t_regexp_100billion where info ~ '^80ebcdd47';   Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=3085.502..3112.273 rows=52 loops=1)     Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info     Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,   h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9     Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (info ~ '^80ebcdd47'::text)   Planning time: 0.061 ms   Execution time: 3112.296 ms  (6 rows)  Time: 3139.928 ms  

后缀匹配查询速度

digoal=> select ctid,tableoid,info from t_regexp_100billion where reverse(info) ~ '^f42d12089b';       ctid      | tableoid |     info       ---------------+----------+--------------   (124741,26)   |    16677 | f3b98021d24f   (1696888,151) |    16659 | f3b98021d24f   (1278911,101) |    16659 | f3b98021d24f   (1427480,157) |    16659 | f3b98021d24f   (449192,30)   |    16659 | f3b98021d24f   (1833887,81)  |    16659 | f3b98021d24f   (229525,72)   |    16659 | f3b98021d24f   (1353789,17)  |    16659 | f3b98021d24f   (1875911,148) |    16659 | f3b98021d24f   (1847078,35)  |    16659 | f3b98021d24f   (316780,156)  |    16659 | f3b98021d24f   (1265453,120) |    16659 | f3b98021d24f   (100075,60)   |    16659 | f3b98021d24f   (1924176,2)   |    16659 | f3b98021d24f   (279583,2)    |    16659 | f3b98021d24f   (1631226,23)  |    16659 | f3b98021d24f   (1906666,50)  |    16659 | f3b98021d24f   (1640803,116) |    16659 | f3b98021d24f   (629651,46)   |    16659 | f3b98021d24f   (134982,13)   |    16659 | f3b98021d24f   (380660,123)  |    16659 | f3b98021d24f   (2158193,31)  |    16659 | f3b98021d24f   (324901,64)   |    16659 | f3b98021d24f   (1243973,160) |    16659 | f3b98021d24f   (540958,139)  |    16659 | f3b98021d24f   (441475,99)   |    16659 | f3b98021d24f   (1207114,121) |    16659 | f3b98021d24f   (574598,21)   |    16659 | f3b98021d24f   (1253283,185) |    16659 | f3b98021d24f   (1396717,142) |    16659 | f3b98021d24f   (149738,9)    |    16659 | f3b98021d24f   (764749,26)   |    16659 | f3b98021d24f   (1211899,5)   |    16659 | f3b98021d24f   (1626746,65)  |    16659 | f3b98021d24f   (1342895,124) |    16659 | f3b98021d24f   (733794,136)  |    16659 | f3b98021d24f   (417796,2)    |    16659 | f3b98021d24f   (555520,163)  |    16659 | f3b98021d24f   (232038,105)  |    16659 | f3b98021d24f   (355107,127)  |    16659 | f3b98021d24f   (352143,175)  |    16662 | f3b98021d24f   (1856293,69)  |    16662 | f3b98021d24f   (1405106,105) |    16662 | f3b98021d24f   (47689,79)    |    16662 | f3b98021d24f   (679310,7)    |    16671 | f3b98021d24f   (1076234,164) |    16671 | f3b98021d24f  (46 rows)  Time: 3140.835 ms  digoal=> explain (verbose,costs,timing,buffers,analyze) select ctid,tableoid,info from t_regexp_100billion where reverse(info) ~ '^f42d12089b';   Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=3085.738..3112.216 rows=46 loops=1)     Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info     Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,   h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9     Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (reverse(info) ~ '^f42d12089b'::text)   Planning time: 0.063 ms   Execution time: 3112.236 ms  (6 rows)  Time: 3139.890 ms  

前后模糊查询速度:

digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ 'e7add04871';       ctid      | tableoid |     info       ---------------+----------+--------------   (124741,45)   |    16677 | be7add048713   (49315,69)    |    16659 | be7add048713   (1770876,21)  |    16659 | be7add048713   (199079,143)  |    16659 | be7add048713   (151110,141)  |    16659 | be7add048713   (1597384,137) |    16659 | be7add048713   (1693453,25)  |    16659 | be7add048713   (101576,132)  |    16659 | be7add048713   (1110249,50)  |    16659 | be7add048713   (792326,68)   |    16659 | be7add048713   (1676705,68)  |    16659 | be7add048713   (1269148,101) |    16659 | be7add048713   (1027442,113) |    16659 | be7add048713   (1078144,100) |    16659 | be7add048713   (584038,141)  |    16659 | be7add048713   (1245454,80)  |    16659 | be7add048713   (1551184,102) |    16659 | be7add048713   (1326266,17)  |    16659 | be7add048713   (432025,101)  |    16659 | be7add048713   (300650,152)  |    16659 | be7add048713   (1322140,15)  |    16662 | be7add048713   (1424768,25)  |    16662 | be7add048713   (391150,31)   |    16662 | be7add048713   (254014,170)  |    16662 | be7add048713   (1758616,85)  |    16662 | be7add048713   (1720990,105) |    16662 | be7add048713   (345908,68)   |    16662 | be7add048713   (1592333,102) |    16662 | be7add048713   (1843902,130) |    16671 | be7add048713   (898136,121)  |    16671 | be7add048713   (1469985,138) |    16671 | be7add048713   (1287666,51)  |    16671 | be7add048713  (32 rows)  Time: 4970.662 ms  digoal=> explain (analyze,verbose,timing,costs,buffers) select ctid,tableoid,info from t_regexp_100billion where info ~ 'e7add04871';   Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=4712.916..4897.512 rows=32 loops=1)     Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info     Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,   h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9     Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (info ~ 'e7add04871'::text)   Planning time: 0.063 ms   Execution time: 4897.532 ms  (6 rows)  Time: 4925.741 ms  

正则匹配查询速度

digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ '.3918.209f';       ctid      | tableoid |     info       ---------------+----------+--------------   (124741,29)   |    16677 | 0b39188209f2   (1443707,79)  |    16659 | 0b39188209f2   (596962,50)   |    16659 | 0b39188209f2   (1763787,145) |    16659 | 0b39188209f2   (2192691,24)  |    16659 | 0b39188209f2   (425121,26)   |    16659 | 0b39188209f2   (2157735,117) |    16659 | 0b39188209f2   (826685,32)   |    16659 | 0b39188209f2   (507417,51)   |    16659 | 0b39188209f2   (1168854,22)  |    16659 | 0b39188209f2   (178112,96)   |    16659 | 0b39188209f2   (1609343,84)  |    16659 | 0b39188209f2   (1883190,161) |    16659 | 0b39188209f2   (1879921,82)  |    16659 | 0b39188209f2   (187722,148)  |    16659 | 0b39188209f2   (411680,31)   |    16659 | 0b39188209f2   (1103474,74)  |    16659 | 0b39188209f2   (1756318,139) |    16659 | 0b39188209f2   (760475,112)  |    16659 | 0b39188209f2   (656098,47)   |    16659 | 0b39188209f2   (2015224,31)  |    16659 | 0b39188209f2   (401158,64)   |    16659 | 0b39188209f2   (1001315,155) |    16659 | 0b39188209f2   (527643,24)   |    16659 | 0b39188209f2   (51198,95)    |    16659 | 0b39188209f2   (1709591,26)  |    16659 | 0b39188209f2   (1235618,22)  |    16659 | 0b39188209f2   (542813,107)  |    16659 | 0b39188209f2   (331468,156)  |    16659 | 0b39188209f2   (940954,68)   |    16662 | 0b39188209f2   (1295686,169) |    16662 | 0b39188209f2   (825955,109)  |    16668 | 0b39188209f2   (2025210,165) |    16671 | 0b39188209f2   (1639115,139) |    16671 | 0b39188209f2   (422678,79)   |    16671 | 0b39188209f2   (379949,175)  |    16671 | 0b39188209f2   (455206,96)   |    16671 | 0b39188209f2   (1745081,184) |    16671 | 0b39188209f2  (38 rows)  Time: 3580.536 ms  digoal=> explain (verbose,analyze,timing,costs,buffers) select ctid,tableoid,info from t_regexp_100billion where info ~ '.3918.209f';   Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=3407.156..3621.601 rows=38 loops=1)     Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info     Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,   h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9     Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (info ~ '.3918.209f'::text)   Planning time: 0.072 ms   Execution time: 3621.626 ms  (6 rows)  Time: 3650.045 ms  digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ 'ab2..d[1|f]3c8';       ctid      | tableoid |     info       ---------------+----------+--------------   (899065,160)  |    16659 | 4eab207df3c8   (2100060,157) |    16659 | a2ab2fbdf3c8   (162213,12)   |    16659 | a2ab2fbdf3c8   (637030,50)   |    16659 | 4eab207df3c8   (1325830,35)  |    16659 | a2ab2fbdf3c8   (197454,129)  |    16659 | 4eab207df3c8   (2000258,158) |    16659 | a2ab2fbdf3c8   (765698,19)   |    16659 | a2ab2fbdf3c8   (935743,59)   |    16659 | 4eab207df3c8   (2203339,96)  |    16659 | a2ab2fbdf3c8   (701234,118)  |    16659 | a2ab2fbdf3c8   (971717,57)   |    16659 | a2ab2fbdf3c8   (1164498,54)  |    16659 | 4eab207df3c8   (393227,147)  |    16659 | 4eab207df3c8   (1439445,94)  |    16659 | a2ab2fbdf3c8   (1549135,146) |    16659 | 4eab207df3c8   (1551991,36)  |    16659 | 4eab207df3c8   (2206488,3)   |    16659 | a2ab2fbdf3c8   (481614,118)  |    16659 | 4eab207df3c8   (1809085,7)   |    16659 | a2ab2fbdf3c8   (173214,139)  |    16659 | 4eab207df3c8   (1021816,28)  |    16659 | 4eab207df3c8   (829846,43)   |    16659 | a2ab2fbdf3c8   (1899020,79)  |    16659 | 4eab207df3c8   (6241,163)    |    16659 | 4eab207df3c8   (1205920,5)   |    16659 | a2ab2fbdf3c8   (412014,52)   |    16659 | 4eab207df3c8   (1122051,14)  |    16659 | 4eab207df3c8   (284493,87)   |    16659 | 4eab207df3c8   (374322,83)   |    16659 | a2ab2fbdf3c8   (189124,19)   |    16659 | 4eab207df3c8   (747428,175)  |    16659 | a2ab2fbdf3c8   (795035,152)  |    16659 | a2ab2fbdf3c8   (1949396,25)  |    16659 | a2ab2fbdf3c8   (154445,167)  |    16659 | a2ab2fbdf3c8   (859513,82)   |    16659 | a2ab2fbdf3c8   (31337,41)    |    16659 | a2ab2fbdf3c8   (1393343,136) |    16659 | 4eab207df3c8   (63555,82)    |    16659 | a2ab2fbdf3c8   (608980,177)  |    16659 | 4eab207df3c8   (250484,31)   |    16659 | a2ab2fbdf3c8   (1696502,87)  |    16659 | 4eab207df3c8   (2021326,68)  |    16659 | a2ab2fbdf3c8   (397967,70)   |    16659 | a2ab2fbdf3c8   (2083071,101) |    16659 | a2ab2fbdf3c8   (98554,23)    |    16659 | a2ab2fbdf3c8   (1247891,182) |    16659 | 4eab207df3c8   (1533143,51)  |    16659 | a2ab2fbdf3c8   (1280652,28)  |    16659 | a2ab2fbdf3c8   (1337921,119) |    16659 | a2ab2fbdf3c8   (446914,180)  |    16659 | a2ab2fbdf3c8   (1810263,161) |    16659 | a2ab2fbdf3c8   (350272,51)   |    16659 | 4eab207df3c8   (909148,37)   |    16659 | 4eab207df3c8   (197153,108)  |    16659 | a2ab2fbdf3c8   (207423,96)   |    16659 | 4eab207df3c8   (1097934,22)  |    16659 | a2ab2fbdf3c8   (12605,49)    |    16659 | 4eab207df3c8   (65244,28)    |    16659 | 4eab207df3c8   (10274,30)    |    16659 | a2ab2fbdf3c8   (1547771,91)  |    16659 | 4eab207df3c8   (55044,64)    |    16659 | 4eab207df3c8   (1286116,136) |    16659 | 4eab207df3c8   (797831,10)   |    16659 | a2ab2fbdf3c8   (450949,98)   |    16659 | a2ab2fbdf3c8   (563308,46)   |    16659 | 4eab207df3c8   (1815443,179) |    16659 | a2ab2fbdf3c8   (279403,105)  |    16659 | 4eab207df3c8   (1953284,11)  |    16659 | 4eab207df3c8   (2068896,15)  |    16659 | 4eab207df3c8   (1230212,18)  |    16659 | 4eab207df3c8   (1513277,18)  |    16659 | 4eab207df3c8   (1675223,30)  |    16659 | a2ab2fbdf3c8   (966609,80)   |    16662 | a2ab2fbdf3c8   (118085,180)  |    16662 | a2ab2fbdf3c8   (1557051,116) |    16662 | a2ab2fbdf3c8   (1848877,62)  |    16662 | 4eab207df3c8   (2224775,3)   |    16662 | 4eab207df3c8   (1196571,72)  |    16662 | 4eab207df3c8   (1799448,154) |    16662 | 4eab207df3c8   (2246230,68)  |    16662 | a2ab2fbdf3c8   (984529,120)  |    16662 | a2ab2fbdf3c8   (1361482,97)  |    16662 | 4eab207df3c8   (1935512,51)  |    16662 | a2ab2fbdf3c8   (816119,95)   |    16662 | a2ab2fbdf3c8   (770381,45)   |    16662 | 4eab207df3c8   (1943960,146) |    16662 | a2ab2fbdf3c8   (346006,160)  |    16671 | a2ab2fbdf3c8   (1873262,96)  |    16671 | 4eab207df3c8   (1219041,118) |    16671 | a2ab2fbdf3c8   (418076,24)   |    16671 | a2ab2fbdf3c8   (724463,28)   |    16671 | a2ab2fbdf3c8   (1471492,164) |    16671 | a2ab2fbdf3c8   (975490,122)  |    16671 | a2ab2fbdf3c8   (1885629,34)  |    16671 | 4eab207df3c8  (95 rows)  Time: 4718.459 ms  digoal=> explain (verbose,timing,costs,buffers,analyze) select ctid,tableoid,info from t_regexp_100billion where info ~ 'ab2..d[1|f]3c8';   Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=4386.010..4648.614 rows=95 loops=1)     Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info     Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,   h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9     Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (info ~ 'ab2..d[1|f]3c8'::text)   Planning time: 0.058 ms   Execution time: 4648.638 ms  (6 rows)  Time: 4676.919 ms  
本文为云栖社区原创内容,未经允许不得转载,如需转载请发送邮件至yqeditor@list.alibaba-inc.com
0 0
原创粉丝点击