Why "INSERT ... ON CONFLICT DO NOTHING/UPDATE" Good for performance
来源:互联网 发布:pkpm加密狗 淘宝 编辑:程序博客网 时间:2024/06/05 03:52
Postgres2015全国用户大会将于11月20至21日在北京丽亭华苑酒店召开。本次大会嘉宾阵容强大,国内顶级PostgreSQL数据库专家将悉数到场,并特邀欧洲、俄罗斯、日本、美国等国家和地区的数据库方面专家助阵:
- Postgres-XC项目的发起人铃木市一(SUZUKI Koichi)
- Postgres-XL的项目发起人Mason Sharp
- pgpool的作者石井达夫(Tatsuo Ishii)
- PG-Strom的作者海外浩平(Kaigai Kohei)
- Greenplum研发总监姚延栋
- 周正中(德哥), PostgreSQL中国用户会创始人之一
- 汪洋,平安科技数据库技术部经理
- ……
这个特性有什么好处呢?
如果你的应用程序不做保护,直接插入,可能会导致大量的违反唯一约束的错误,这种错误除了会写数据库日志,还会带来一定的问题。
例子:
postgres=# create table uk_test(id int primary key,info text,crt_time timestamp);CREATE TABLEpostgres=# insert into uk_test values (1,'test',now());INSERT 0 1
下面插入几个违反唯一约束的报错。
postgres=# insert into uk_test values (1,'test',now());ERROR: duplicate key value violates unique constraint "uk_test_pkey"DETAIL: Key (id)=(1) already exists.postgres=# insert into uk_test values (1,'test',now());ERROR: duplicate key value violates unique constraint "uk_test_pkey"DETAIL: Key (id)=(1) already exists.postgres=# insert into uk_test values (1,'test',now());ERROR: duplicate key value violates unique constraint "uk_test_pkey"DETAIL: Key (id)=(1) already exists.
以上操作实际上已经形成了HEAP tuple,但是没有形成index linepoint,同时在clog中标记为事务失败。
这些其实是有负面影响的,
1. 消耗了事务号,我们知道事务号是一个UINT32的整型,每20亿必须做一次frozen。所以这种回滚事务实际上是浪费事务号的。
2. 浪费XLOG空间,虽然事务回滚了,但是同样形成了块的变更,在XLOG中也是有记录的。
3. 浪费了空间,同时还需要垃圾回收进程来回收它。
看看ctid就明白了。
postgres=# insert into uk_test values (2,'test',now());INSERT 0 1postgres=# select ctid,* from uk_test ;ctid | id | info | crt_time-------+----+------+----------------------------(0,1) | 1 | test | 2015-06-23 15:00:33.452243(0,5) | 2 | test | 2015-06-23 15:00:37.431145(2 rows)
使用on conflict ... do nothing;可以有效的避免垃圾的产生:
postgres=# truncate uk_test ;TRUNCATE TABLEpostgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do nothing;INSERT 0 1postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do nothing;INSERT 0 0postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do nothing;INSERT 0 0postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do nothing;INSERT 0 0postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do nothing;INSERT 0 0postgres=# insert into uk_test values (2,'test',now());INSERT 0 1postgres=# select ctid,* from uk_test ;ctid | id | info | crt_time-------+----+--------+----------------------------(0,1) | 1 | digoal | 2015-06-23 15:00:59.137141(0,2) | 2 | test | 2015-06-23 15:01:05.143134(2 rows)
从上面的ctid可以看到,没有产生任何垃圾。
使用on constraint uk_test_pkey do update set info=excluded.info,crt_time=excluded.crt_time;,提供了原子操作,减少了应用和数据库的交互,对性能提升也是非常明显的。
更新操作和普通的更新没有区别,产生新的版本。
postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do update set info=excluded.info,crt_time=excluded.crt_time;INSERT 0 1postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do update set info=excluded.info,crt_time=excluded.crt_time;INSERT 0 1postgres=# select ctid,* from uk_test ;ctid | id | info | crt_time-------+----+--------+----------------------------(0,2) | 2 | test | 2015-06-23 15:01:05.143134(0,4) | 1 | digoal | 2015-06-23 15:01:27.715183(2 rows)postgres=# insert into uk_test values (1,'digoal',now()) on conflict on constraint uk_test_pkey do update set info=excluded.info,crt_time=excluded.crt_time;INSERT 0 1postgres=# select ctid,* from uk_test ;ctid | id | info | crt_time-------+----+--------+----------------------------(0,2) | 2 | test | 2015-06-23 15:01:05.143134(0,5) | 1 | digoal | 2015-06-23 15:01:34.485177(2 rows)postgres=# update uk_test set info='new',crt_time=now() where id=1;UPDATE 1postgres=# select ctid,* from uk_test ;ctid | id | info | crt_time-------+----+------+----------------------------(0,2) | 2 | test | 2015-06-23 15:01:05.143134(0,6) | 1 | new | 2015-06-23 15:02:12.968176(2 rows)
性能测试对比:
传统的最靠谱的存在则更新,不存在则插入的方法:
postgres=# create or replace function ins_update (v_id int) returns void as $$declarebeginperform 1 from uk_test where id=v_id limit 1;if found thenupdate uk_test set info='test',crt_time=now() where id=v_id;elseinsert into uk_test values(v_id,'test',now());end if;exception when others thenreturn;end;$$ language plpgsql;CREATE FUNCTION性能:postgres=# select pg_stat_reset_single_table_counters('uk_test'::regclass);postgres=# truncate uk_test ;postgres=# checkpoint;pg95@db-172-16-3-150-> vi test.sql\setrandom id 1 5000000select ins_update(:id);pg95@db-172-16-3-150-> pgbench -M prepared -n -r -f ./test.sql -P 1 -c 32 -j 32 -T 60progress: 1.0 s, 44468.6 tps, lat 0.638 ms stddev 1.028progress: 2.0 s, 46454.6 tps, lat 0.687 ms stddev 1.078progress: 3.0 s, 46644.0 tps, lat 0.685 ms stddev 1.112progress: 4.0 s, 46005.8 tps, lat 0.693 ms stddev 1.116progress: 5.0 s, 45636.1 tps, lat 0.700 ms stddev 1.107progress: 6.0 s, 45518.6 tps, lat 0.701 ms stddev 1.117progress: 7.0 s, 45674.2 tps, lat 0.699 ms stddev 1.090progress: 8.0 s, 45288.2 tps, lat 0.705 ms stddev 1.150progress: 9.0 s, 45074.2 tps, lat 0.708 ms stddev 1.125......progress: 56.0 s, 36654.9 tps, lat 0.871 ms stddev 1.590progress: 57.0 s, 36694.7 tps, lat 0.870 ms stddev 1.530progress: 58.0 s, 36511.4 tps, lat 0.875 ms stddev 1.577progress: 59.0 s, 37118.4 tps, lat 0.860 ms stddev 1.502progress: 60.0 s, 36207.2 tps, lat 0.880 ms stddev 1.556transaction type: Custom queryscaling factor: 1query mode: preparednumber of clients: 32number of threads: 32duration: 60 snumber of transactions actually processed: 2456377latency average: 0.778 mslatency stddev: 1.327 mstps = 40934.377749 (including connections establishing)tps = 41005.456116 (excluding connections establishing)statement latencies in milliseconds:-0.001878 \setrandom id 1 50000000.775430 select ins_update(:id);postgres=# select * from pg_stat_all_tables where relname='uk_test';-[ RECORD 1 ]-------+------------------------------relid | 16930schemaname | publicrelname | uk_testseq_scan | 0seq_tup_read | 0idx_scan | 13841219idx_tup_fetch | 3655255n_tup_ins | 1941192n_tup_upd | 515185n_tup_del | 0n_tup_hot_upd | 482125n_live_tup | 1941184n_dead_tup | 36686n_mod_since_analyze | 0last_vacuum |last_autovacuum | 2015-06-23 16:03:39.271969+08last_analyze |last_autoanalyze | 2015-06-23 16:04:38.928135+08vacuum_count | 0autovacuum_count | 1analyze_count | 0autoanalyze_count | 2
使用insert into ... on conflict ... update set ...
性能:
postgres=# select pg_stat_reset_single_table_counters('uk_test'::regclass);postgres=# truncate uk_test ;postgres=# checkpoint;pg95@db-172-16-3-150-> vi test.sql\setrandom id 1 5000000insert into uk_test values (:id,'test',now()) on conflict on constraint uk_test_pkey do update set info=excluded.info,crt_time=excluded.crt_time;pg95@db-172-16-3-150-> pgbench -M prepared -n -r -f ./test.sql -P 1 -c 32 -j 32 -T 60progress: 1.0 s, 62283.4 tps, lat 0.430 ms stddev 0.823progress: 2.0 s, 64971.9 tps, lat 0.490 ms stddev 0.969progress: 3.0 s, 65560.7 tps, lat 0.486 ms stddev 0.937progress: 4.0 s, 64161.9 tps, lat 0.496 ms stddev 0.956progress: 5.0 s, 64003.1 tps, lat 0.498 ms stddev 0.994progress: 6.0 s, 63739.1 tps, lat 0.499 ms stddev 1.007progress: 7.0 s, 62974.2 tps, lat 0.505 ms stddev 1.005progress: 8.0 s, 62266.2 tps, lat 0.512 ms stddev 1.006......progress: 33.0 s, 53306.1 tps, lat 0.598 ms stddev 1.239progress: 34.0 s, 52924.6 tps, lat 0.602 ms stddev 1.275progress: 35.0 s, 52799.8 tps, lat 0.603 ms stddev 1.268progress: 36.0 s, 52527.6 tps, lat 0.607 ms stddev 1.264progress: 37.0 s, 51948.2 tps, lat 0.613 ms stddev 1.303progress: 38.0 s, 51547.3 tps, lat 0.618 ms stddev 1.330progress: 39.0 s, 50852.2 tps, lat 0.626 ms stddev 1.348......progress: 57.0 s, 45127.5 tps, lat 0.707 ms stddev 1.614progress: 58.0 s, 44983.0 tps, lat 0.708 ms stddev 1.573progress: 59.0 s, 45179.3 tps, lat 0.706 ms stddev 1.585progress: 60.0 s, 45045.4 tps, lat 0.708 ms stddev 1.579随着数据重复率越来越高,更新越来越多,TPS呈现下降趋势。transaction type: Custom queryscaling factor: 1query mode: preparednumber of clients: 32number of threads: 32duration: 60 snumber of transactions actually processed: 3237650latency average: 0.589 mslatency stddev: 1.264 mstps = 53954.428043 (including connections establishing)tps = 54087.133269 (excluding connections establishing)statement latencies in milliseconds:-0.002449 \setrandom id 1 50000000.586167 insert into uk_test values (:id,'test',now()) on conflict on constraint uk_test_pkey do update set info=excluded.info,crt_time=excluded.crt_time;postgres=# select * from pg_stat_all_tables where relname='uk_test';-[ RECORD 1 ]-------+------------------------------relid | 16930schemaname | publicrelname | uk_testseq_scan | 0seq_tup_read | 0idx_scan | 10869657idx_tup_fetch | 2624947n_tup_ins | 2383669n_tup_upd | 853982n_tup_del | 1n_tup_hot_upd | 800533n_live_tup | 2360714n_dead_tup | 27022n_mod_since_analyze | 811433last_vacuum |last_autovacuum | 2015-06-23 16:01:41.707076+08last_analyze |last_autoanalyze | 2015-06-23 16:01:42.553299+08vacuum_count | 0autovacuum_count | 1analyze_count | 0autoanalyze_count | 1
可以看到性能提升是非常明显的。
1. http://www.postgresql.org/docs/devel/static/sql-insert.html#SQL-ON-CONFLICT
2. http://blog.163.com/digoal@126/blog/static/163877040201541094137923/
0 0
- Why "INSERT ... ON CONFLICT DO NOTHING/UPDATE" Good for performance
- Solutions for INSERT OR UPDATE on SQL Server
- Do nothing
- Greenplum merge insert 用法与性能 (insert on conflict)
- Greenplum merge insert 用法与性能 (insert on conflict)
- Greenplum merge insert 用法与性能 (insert on conflict)
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- insert...on duplicate key update
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- INSERT ... ON DUPLICATE KEY UPDATE
- Purrr package for R is good for performance
- 笔试题——触发器
- 排序法
- 黑马程序员_反射
- Vuejs demo
- 第二阶段来啦啦啦!!!
- Why "INSERT ... ON CONFLICT DO NOTHING/UPDATE" Good for performance
- C语言中微妙的bug(一)
- 解析文件
- 1
- 面试小结三:关于算法数据结构的面试题整理(待续)
- StringBuffer和一些算法 冒泡 二分 选择
- bzoj 2748 [HAOI2012]音量调节
- 九度OJ 1005:Graduate Admission (排序)
- MySQL笔记