事务的ACID，PostgreSQL的事务隔离级别实验，非常棒的----数据架构师的PostgreSQL修炼

来源：互联网发布：matlab最优化算法例题编辑：程序博客网时间：2024/06/05 18:13

1.测试数据集

本节会做一些实验，所以提前建立实验需要用到的数据集。

[plain] view plain copy
postgres=# create table account(id serial primary key, first_name varchar(100), last_name varchar(100), account_bal numeric(10,2));  
CREATE TABLE  

插入数据：

[plain] view plain copy
postgres=# insert into account  values(1, 'Robin', 'Wan', 100000);  
INSERT 0 1  
postgres=# insert into account  values(2, 'William', 'Blake', 7000000.00);  
INSERT 0 1  
postgres=# insert into account  values(535889,'Riley', 'Truden', 0.00);  
INSERT 0 1  
postgres=#  

2.事务的ACID概念

2.1 A是原子性

就是一个transaction内的SQL语句，要么全部执行，要么都不执行。

2.2 C是一致性

数据库transaction前是一致的，那么这个transaction执行后，数据库也得是一致的。

2.3 I是隔离性

隔离性意味着用户并不会受同一时间段内其他用户操作的影响；看上去各个事务像是一前一后串行（serially）发生的，哪怕当前有很多个并发事务。

SQL标准支持多种隔离级别，根据具体需要设置隔离级别。

2.3.1 Read uncommitted(未提交的读取）

一个session的事务即使没有commit，也对其他session可见。

这其实是dirty read(脏读)。

事务T1：William Blake想向Riley Truden转10，000美元，在p5时刻撤销转账。

事务T2：因为事务隔离级别是Read uncommitted，所以在T2中能看到收到了10，000美元（脏读），然后撤销了。

2.3.2 Read committed(提交的读取）

某session的事务只有commit了，才会对其他session可见；未commit的事务，对其他session不可见；所以脏读现象将不会再发生。

但是Read committed无法保证相同的select语句在同一个事务中两次执行获得相同的结果（non-repeatable read, 不可重复读）。

第二次select可能获得不同的结果，如果有另外的session在几乎同一时间修改并commit了数据。

2.3.3 Repeatable read（可重复读）

可重复读采的锁定机制比提交的读取高。

在这个层面，讨论的是幻读(phantom read)。

我个人读这块的时候，觉得非常难区分开可重复读和幻读，所以把原文列出来，然后加入自己的理解，可能不一定准确，只能随着阅读的材料越来越多，来加深理解：

The dataset returned by successive selects could have changed because of another session inserting records, which met the criteria and committed.

(因为其他session在满足可重复读条件下，插入并commit了记录，导致连续的select返回的数据集不同。)

These are phantom records.

（这些新插入的记录是幻影记录。）

However, the result set, which met the SELECT criteria itself, would be locked and hence wouldn't be modied by another session.

(然而，满足select条件的返回的数据集，在这个过程中会被锁定【这是与read commit的区别】，并且不会被另外一个session修改。）

幻读 is the key difference between read committed and repeatable read isolation.

（这是read committed 和 repeatable read的主要区别。）

This implies that the locking level has become a bit stricter.

（这意味着repeatable read的锁定级别要更严格。）

以下是一个幻读(phantom read)的例子：

根据上面介绍，我个人的理解如下:

在可重复读的隔离级别下，T1会对 balance < 1000 and balance > 100的记录加锁；但是T2修改的是balance = 1500的记录（不在锁定的记录中），所以T2可以commit成功。

这样导致新的记录balance = 500 and ID =123像个幻影一样，突然出现。

-----------------------------------------------------------------------

补充材料：

看来这个可重复读和幻读确实比较绕，难于理解，我觉得这哥们儿的理解也很正，特此抄一份存档。

如何理解 MySQL 事务中的不可重复读和幻读问题？或者说 READ COMMIT 和 REPEAT COMMIT 两种隔离性的区别在哪？

刚好学习一下这个，我就粗浅的来说说我的想法：
Read Committed(不可重复读)：
假设事务1读取了一条记录(select user_name from user where user_id = 1), 得到user_name = '456', 事务1暂时没提交。
事务2更新了一条记录(update user set user_name = '123' where user_id = 1),事务2提交。
此时事务1再次select user_name from user where user_id = 1得到了user_name = '123', 这样就导致事务1在读取同一行数据却得到不同的user_name。
这就是所谓的不可以重复读。

Repeatable Read(可重复读,会产生幻读):
这个跟不可重复读相反,当事务1查询到user_id=1时锁定该记录，事务2修改user_id=1记录提交会失败，这样保证了可重复读。

幻读的话就是当事务2插入一条新的数据id为2并提交，事务1由于可重复读的性质，只能在表中查到id为1的数据，如果此时事务1插入id为2的数据则会产生错误，

因为此时表中已经有了id为2的数据,但是事务1只看到了id为1的数据。

-------------------------------------------------------------------------------------

2.3.4 Serializable（可串行化）

这是最严格的隔离级别，事务串行化的执行。在这个隔离级别上，幻读是不可能的。

总结下来，每个层级允许的不一致之处概括如下：

2.4 D是持久性

持久性指的是事务被提交后可以提供的确定性程度，即使发生电源故障、操作系统故障、硬件故障等。。。

PostgreSQL使用Write Ahead Log（WAL)来实现持久性，这个我们在前面章节有过介绍。

3. PostgreSQL事务隔离级别实验

PostgreSQL不支持read uncommitted 隔离级别，不允许脏读（dirty read)。

PostgreSQL实现的repeatable read，也不会产生幻读(phantom read)。。。话说，这是为啥呢？这样repeatable read 和 serializable有啥区别呢还？

3.1 Read committed

这个是PostgreSQL默认的事务隔离级别，在这个级别，每个语句看到的是该语句执行前所有已经committed数据的快照。

文中没有给出过多的实验，仅仅列出了几个有用的命令。

列出默认事务隔离级别：

[plain] view plain copy
postgres=# show default_transaction_isolation;  
 default_transaction_isolation  
-------------------------------  
 read committed  
(1 row)  

如果不加"begin....commit/rollback"，每一个SQL语句都是一个transaction，都有自己的transaction ID:

[plain] view plain copy
postgres=# select txid_current();  
 txid_current  
--------------  
          877  
(1 row)  
  
postgres=# select txid_current();  
 txid_current  
--------------  
          878  
(1 row)  
  
postgres=# select txid_current();  
 txid_current  
--------------  
          879  
(1 row)  

以下实验，说明在一个transaction内，transaction ID 不会变化，now()不会变化，但是clock_timestamp()会变化：

[plain] view plain copy
postgres=# begin;  
BEGIN  
postgres=# select txid_current();  
 txid_current  
--------------  
          880  
(1 row)  
  
postgres=# select now();  
              now  
-------------------------------  
 2017-09-18 20:20:45.686786+08  
(1 row)  
  
postgres=# select clock_timestamp();  
        clock_timestamp  
-------------------------------  
 2017-09-18 20:21:25.332571+08  
(1 row)  
  
postgres=# select txid_current();  
 txid_current  
--------------  
          880  
(1 row)  
  
postgres=# select now();  
              now  
-------------------------------  
 2017-09-18 20:20:45.686786+08  
(1 row)  
  
postgres=# select clock_timestamp();  
        clock_timestamp  
-------------------------------  
 2017-09-18 20:21:47.391577+08  
(1 row)  
  
postgres=# commit ;  
COMMIT  

commit后，transaction ID增加：

[plain] view plain copy
postgres=# select txid_current();  
 txid_current  
--------------  
          881  
(1 row)  

3.2 repeatable read

在这个级别，每个语句看到的是该事务执行前所有已经committed数据的快照。

（所以在这个隔离级别，同一个transaction中多次执行同一个select语句会得到相同的结果。）

而read comitted是每个SQL语句执行前所有committed数据的快照。

试图修改事务隔离级别为repeatable read：

[plain] view plain copy
postgres=# begin;  
BEGIN  
postgres=# show transaction_isolation;  
 transaction_isolation  
-----------------------  
 read committed  
(1 row)  
  
postgres=# set transaction isolation level repeatable read;  
SET  
postgres=# commit;  
COMMIT  

再次检查事务隔离级别，仍然为read committed(所以必须在begin....commit/rollback中修改事务隔离级别，并执行其它SQL语句)：

[plain] view plain copy
postgres=# show transaction_isolation;  
 transaction_isolation  
-----------------------  
 read committed  
(1 row)  

一个详细的实验：

准备数据集：

[plain] view plain copy
postgres=# delete from account;  
DELETE 1  
postgres=# INSERT INTO account (first_name,last_name, account_bal) values ( 'Jane', 'Adam', 1000);  
INSERT 0 1  
postgres=# select * from account;  
 id | first_name | last_name | account_bal  
----+------------+-----------+-------------  
  4 | Jane       | Adam      |     1000.00  
(1 row)  

按照如下给定的顺序，在两个session中执行：

T1失败，再次强调repeatable read所说的是同一行数据的reapeatable read(此处就是id=2的行):

[plain] view plain copy
postgres=# begin;  
BEGIN  
postgres=# set transaction isolation level repeatable read;  
SET  
postgres=# select * from account;  
 id | first_name | last_name | account_bal  
----+------------+-----------+-------------  
  4 | Jane       | Adam      |     1000.00  
(1 row)  
  
postgres=# update account set account_bal=2000 where id = 4;  
ERROR:  could not serialize access due to concurrent update  

[plain] view plain copy
postgres=# select * from account;  
ERROR:  current transaction is aborted, commands ignored until end of transaction block  

T2成功：

[plain] view plain copy
postgres=# begin;  
BEGIN  
postgres=# set transaction isolation level repeatable read;  
SET  
postgres=# select * from account;  
 id | first_name | last_name | account_bal  
----+------------+-----------+-------------  
  4 | Jane       | Adam      |     1000.00  
(1 row)  
  
postgres=# update account set account_bal=2000 where id = 4;  
UPDATE 1  
postgres=# commit;  
COMMIT  

3.3 Serializable level

这个是最严格的级别。

简单的说在这个级别，不会block任何事务；但是会探测冲突，并且abort事务。

假设当前account表中有如下数据，据此安排一个serializable 隔离级别的实验：

[plain] view plain copy
postgres=# select * from account;  
 id | first_name | last_name | account_bal  
----+------------+-----------+-------------  
  4 | Jane       | Adam      |      100.00  
  5 | John       | Doe       |      200.00  
(2 rows)  

[plain] view plain copy
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  

按照如下给定时序，在两个session里执行：

T1失败(第二个select就失败了）：

[plain] view plain copy
postgres=# set default_transaction_isolation to serializable;  
SET  
postgres=# show transaction_isolation ;  
 transaction_isolation  
-----------------------  
 serializable  
(1 row)  
  
postgres=# begin;  
BEGIN  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  
  
postgres=# INSERT INTO account (first_name,last_name,account_bal) values ('Jane','Doe',100);  
INSERT 0 1  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
ERROR:  could not serialize access due to read/write dependencies among transactions  
DETAIL:  Reason code: Canceled on identification as a pivot, during conflict out checking.  
HINT:  The transaction might succeed if retried.  
postgres=#  

T2成功：

[plain] view plain copy
postgres=# set default_transaction_isolation to serializable;  
SET  
postgres=# show transaction_isolation ;  
 transaction_isolation  
-----------------------  
 serializable  
(1 row)  
  
postgres=# begin;  
BEGIN  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  
  
postgres=# INSERT INTO account (first_name,last_name, account_bal) values ('John','Doe',100);  
INSERT 0 1  
postgres=# commit;  
COMMIT  

---------------------------------------------------------------------------------------
华丽丽的分割线，让我们针对以上步骤，做一个repeatable read隔离级别的实验吧。

初始数据：

[plain] view plain copy
postgres=# select * from account;  
 id | first_name | last_name | account_bal  
----+------------+-----------+-------------  
  4 | Jane       | Adam      |      100.00  
  5 | John       | Doe       |      200.00  
(2 rows)  

按照上面图片中给定的时序，在两个session里执行：

T1成功，但是没看到T2 insert的新数据，因为repeatable read的快照是在事务开始时的：

[plain] view plain copy
postgres=# set default_transaction_isolation to "repeatable read";  
SET  
postgres=# show transaction_isolation ;  
 transaction_isolation  
-----------------------  
 repeatable read  
(1 row)  
  
postgres=# begin;  
BEGIN  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  
  
postgres=# INSERT INTO account (first_name,last_name,account_bal) values ('Jane','Doe',100);  
INSERT 0 1  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 200.00  
 John       | <span style="background-color:rgb(255,255,255);">200.00</span>  
(2 rows)  
  
postgres=# commit;  
COMMIT  

T2成功：

[plain] view plain copy
postgres=# set default_transaction_isolation to "repeatable read";  
SET  
postgres=# show transaction_isolation;  
 transaction_isolation  
-----------------------  
 repeatable read  
(1 row)  
  
postgres=# begin;  
BEGIN  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  
  
postgres=# INSERT INTO account (first_name,last_name, account_bal) values ('John','Doe',100);  
INSERT 0 1  
postgres=# commit;  
COMMIT  

------------------------------------------------------------

华丽丽的分割线，让我们针对以上步骤，做一个read committed隔离级别的实验吧。

初始数据：

T1成功，并且看到了T2 Insert的数据，因为在read committed隔离级别下，快照是每个SQL执行前的：

[plain] view plain copy
postgres=# set default_transaction_isolation to "read committed";  
SET  
postgres=# show transaction_isolation ;  
 transaction_isolation  
-----------------------  
 read committed  
(1 row)  
  
postgres=# begin;  
BEGIN  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  
  
postgres=# INSERT INTO account (first_name,last_name,account_bal) values ('Jane','Doe',100);  
INSERT 0 1  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 200.00  
 John       | 300.00  
(2 rows)  
  
postgres=# commit;  
COMMIT  

T2成功：

[plain] view plain copy
postgres=# set default_transaction_isolation to "read committed";  
SET  
postgres=# show transaction_isolation ;  
 transaction_isolation  
-----------------------  
 read committed  
(1 row)  
  
postgres=# begin;  
BEGIN  
postgres=# select first_name, sum(account_bal) from account group by first_name;  
 first_name |  sum  
------------+--------  
 Jane       | 100.00  
 John       | 200.00  
(2 rows)  
  
postgres=# INSERT INTO account (first_name,last_name, account_bal) values ('John','Doe',100);                                                                 INSERT 0 1  
postgres=# commit;  
COMMIT  

阅读全文

0 0