STM and MVCC considerations


source: https://pveentjer.wordpress.com/2008/10/05/stm-and-mvcc-considerations/

source: https://pveentjer.wordpress.com/2008/10/08/stm-and-mvcc-considerations-ii/

MVCC provides a stable view over the system by maintaining multiple versions of the same data, so it doesn’t need to worry about other transactions updating data it has read, because the transaction won’t see these updates.

The only thing it needs to worry about is detecting whether another transaction has updated the same data the transaction wants to update. If such a write conflict is found, the transaction can’t commit.

If better serialization behavior is needed, all data that was touched by the transactions (the reads and the writes) needs to be checked for conflicts.

Of course, this improved isolation is not free: there is more data to check, and it can lead to more transaction retries, which increases the chance of a livelock. But it is nice to see that this problem can be solved.

STM and MVCC considerations

I’m working on a Software Transactional Memory (STM) implementation for experimentation purposes, and it uses Multi Version Concurrency Control (MVCC), just like Oracle. The big advantage of MVCC is that it is highly concurrent, because it doesn’t rely on locks to prevent other transactions from interfering when you only need to read data. When a transaction begins, it gets a snapshot of the world, and this snapshot remains constant even if other transactions are making changes. Changes made by the transaction are, of course, visible to the transaction itself.
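The snapshot behaviour described above can be illustrated with a minimal sketch. It assumes a hypothetical `MultiVersionCell` class that keeps every committed value under its commit version; the class and its method names are invented for illustration and are not taken from the STM discussed in this post or from Oracle.

```java
import java.util.TreeMap;

/** Hypothetical multi-version cell; a teaching model, not the author's STM. */
class MultiVersionCell {
    // Committed values keyed by the commit version that produced them.
    private final TreeMap<Long, Integer> versions = new TreeMap<>();

    MultiVersionCell(int initialValue) {
        versions.put(0L, initialValue);
    }

    /** Publishes a new value under the given commit version; older versions are kept. */
    void commit(long commitVersion, int value) {
        versions.put(commitVersion, value);
    }

    /** Returns the newest value whose commit version is not later than the snapshot. */
    int readAt(long snapshotVersion) {
        return versions.floorEntry(snapshotVersion).getValue();
    }

    public static void main(String[] args) {
        MultiVersionCell cell = new MultiVersionCell(10);
        long snapshot = 0;                          // a reader begins and pins version 0
        cell.commit(1, 20);                         // a writer commits afterwards
        System.out.println(cell.readAt(snapshot));  // the reader still sees 10
        System.out.println(cell.readAt(1));         // a later transaction sees 20
    }
}
```

Because readers only look up the version they pinned, they never need to block a writer, which is the source of the concurrency advantage mentioned above.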

The problem, however, is that MVCC can cause subtle isolation problems. For example:

StmStack stack1 = new StmStack();
StmStack stack2 = new StmStack();

void foo() {
    transaction {
        stack1.push(stack2.size());
    }
}

void bar() {
    transaction {
        stack2.push(stack1.size());
    }
}

If foo and bar are called concurrently, you would expect one of the following states:

stack1 contains 0 and stack2 contains 1 (if transaction1, calling foo, is executed first)
stack1 contains 1 and stack2 contains 0 (if transaction2, calling bar, is executed first)

But with MVCC, stack1 and stack2 can both contain the value 0! This is because the transactions don’t see the changes made by the other transaction, and since there is no write conflict (each transaction writes to a different stack), both transactions can be committed. Whether this is an issue in practice, I don’t know, but it is something to keep in mind. I have placed a similar post about this issue in Oracle.
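The anomaly (commonly called write skew) can be reproduced with a small single-threaded model. This is a sketch under stated assumptions: `VersionedStack`, `Tx` and `run` are hypothetical names, the transactions are interleaved by hand rather than run on threads, and the commit check looks only at the stacks a transaction wrote, as described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical single-threaded model of write-only conflict detection. */
class WriteSkewDemo {

    /** Committed stack contents plus a version that is bumped on every commit. */
    static final class VersionedStack {
        List<Integer> items = new ArrayList<>();
        long version = 0;
    }

    /** A transaction that snapshots at begin and validates only its writes. */
    static final class Tx {
        private final Map<VersionedStack, List<Integer>> view = new HashMap<>();
        private final Map<VersionedStack, Long> seenVersion = new HashMap<>();
        private final List<VersionedStack> written = new ArrayList<>();

        Tx(VersionedStack... stacks) {
            for (VersionedStack s : stacks) {
                view.put(s, new ArrayList<>(s.items)); // private copy acts as the snapshot
                seenVersion.put(s, s.version);
            }
        }

        int size(VersionedStack s) {
            return view.get(s).size();
        }

        void push(VersionedStack s, int value) {
            view.get(s).add(value); // visible to this transaction only
            if (!written.contains(s)) written.add(s);
        }

        /** Commits unless a stack this transaction WROTE changed since the snapshot. */
        boolean commit() {
            for (VersionedStack s : written) {
                if (s.version != seenVersion.get(s)) return false; // write conflict
            }
            for (VersionedStack s : written) {
                s.items = new ArrayList<>(view.get(s));
                s.version++;
            }
            return true;
        }
    }

    static String run() {
        VersionedStack stack1 = new VersionedStack();
        VersionedStack stack2 = new VersionedStack();
        Tx foo = new Tx(stack1, stack2);
        Tx bar = new Tx(stack1, stack2);
        foo.push(stack1, foo.size(stack2)); // reads stack2 as empty
        bar.push(stack2, bar.size(stack1)); // reads stack1 as empty
        boolean fooOk = foo.commit();
        boolean barOk = bar.commit();
        return "stack1=" + stack1.items + " stack2=" + stack2.items
                + " committed=" + fooOk + "/" + barOk;
    }

    public static void main(String[] args) {
        System.out.println(run()); // stack1=[0] stack2=[0] committed=true/true
    }
}
```

Both transactions commit because neither wrote a stack the other wrote, yet the result matches neither serial order.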

STM and MVCC Considerations II

In my previous post I mentioned an isolation anomaly that can happen with MVCC systems (for example in Oracle). The cause of this issue is that a transaction in an MVCC system doesn’t care whether data it only reads is updated by a different transaction. MVCC provides a stable view over the system by maintaining multiple versions of the same data, so it doesn’t need to worry about other transactions updating data it has read, because the transaction won’t see these updates. The only thing it needs to worry about is detecting whether another transaction has updated the same data the transaction wants to update. If such a write conflict is found, the transaction can’t commit. If better serialization behavior is needed, all data that was touched by the transaction (the reads and the writes) needs to be checked for conflicts.

If we look at the example again:

StmStack stack1 = new StmStack();
StmStack stack2 = new StmStack();

void foo() {
    transaction {
        stack1.push(stack2.size());
    }
}

void bar() {
    transaction {
        stack2.push(stack1.size());
    }
}

If reads are checked for conflicts as well and transaction1 (calling foo) executes concurrently with transaction2 (calling bar), one of the following two scenarios happens (the commit is atomic):

transaction1 commits before transaction2: transaction2 is aborted because stack1, which it has read, has been updated by transaction1
transaction2 commits before transaction1: transaction1 is aborted because stack2, which it has read, has been updated by transaction2
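The two scenarios above can be sketched with a small single-threaded model in which the commit validates everything a transaction touched. All names here (`VersionedStack`, `Tx`, `run`) are hypothetical, the interleaving is done by hand, and the retry at the end is an assumption about how an aborted transaction would be handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical single-threaded model of commit-time validation of reads and writes. */
class ReadValidationDemo {

    static final class VersionedStack {
        List<Integer> items = new ArrayList<>();
        long version = 0;
    }

    static final class Tx {
        private final Map<VersionedStack, List<Integer>> view = new HashMap<>();
        private final Map<VersionedStack, Long> seenVersion = new HashMap<>();
        private final Set<VersionedStack> touched = new LinkedHashSet<>(); // reads and writes
        private final Set<VersionedStack> written = new LinkedHashSet<>();

        Tx(VersionedStack... stacks) {
            for (VersionedStack s : stacks) {
                view.put(s, new ArrayList<>(s.items)); // private copy acts as the snapshot
                seenVersion.put(s, s.version);
            }
        }

        int size(VersionedStack s) {
            touched.add(s); // a read is tracked too
            return view.get(s).size();
        }

        void push(VersionedStack s, int value) {
            touched.add(s);
            written.add(s);
            view.get(s).add(value);
        }

        /** Commits only if nothing this transaction read or wrote has changed. */
        boolean commit() {
            for (VersionedStack s : touched) {
                if (s.version != seenVersion.get(s)) return false; // read or write conflict
            }
            for (VersionedStack s : written) {
                s.items = new ArrayList<>(view.get(s));
                s.version++;
            }
            return true;
        }
    }

    static String run() {
        VersionedStack stack1 = new VersionedStack();
        VersionedStack stack2 = new VersionedStack();
        Tx foo = new Tx(stack1, stack2);
        Tx bar = new Tx(stack1, stack2);
        foo.push(stack1, foo.size(stack2));
        bar.push(stack2, bar.size(stack1));
        boolean fooOk = foo.commit(); // succeeds and bumps stack1's version
        boolean barOk = bar.commit(); // fails: stack1 was read and has since changed
        if (!barOk) {
            Tx retry = new Tx(stack1, stack2); // assumed policy: rerun on a fresh snapshot
            retry.push(stack2, retry.size(stack1));
            retry.commit();
        }
        return "stack1=" + stack1.items + " stack2=" + stack2.items
                + " firstAttempt=" + fooOk + "/" + barOk;
    }

    public static void main(String[] args) {
        System.out.println(run()); // stack1=[0] stack2=[1] firstAttempt=true/false
    }
}
```

After the retry, the final state equals the serial outcome in which foo runs before bar.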

With the STM implementation I’m currently working on, this is quite easy to implement. When an object is read in a transaction for the first time, the dehydrated version of that object (stored in the STM) is asked to provide a hydrated version. This hydrated version is logged inside the transaction to make sure that the same hydrated version is returned on every subsequent read. Since the transaction already tracks all reads and writes, enhancing the conflict detection would be easy.
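A minimal sketch of such a read log follows. The terms "dehydrated" and "hydrated" come from the text, but every class and method name (`DehydratedStack`, `HydratedStack`, `Tx.read`) is a hypothetical stand-in, and keying the log by the dehydrated object itself is a simplification; the actual implementation may well key it differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a per-transaction read log; names are illustrative. */
class ReadLogDemo {

    /** Immutable committed state as stored in the STM ("dehydrated"). */
    static final class DehydratedStack {
        final List<Integer> items;
        final long version;

        DehydratedStack(List<Integer> items, long version) {
            this.items = Collections.unmodifiableList(new ArrayList<>(items));
            this.version = version;
        }

        /** Builds a fresh mutable working copy ("hydrated"). */
        HydratedStack hydrate() {
            return new HydratedStack(new ArrayList<>(items), version);
        }
    }

    /** Mutable working copy private to one transaction. */
    static final class HydratedStack {
        final List<Integer> items;
        final long readVersion; // version seen at first read, usable for conflict checks

        HydratedStack(List<Integer> items, long readVersion) {
            this.items = items;
            this.readVersion = readVersion;
        }

        void push(int value) { items.add(value); }

        int size() { return items.size(); }
    }

    static final class Tx {
        // Logs each stored object the transaction has read, by identity.
        private final Map<DehydratedStack, HydratedStack> readLog = new IdentityHashMap<>();

        /** Hydrates on first read; later reads return the logged instance. */
        HydratedStack read(DehydratedStack stored) {
            return readLog.computeIfAbsent(stored, DehydratedStack::hydrate);
        }

        int readCount() { return readLog.size(); }
    }

    public static void main(String[] args) {
        DehydratedStack stored = new DehydratedStack(new ArrayList<>(), 0);
        Tx tx = new Tx();
        tx.read(stored).push(42);
        System.out.println(tx.read(stored).size()); // 1: the same hydrated copy is reused
        System.out.println(stored.items.size());    // 0: committed state is untouched
    }
}
```

Because the log already holds the version seen at first read for every object, a commit could walk the log and compare those versions against the current ones to detect read conflicts.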

0 0