Sunday, March 13, 2011

Optimisations to the transaction protocol

There are several variants to the standard two-phase commit protocol that are worth knowing about because they can have an impact on performance and failure recovery. We shall briefly describe those that are the most common variants on the protocol:

• Presumed abort: if a transaction is going to roll back then it may simply record this information locally and tell all enlisted participants. Failure to contact a participant has no affect on the transaction outcome; the transaction is effectively informing participants as a courtesy. Once all participants have been contacted the information about the transaction can be removed. If a subsequent request for the status of the transaction occurs there will be no information available and the requestor can assume that the transaction has aborted (rolled back). This optimization has the benefit that no information about participants need be made persistent until the transaction has decided to commit (i.e., progressed to the end of the prepare phase), since any failure prior to this point will be assumed to be an abort of the transaction.

• One-phase: if there is only a single participant involved in the transaction, the coordinator need not drive it through the prepare phase. Thus, the participant will simply be told to commit and the coordinator need not record information about the decision since the outcome of the transaction is solely down to the participant.

• Read-only: when a participant is asked to prepare, it can indicate to the coordinator that no information or data that it controls has been modified during the transaction. Such a participant does not need to be informed about the outcome of the transaction since the fate of the participant has no affect on the transaction. As such, a read-only participant can be omitted from the second phase of the commit protocol.

So these optimisations let us get around some concerns that two-phase commit, particularly in a distributed environment, is overkill. Most modern transaction manager implementations will support all of these and when you consider that many transactional applications use only a single participant, or don't modify state, then you can see how the above considerations can dramatically reduce (or even remove) any overhead that may appear to be there at first glance. We'll go into this a bit more in a subsequent posting. But next up for consideration is what happens when we can't guarantee transactional semantics?

No comments: