Tuesday, June 14, 2011

Ever wondered about transactions and threads?

No? Well, you really should! When transaction systems were first developed they were single-threaded (where a thread is defined to be an entity which performs work, e.g., a lightweight process or an operating-system process). Executing multiple threads within a single process was a novelty! In such an environment the thread terminating the transaction is, by definition, the thread that performed the work. Therefore, the termination of a transaction is implicitly synchronized with the completion of the transactional work: there can be no outstanding work still going on when the transaction starts to finish.

With the increased availability of both software and hardware multi-threading, transaction services are now being required to allow multiple threads to be active within a transaction (though it’s still not mandated anywhere, so if this is something you want then you may still have to look around the various implementations). In such systems it is important to guarantee that all of these threads have completed when a transaction is terminated, otherwise some work may not be performed transactionally.

Although protocols exist for enforcing thread and transaction synchronization in local and distributed environments (commonly referred to as checked transactions), they assume that communication between threads is synchronous (e.g., via remote procedure call). A thread making a synchronous call will block until the call returns, signifying that any threads created have terminated. However, a range of distributed applications exists (and yours may be one of them) which require extensive use of concurrency in order to meet real-time performance requirements and utilize asynchronous message passing for communication. In such environments it is difficult to guarantee synchronization between threads, since the application may not communicate the completion of work to a sender, as is done implicitly with synchronous invocations.

As we’ve just seen, applications that do not create new threads and only use synchronous invocations within transactions implicitly exhibit checked behavior. That is, it is guaranteed that whenever the transaction ends there can be no thread active within the transaction which has not completed its processing. This is illustrated below, in which vertical lines indicate the execution of object methods, horizontal lines message exchange, and the boxes represent objects.



The figure illustrates a client who starts a transaction by invoking a synchronous ‘begin’ upon a transaction manager. The client later performs a synchronous invocation upon object a that in turn invokes object b. Each of these objects is registered as being involved in the transaction with the manager. Whenever the client invokes the transaction ‘end’ upon the manager, the manager is then able to enter into the commit protocol (of which only the final phase is shown here) with the registered objects before returning control to the client.

However, when asynchronous invocation is allowed, explicit synchronization is required between threads and transactions in order to guarantee checked (safe) behavior. The next figure illustrates the possible consequences of using asynchronous invocation without such synchronization. In this example a client starts a transaction and then invokes an asynchronous operation upon object a, which registers itself within the transaction as before. Object a then invokes an asynchronous operation upon object b. Now, depending upon the order in which the threads are scheduled, it’s possible that the client might call for the transaction to terminate before b has registered. At this point the transaction coordinator knows only of a’s involvement within the transaction, so it enters into the commit protocol, with a committing as a consequence. Then b attempts to register itself within the transaction, and is unable to do so. If the application intended the work performed by the invocations upon a and b to be performed within the same transaction, this may result in application-level inconsistencies. This is what checked transactions are supposed to prevent.
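To make the race concrete, here is a minimal Java sketch of that unlucky schedule. Everything in it is hypothetical: `ToyTransaction`, `UncheckedRace` and their methods are toy names invented for illustration, not the API of any real transaction service, and the latches exist only to force the ordering that a real scheduler might or might not produce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

// Hypothetical toy coordinator: NOT a real transaction service API.
class ToyTransaction {
    private final List<String> participants = new ArrayList<>();
    private boolean terminated = false;

    // Returns false if the transaction has already terminated.
    synchronized boolean register(String name) {
        if (terminated) return false;
        participants.add(name);
        return true;
    }

    // Unchecked commit: commits whoever has registered so far, without
    // waiting for work that is still running on other threads.
    synchronized List<String> commit() {
        terminated = true;
        return new ArrayList<>(participants);
    }
}

class UncheckedRace {
    static final class Outcome {
        final List<String> committed;
        final boolean bRegistered;
        Outcome(List<String> committed, boolean bRegistered) {
            this.committed = committed;
            this.bRegistered = bRegistered;
        }
    }

    static Outcome run() throws Exception {
        ToyTransaction tx = new ToyTransaction();
        CountDownLatch aRegistered = new CountDownLatch(1);
        CountDownLatch commitDone = new CountDownLatch(1);

        // The client invokes a asynchronously; b's work is chained to run
        // asynchronously once a's work has finished.
        CompletableFuture<Boolean> bResult = CompletableFuture
            .runAsync(() -> {
                tx.register("a");
                aRegistered.countDown();
            })
            .thenApplyAsync(ignored -> {
                awaitQuietly(commitDone); // force the bad schedule: b runs after commit
                return tx.register("b");
            });

        aRegistered.await();                  // a is known to the coordinator
        List<String> committed = tx.commit(); // client ends the transaction too early
        commitDone.countDown();
        return new Outcome(committed, bResult.get());
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Outcome o = run();
        // prints: committed=[a], b registered=false
        System.out.println("committed=" + o.committed + ", b registered=" + o.bRegistered);
    }
}
```

Only a is committed, and b's late attempt to register is refused, which is exactly the inconsistency described above.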

Some transaction service implementations will enforce checked behavior for the transactions they support, to provide an extra level of transaction integrity. The purpose of the checks is to ensure that all transactional requests made by the application have completed their processing before the transaction is committed. A checked transaction service guarantees that commit will not succeed unless all transactional objects involved in the transaction have completed the processing of their transactional requests. If the transaction is rolled back then a check is not required, since all outstanding transactional activities will eventually roll back if they are not told to commit.

As a result, most (though not all) modern transaction systems provide automatic mechanisms for imposing checked transactions on both synchronous and asynchronous invocations. In essence, transactions must keep track of the threads and invocations (both synchronous and asynchronous) that are executing within them and whenever a transaction is terminated, the system must ensure that all active invocations return before the termination can occur and that all active threads are informed of the termination. This may sound simple, but believe us when we say that it isn’t!
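As a rough illustration of the bookkeeping involved, the sketch below counts active invocations and makes commit wait until that count drops to zero. It is a deliberately simplified assumption of how checking could work, not how any particular product does it: `CheckedToyTransaction`, `invokeAsync` and `CheckedDemo` are hypothetical names, and a real implementation also has to deal with distribution, timeouts, failures and informing threads of the termination.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical toy coordinator with checking: NOT a real transaction service API.
class CheckedToyTransaction {
    private final List<String> participants = new ArrayList<>();
    private int activeInvocations = 0;
    private boolean terminated = false;

    synchronized boolean register(String name) {
        if (terminated) return false;
        participants.add(name);
        return true;
    }

    // Runs work asynchronously on behalf of this transaction. The invocation
    // is counted as active from before it is scheduled until it returns.
    CompletableFuture<Void> invokeAsync(Runnable work) {
        synchronized (this) {
            if (terminated) throw new IllegalStateException("transaction already terminated");
            activeInvocations++;
        }
        return CompletableFuture.runAsync(() -> {
            try {
                work.run();
            } finally {
                invocationReturned();
            }
        });
    }

    private synchronized void invocationReturned() {
        activeInvocations--;
        notifyAll(); // wake a commit that may be waiting
    }

    // Checked commit: blocks until every active invocation has returned.
    synchronized List<String> commit() throws InterruptedException {
        while (activeInvocations > 0) {
            wait();
        }
        terminated = true;
        return new ArrayList<>(participants);
    }
}

class CheckedDemo {
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static List<String> run() throws InterruptedException {
        CheckedToyTransaction tx = new CheckedToyTransaction();
        tx.invokeAsync(() -> {
            tx.register("a");
            // a invokes b before returning, so the count never reaches zero in between.
            tx.invokeAsync(() -> {
                pause(50); // b is slow to get going
                tx.register("b");
            });
        });
        return tx.commit(); // waits for both a and b
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints: [a, b]
    }
}
```

This version blocks in commit; an equally valid design under the same assumptions would fail the commit (or force a rollback) when work is still outstanding, rather than wait for it.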

Unfortunately this is another aspect of transaction processing that many implementations ignore. As with things like interposition (for performance) and failure recovery, it is an essential aspect that you really cannot do without. Not providing checked transactions is different from allowing checking to be disabled, which most commercial implementations support. In the latter case you typically have the ability to turn checked transactions off when you know it is safe, to help improve performance. If you think there is the slightest possibility you’ll be using multiple threads within the scope of a single transaction or may make asynchronous transactional requests, then you’d better find out whether your transaction implementation is up to the job. I'm not even going to bother with the almost obligatory plug for JBossTS here ;-)
