Monday, October 7, 2019

Software Transactional Memory with Quarkus

We have recently contributed a Quarkus extension called quarkus-narayana-stm which simplifies the use of Software Transactional Memory (STM) in your microservices.

It will be available in the 0.24.0 Quarkus release. If you would like to experiment with it before this release then you can either take one of the nightly builds or you can build it locally by git cloning the Quarkus repo and then running the build. This will add the io.quarkus:quarkus-narayana-stm:999-SNAPSHOT maven dependency to your local maven repository and you may then get started by following the guide. There is also a quickstart that provides a worked example of how to use it in your microservices. The example shows how to manage concurrent accesses to a single counter. More sophisticated usage patterns are the norm, but this simple example gives a flavour of how easy it is to manage concurrency with the Narayana STM implementation.
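
To give a flavour of the API, here is a sketch of what such a counter can look like. The Counter interface and its implementation are illustrative names of my own; @Transactional, @ReadLock, @WriteLock (from org.jboss.stm.annotations), Container (org.jboss.stm) and AtomicAction (com.arjuna.ats.arjuna) are the Narayana STM API:

@Transactional
public interface Counter {
    int get();
    void increment();
}

public class CounterImpl implements Counter {
    private int count;

    @ReadLock
    public int get() { return count; }

    @WriteLock
    public void increment() { count++; }
}

// the Container hands out an STM-managed handle; every update runs
// inside an AtomicAction so concurrent increments stay atomic and isolated
Counter counter = new Container<Counter>().create(new CounterImpl());
AtomicAction txn = new AtomicAction();
txn.begin();
counter.increment();
txn.commit();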

Wednesday, September 18, 2019

Heuristic exceptions

A transaction is finished either with commit or with rollback. But have you considered that there is a third transaction outcome, "unspecified"? This third type of outcome may occur when a participant does not follow the coordinator's orders and makes a contrary decision on its own. Then the outcome of the whole transaction can be inconsistent: some participants could follow the coordinator's guideline to commit while the disobedient participant rolls back. In such a case, the coordinator cannot report back to the application that "the work was successfully finished". From the perspective of an outside observer, consistency is damaged. The coordinator itself cannot do much more: it followed the rules of the protocol but a participant disobeyed. Such a transaction can only be marked with the "unspecified" result, which is known as a heuristic outcome.
The resolution of that outcome requires third-party intervention. Put differently, somebody has to go, verify the state of the data and make corrections.

XA specification

In the scope of this article we talk about the two-phase commit protocol (2PC) and how the XA specification uses it.
The XA specification takes two-phase commit, an abstractly defined consensus protocol, and brings it down to the level of an implementable contract. It defines rules for communication amongst the parties, prescribes a state model, and specifies outcomes, exceptions etc.
Where 2PC talks about a coordinator and participants, the XA specification defines a model where the coordinator is represented by a transaction manager (TM), the participant is modelled as a resource manager (RM), and the employer of them both is an application program which talks to each of them.
The transaction manager drives a global transaction which enlists resource managers as participants responsible for parts of the work. The work managed by an RM is called a transaction branch. In Java, the resource manager is normally represented by code that implements the JTA API while using some internal mechanism to call the underlying data resource. For example, the resource manager could be a JDBC driver and the underlying resource a PostgreSQL database running in some distinct datacenter.

JTA Specification

The Java Transaction API (JTA) is a projection of the XA specification into the Java language. In general, it's a high-level API that strives to provide a comprehensible tool for writing transactional code based on the XA specification.

Several purposes of JTA

The JTA is meant to be used first by the application developer. In terms of the XA specification, that is the application program. Here we have the UserTransaction interface. It gives the developer the chance to begin and commit a transaction. In terms of the XA specification, it's a representation of the global transaction. In a newer version (from version 1.2) JTA defines annotations like @Transactional, which give the user the chance to control the transaction scope declaratively. As you can notice, neither with the UserTransaction nor with the @Transactional annotation can you do much more than define where the transaction begins and where it ends. How do you put a participant into the transaction? That limited capability exists because the developer is expected to run the application in a managed environment, for example a Java EE application server or a different container.
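
For illustration, a minimal sketch of both demarcation styles (assuming a managed environment provides the injected UserTransaction; the class, method names and business logic are placeholders):

public class OrderService {
    @Inject
    UserTransaction userTransaction;

    // programmatic demarcation via UserTransaction
    public void placeOrderProgrammatically() throws Exception {
        userTransaction.begin();
        // ... work on resources enlisted by the container ...
        userTransaction.commit();
    }

    // declarative demarcation via the JTA 1.2 annotation
    @Transactional(Transactional.TxType.REQUIRED)
    public void placeOrderDeclaratively() {
        // ... work on resources enlisted by the container ...
    }
}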

The container is the second consumer of the JTA API. JTA gives it the entry point to the world of transaction management. The container uses the TransactionManager interface, which provides the ability to manage the transaction scope but also gives access to the Transaction object itself. The Transaction is used to enlist a resource (a participant) into the global transaction. The resource used for enlistment is the XAResource interface in JTA. The XAResource is managed by the transaction manager and operated by the resource manager. The container may also register Synchronizations, callbacks invoked at the start and the end of the 2PC processing (used e.g. by JPA).
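
A simplified sketch of what a container does under the covers (xaConnection is an assumed javax.sql.XAConnection; error handling is omitted):

TransactionManager tm = com.arjuna.ats.jta.TransactionManager.transactionManager();
tm.begin();
Transaction txn = tm.getTransaction();

// enlist the participant obtained from the resource manager
txn.enlistResource(xaConnection.getXAResource());

// register callbacks invoked around the 2PC processing (used e.g. by JPA)
txn.registerSynchronization(new Synchronization() {
    public void beforeCompletion() { /* e.g. flush state to the resource */ }
    public void afterCompletion(int status) { /* e.g. clean up caches */ }
});

tm.commit();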

The third perspective where JTA participates is the communication with resource managers. The API defines the class XAResource (representing a participant resource to be enlisted into the global transaction), XAException (representing an error state defined in the XA specification) and Xid. The Xid is an identifier which is unique for each transaction branch and consists of two parts: a unique identifier of the global transaction and a unique identifier of the resource manager. If you want to see how the transaction manager uses these classes to communicate with the resource manager, take a look at the example in the SQL Server documentation.
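
A condensed sketch of the call sequence the transaction manager drives on one resource manager; the helper method is hypothetical and the Xid identifying the transaction branch is assumed to have been created by the transaction manager:

void processOneBranch(XAResource xaRes, Xid xid) throws XAException {
    xaRes.start(xid, XAResource.TMNOFLAGS);  // associate work with the branch
    // ... SQL or messaging work performed through the resource ...
    xaRes.end(xid, XAResource.TMSUCCESS);    // disassociate the branch

    int vote = xaRes.prepare(xid);           // phase one: ask for the promise
    if (vote == XAResource.XA_OK) {
        xaRes.commit(xid, false);            // phase two: commit the branch
    }
    // XA_RDONLY would mean there is nothing to commit; an XAException
    // thrown from prepare makes the transaction manager roll back
}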

Note: If you study the JTA API by looking into the javadoc and strolling through the package summary you may wonder about some other classes which were not mentioned here. Part of the javax.transaction package consists of interfaces used exclusively by JTS (transaction services running with an ORB). That's mitigated by the fact that Java SE 11 removed support for the ORB and those classes were removed from the JDK as well.
Moreover, the JTA classes are now (from Java SE 11) split over Java SE and Java EE bundles: the package javax.transaction.xa is solely part of Java SE while javax.transaction belongs to the Jakarta EE API.

Type of failures

Now that we have covered the model, let's look at the failure states. First, it's necessary to realize that a failure in the application does not mean a failure from the protocol perspective. If there is trouble in the application code, or the network is transiently unavailable, such occurrences can lead to rollback. That's a failure from the perspective of the application, but for the transaction manager the rollback is just another valid state to shift to.
Even if the whole transaction manager crashes (e.g. the underlying JVM is killed), the system should still maintain data consistency, and the transaction is recovered when the transaction manager (or rather the recovery manager) comes back to life.

What is troublesome for the protocol is unexpected behaviour of the resource manager (or the backing resource). We can basically track two incident types: a heuristic outcome where the resource deliberately decides to take some action (different from the transaction manager's decision), or a heuristic outcome caused by a bug in the code (either of the transaction manager or of the resource manager).

Let's discuss some examples of these types.
The deliberate decision could be a situation where the transaction manager calls prepare on a database. The database confirms the prepare, promising to finish the transaction branch with commit. But unfortunately the transaction manager crashes and nobody comes to restart it for a long time. The database decides that it has been holding resources and delaying other transactions for too long, so it commits the work. The processing in the database may continue from that point. But later the transaction manager is restarted and it tries to commit all the other branches belonging to the global transaction (let's say e.g. a JMS broker). That resource responds with an error and the transaction manager decides to roll back the whole global transaction. Now it accesses the database with the request to roll back. But the database has already committed its branch: a heuristic outcome has just occurred.

An example of a bug in the code could be the PostgreSQL JDBC driver. The driver was returning a wrong error code in case of an intermittent connection failure. The XA specification defines that in such a case an XAException should be thrown carrying one of the following error codes: either XAException.XAER_RMFAIL or XAException.XA_RETRY. But the JDBC driver was returning XAException.XAER_RMERR. Such an error code means that an irrecoverable error occurred. It makes the transaction manager think there is no way of automatic recovery, so it switches the state of such a transaction to heuristic immediately.

Heuristic exceptions

In the last part of this article we take a look at the heuristic outcomes of a transaction. A heuristic outcome is represented by an exception being thrown. The exception reports the reason for the failure, either via an error code or via the type of the exception class.

There are two main types of exception classes. The first type is the XAException. This one is part of the communication contract between the transaction manager and the resource manager. Application code should never receive this type of exception, but you can certainly observe the XAException in the container log. It shows that an error happened during transaction processing.

The second type is represented by multiple classes named Heuristic*Exception. These are the exceptions that application code works with. They are thrown from the UserTransaction methods and they are checked exceptions.

Heuristic outcome with XAResource

The XAException reports the reason for failure with the use of error codes whose meaning the XA specification defines. The meaning depends on the context in which a code is used. For example the code XAException.XA_RETRY could be used for reporting an error from commit, with the meaning that the commit action should be retried. But on the other hand it's not permitted as an error code for the one-phase commit.

So where are those heuristic states? Let's check what can happen when the transaction manager invokes the XAResource calls prepare, commit and rollback.
If prepare is called then there is little chance that a heuristic occurs. At that time no promise from the resource has been made and the work can easily be rolled back or, at worst, timed out. The only occurrence that can bring the system to the heuristic state is the resource manager returning a code undefined for this phase. But that is most probably caused only by a bug in the implementation. Consult the XA specification for which codes are valid.
More interesting are the commit and rollback calls. The commit and rollback are (or could be) called after the prepare. A heuristic exception means that the resource promised to commit (it acknowledged the prepare call) but it did not wait for the transaction manager to command the next action and finished the transaction branch on its own. The error codes are those with the prefix XA_HEUR*. Such an independent decision does not mean a protocol error in all cases.

Let's talk about rollback now. The global transaction was successfully prepared but the transaction manager decided in the end to roll it back. It calls rollback on the XAResource. The error XAException.XA_HEURRB announces that the resource manager decided to roll back the transaction branch before it was asked to by the transaction manager. But as the transaction manager decided to go for the rollback too, the heuristic outcome matches the decision.
The XAException.XA_HEURCOM means that all work represented by the transaction branch was committed (at the time the rollback is executed on the XAResource). That's bad for data consistency as some other transaction branches could already have been rolled back.
To explain the meaning of the XAException.XA_HEURMIX it's necessary to mention that a transaction branch could consist of several "local transactions". For example, the PostgreSQL JDBC driver starts a database transaction to insert data into the database. Later (still in the scope of the same global transaction) it decides to update the data and starts another database transaction. The transaction manager is clever enough to join those two database transactions, which belong to the same database resource (controlled by the same resource manager), under one transaction branch. That is good as it can reduce the communication overhead. So XA_HEURMIX says that part of the workload involved in the transaction branch was committed and the other part was rolled back.
The XAException.XA_HEURHAZ says that the resource manager made a decision on its own but it's not capable of saying what the result of that independent decision was.

The most interesting part is the commit call. First, it uses the XA_HEUR* error codes with the same meaning as the rollback call, so everything said in the previous paragraphs applies to commit too. On top of that there are three new error codes. They do not contain the word HEUR but in effect they mean it. XAER_RMERR announces that an unspecified error happened during the currently executing commit call and that, instead of committing, the resource manager just rolled back the transaction branch. That puts us in the same state as with XA_HEURRB. XAER_NOTA says that the resource manager does not know anything about this transaction branch. That means the resource manager lost its notion of it: it either committed it, rolled it back, or may do an arbitrary one of those in the future. That puts us in the same state as with XA_HEURHAZ. The last one is XAER_PROTO which says that the commit was called in a wrong context, for example without prepare being invoked before. This seems similar to XAER_NOTA and thus has the same impact as XA_HEURRB.
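
To make the mapping tangible, here is a simplified sketch (not Narayana's actual code) of how a transaction manager might classify the error codes raised from XAResource.commit; the record/schedule helper methods are hypothetical:

try {
    xaResource.commit(xid, false);
} catch (XAException e) {
    switch (e.errorCode) {
        case XAException.XA_HEURRB:   // branch rolled back on its own
        case XAException.XAER_RMERR:  // commit failed, branch rolled back
        case XAException.XAER_PROTO:  // wrong context, treated like a rollback
            recordHeuristicRollback(xid);
            break;
        case XAException.XA_HEURMIX:  // part committed, part rolled back
            recordHeuristicMixed(xid);
            break;
        case XAException.XA_HEURHAZ:  // outcome of the branch is unknown
        case XAException.XAER_NOTA:   // RM lost track of the branch
            recordHeuristicHazard(xid);
            break;
        case XAException.XA_RETRY:
        case XAException.XAER_RMFAIL: // transient, leave for periodic recovery
            scheduleForRecovery(xid);
            break;
    }
}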

Heuristic outcome with the "application exceptions"

For the "application exceptions" it could be considered easier. The heuristic exceptions can be thrown only from the commit call (see UserTransaction javadoc). The UserTransaction gives chance to finish the transaction with commit or roll-back. The roll-back means that transaction branches should be aborted and all work discarded. When UserTransaction.rollback() is called the resource manager had not promised succesful outcome yet. The time the rollback is called is time when all transaction processing data is available only in memory. Thus resource manager has nothing onto decide differently. If there is some trouble the the other types of exceptions are thrown — like IllegalStateException or SystemException (see the UserTransaction javadoc).
But with the UserTransaction.commit it is different. This call means that two-phase commit protocol is to be started and XAResource.prepare/commit/rollback calls are involved. Here the JTA uses the checked exceptions to inform the application to handle the trouble. The application checked exceptions are RollbackException,HeuristicMixedException,HeuristicRollbackException. The RollbackException is not a heuristic exception (at least by name) but still. That exception informs that even the code asked for commit all work (in all transaction branches) was undone by the rollback. The HeuristicMixedException means that some transaction branches were committed and others were rolled-back. This is exception thrown for example if during the commit phase of 2PC. One of the XAResource.commit calls returns XAException.XA_HEURRB (aka it was rolled-back) while the others were succesfully committed.
The HeuristicRollbackException has the same final outcome from the global transaction perspective as the RollbackException. It only emphasizes that the fact that the roll-back was deliberately chosen by all the resources prior to the commit was executed by the transaction manager. In comparison, the RollbackException means that the transaction manager was just trying to commit all resources but during the process of committing trouble occurred and all the work was rolled-back (all resources rolled-back). To be perfectly honest I'm not sure I can't see a real difference between these two.

We have talked about all the exceptions defined on UserTransaction.commit, so we are done. Oh wait, we are not!
There is one more exception defined in the javax.transaction package: the HeuristicCommitException. The HeuristicCommitException is not declared on UserTransaction.commit because even if all resources independently decided to commit, the global transaction result is still simply "committed", which is what was intended when UserTransaction.commit was called. Then what is its purpose?
We need to look into the implementation. It's used for the commit and rollback calls of a subordinate transaction. A subordinate transaction is a transaction which is driven by a parent transaction. The parent transaction (also named the top-level transaction) manages the subordinate and decides the overall outcome.
When the subordinate transaction is commanded, it reports the outcome back to the top-level one. It's a relation similar to the one the XAResource has to the global transaction. Because the subordinate transaction needs to report heuristic decisions back from the commit and rollback calls, the HeuristicCommitException serves the case when the subordinate transaction decided to commit before the top-level transaction commanded the final action.

NOTE: Don't confuse the subordinate transaction with a nested transaction. If a nested transaction is aborted, the upper transaction can continue processing and can still finish with commit at the end (but if the top-level transaction rolls back, the nested transaction has to roll back as well).
The subordinate transaction, in contrast, is a composite part of the top-level transaction. If the subordinate transaction aborts, the top-level one aborts as well.

Summary

That's all. Hopefully, you understand a bit more about the meaning of the heuristic outcomes in the XA and JTA specifications. And for sure you won't be writing code like

try {
  userTransaction.begin();
  ...
  userTransaction.commit();
} catch (Throwable t) {
  // some strange error happened so we print it to the log
  t.printStackTrace();
}
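
Instead, a sketch of handling the checked outcomes explicitly could look as follows (userTransaction is an injected or looked-up instance; what you actually do in each branch depends on your application):

try {
    userTransaction.begin();
    // ... business logic enlisting one or more XA resources ...
    userTransaction.commit();
} catch (RollbackException e) {
    // all branches rolled back; data stays consistent, the work is undone
} catch (HeuristicMixedException e) {
    // some branches committed while others rolled back: data may be
    // inconsistent, alert an administrator to resolve it manually
} catch (HeuristicRollbackException e) {
    // the resources heuristically rolled back before commit was issued
} catch (NotSupportedException | SystemException e) {
    // nested transaction not supported / unexpected transaction manager error
}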

Wednesday, June 26, 2019

Expiry scanners and object store in Narayana

What are the expiry scanners?

The expiry scanner serves for garbage collection of aged transaction records in Narayana.
Before elaborating on that statement, let's first find out why such functionality is needed.

Narayana object store and transaction records

Narayana creates persistent records when processing transactions. These records are saved to the transaction log called the Narayana object store. The records are utilized during transaction recovery when a transaction failure happens. Usual reasons for a transaction failure are a crash of the JVM, a network connection issue, or an internal error on the remote participant. The records are created during the processing of transactions and are removed immediately after the transaction successfully finishes (regardless of the transaction outcome – commit or rollback). That implies that the Narayana log contains only the records of the currently active transactions and of the failed ones. The records of active transactions are expected to be removed when the transaction finishes. The records of failed transactions are stored until they are recovered – finished by periodic recovery – or until they are resolved by human intervention.
...or until they are garbage collected by the expiry scanner.

Narayana stores transaction records in a hierarchical structure. The location in the hierarchy depends on the type of the record. The object store can be kept on the hard drive – either as a directory structure or in the journal store (the implementation used comes from the ActiveMQ Artemis project) – or it can be placed in a database via a JDBC connection.

NOTE: The Narayana object store saves data about transaction processing, but the same storage is also used to persist other runtime data which is expected to survive a crash of the JVM.

Object store records for JTA and JTS

Transaction processing records are stored differently depending on whether JTA or JTS mode is used. JTA runs transactions inside a single JVM, while JTS is designed to support distributed transactions. When JTS is used, the components of the transaction manager are not coupled inside the same JVM. The components communicate with each other via messages, regardless of whether the components run within the same JVM, as different processes, or on different nodes. JTS mode saves more transaction processing data to the object store than the JTA alternative.

For standard transaction processing, JTA starts with enlisting participants under the global transaction. Then the two-phase commit starts and prepare is called on each participant. When the prepare phase of 2PC ends, a record informing about the success of the phase is stored in the object store. After this point, the transaction is predetermined to commit (until that point a rollback would be processed in case of a failure, see presumed rollback). The 2PC commit phase is processed by calling commit on each participant. After this phase ends, the record is deleted from the object store.
The prepare "tombstone record" not only informs about the success of the phase but also contains information about the successfully prepared participants which were part of the transaction.
 
This is what the transaction object store looks like after the prepare was successfully processed. The type which represents the JTA tombstone record is StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction.
data/tx-object-store/
ShadowNoFileLockStore
└── defaultStore
   ├── EISNAME
   │   ├── 0_ffff0a000007_6d753eda_5d0f2fd1_34
   │   └── 0_ffff0a000007_6d753eda_5d0f2fd1_3a
   └── StateManager
       └── BasicAction
           └── TwoPhaseCoordinator
               └── AtomicAction
                   └── 0_ffff0a000007_6d753eda_5d0f2fd1_29
In the case of JTS, the processing runs mostly the same way, with two differences. The first is that JTS saves more setup data (created once during the initialization of the transaction manager, see FactoryContact, RecoveryCoordinator). The second difference to JTA is that JTS stores the information about each prepared participant separately: for JTS the participants are separate entities and each of them handles its persistence on its own. Because of that, a "prepare record" is created for each participant separately (see Mark's clarification below in the comments). When XAResource.prepare is called, a record of type CosTransactions/XAResourceRecord is created. When XAResource.commit is called, the record is deleted. After the 2PC prepare is successfully finished, the record StateManager/BasicAction/TwoPhaseCoordinator/ArjunaTransactionImple is created, and it is removed when the 2PC commit phase is finished. The ArjunaTransactionImple record is the prepare "tombstone record" for JTS.
Take a look at what the object store with two participants and a finished 2PC prepare phase looks like
data/tx-object-store/
ShadowNoFileLockStore
└── defaultStore
   ├── CosTransactions
   │   └── XAResourceRecord
   │       ├── 0_ffff0a000007_-55aeb984_5d0f33c3_4b
   │       └── 0_ffff0a000007_-55aeb984_5d0f33c3_50
   ├── Recovery
   │   └── FactoryContact
   │       └── 0_ffff0a000007_-55aeb984_5d0f33c3_15
   ├── RecoveryCoordinator
   │   └── 0_ffff52e38d0c_c91_4140398c_0
   └── StateManager
       └── BasicAction
           └── TwoPhaseCoordinator
               └── ArjunaTransactionImple
                   └── 0_ffff0a000007_-55aeb984_5d0f33c3_41

Now, what about the failures?

When the JVM crashes, or a network error or another transaction error happens, the transaction manager stops processing the current transaction. Depending on the type of failure it abandons the state and passes the responsibility for finishing the transaction to the periodic recovery manager. That's the case e.g. for the "clean" failures – the JVM crash or the network crash. The periodic recovery starts processing when the system is restarted and/or it periodically retries connecting to the participants to finish the transaction.
Continuing with the object store example above: a JVM crash and a subsequent restart make the periodic recovery observe that the 2PC prepare was finished – there is the AtomicAction/ArjunaTransactionImple record in the object store. The recovery manager lists the participants (represented by XAResources) which were part of the transaction and tries to commit them.

ARJUNA016037: Could not find new XAResource to use for recovering non-serializable XAResource

Let me make a quick side note about one interesting point in the processing, interesting at least from the Narayana perspective.
If you have been using the Narayana transaction manager for some time you are probably well familiar with this log error message:

[com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016037: Could not find new XAResource to use for recovering non-serializable XAResource XAResourceRecord

This warning means: there was a successfully prepared transaction, as we can observe the record in the object store. But the periodic recovery manager is not capable of finding out which counterparty participant – e.g. which database or JMS broker – the record belongs to.
This situation happens when the failure (JVM crash) occurs at a specific time: just after XAResource.commit is called. That call makes the participant (the remote side – e.g. the database) remove its knowledge about the transaction from its resource-local storage. But at that particular point in time, the transaction record had not yet been removed from the Narayana object store.
Since the JVM crash happened, after the application restarts the periodic recovery observes a record in the object store. It tries to match such a record to the information obtained from the participant's resource-local storage (using the XAResource.recover call).
 
As the participant's resource-local storage was cleaned, there is no information to obtain. Now the periodic recovery does not see any information directly matching its record in the object store.
From the above, we can see why the periodic recovery complains: there is a participant record which does not contain "connection data" as it's non-serializable, and there is no matching record in the participant's resource-local storage.

NOTE: One possibility to get rid of the warning in the log would be to serialize all the information about the participant (serializing the XAResource). Such serialized participants would provide an easy way for the periodic recovery manager to directly call methods on the deserialized instance (XAResource.recover). But it would mean serializing e.g. the JDBC connection, which is hardly possible.

The description above explains the JTA behaviour. In the case of JTS, if the transaction manager finds a record in the object store which does not match any participant's resource-local storage info, the object store record is considered assumed completed. Such a consideration means changing the type of the record in the object store. Changing the type means moving the record to a different place in the hierarchical structure of the object store. When the record is moved to a place unknown to the periodic recovery, it stops considering it a problematic one and stops printing warnings to the application log. The record is then saved under ArjunaTransactionImple/AssumedCompleteServerTransaction in the hierarchical structure.
This conversion of the in-doubt record to the assumed completed one happens by default after 3 cycles of recovery. Changing the number of cycles can be done by providing the system property -DJTSEnvironmentBean.commitedTransactionRetryLimit=…

The ARJUNA016037 warning has been a topic in various discussions

The warning is shown again and again in the application log. It's shown each time the periodic recovery runs, as it informs that there is a record it doesn't know what to do with.

NOTE: The periodic recovery runs by default every 2 minutes.

Now, what can we do about that?


Fortunately, an enhancement of the recovery processing has been in Narayana for some time already. When the participant driver (i.e. the resource manager "deployed" in the same JVM) implements the Narayana SPI XAResourceWrapper, it provides the information about which resource is the owner of the participant record. Narayana periodic recovery is then capable of deducing whether the orphaned object store record belongs to the particular participant's resource-local storage. Then it can assume that the participant has already committed its work. Narayana can update its own object store and the periodic recovery stops showing the warnings.
An example of the usage of the SPI is in the ActiveMQ Artemis RA.
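
A sketch of such a wrapper could look as follows. This assumes the SPI interface org.jboss.tm.XAResourceWrapper with the four metadata methods shown; the product name and version values, the class name and the delegation style are illustrative:

public class BrokerXAResourceWrapper implements org.jboss.tm.XAResourceWrapper {
    private final XAResource delegate;
    private final String jndiName;

    public BrokerXAResourceWrapper(XAResource delegate, String jndiName) {
        this.delegate = delegate;
        this.jndiName = jndiName;
    }

    // metadata that lets recovery match an orphaned record to its owner
    public XAResource getResource() { return delegate; }
    public String getProductName() { return "MyMessagingBroker"; }
    public String getProductVersion() { return "1.0"; }
    public String getJndiName() { return jndiName; }

    // the rest of the XAResource contract plainly delegates
    public void start(Xid xid, int flags) throws XAException { delegate.start(xid, flags); }
    public void end(Xid xid, int flags) throws XAException { delegate.end(xid, flags); }
    public int prepare(Xid xid) throws XAException { return delegate.prepare(xid); }
    public void commit(Xid xid, boolean onePhase) throws XAException { delegate.commit(xid, onePhase); }
    public void rollback(Xid xid) throws XAException { delegate.rollback(xid); }
    public void forget(Xid xid) throws XAException { delegate.forget(xid); }
    public Xid[] recover(int flag) throws XAException { return delegate.recover(flag); }
    public boolean isSameRM(XAResource other) throws XAException { return delegate.isSameRM(other); }
    public int getTransactionTimeout() throws XAException { return delegate.getTransactionTimeout(); }
    public boolean setTransactionTimeout(int seconds) throws XAException { return delegate.setTransactionTimeout(seconds); }
}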

Transaction processing failures

Back to the transaction processing failures (JVM crash, network failure, internal participant error).
As mentioned the "clean failures" can be automatically handled by the periodic recovery. But the "clean" failures are not the only ones you can experience. The XA protocol permits a heuristic failure. Those are failures which occurs when the participant does not follow the XA protocol. Such failures are not automatically recoverable by periodic recovery. Human intervention is needed.
 
Such failures occur mostly because of an internal error at the remote participant. An example of such a failure could be that the transaction manager commands the resource to commit with an XAResource.commit call, but the resource manager responds that it has already rolled back the resource transaction arbitrarily. In such a case, Narayana saves this unexpected state into the object store. The transaction is marked as having a heuristic outcome. The periodic recovery then observes the heuristic record in the object store and informs about it during each cycle.
Now, it's the responsibility of the administrator to get an understanding of the transaction state and handle it.
But if the administrator does not process such a transaction for a very long time then...

Expiry scanners

...then we are back on track to the expiry scanners.
What does it mean for a record to stay in the object store for a very long time?

The "very long time" is by default 12 hours for Narayana. It's the default time after when the garbage collection process starts. This garbage collection is the responsibility of the expiry scanners. The purpose is cleaning the object store from the long staying records. When there is a record left in the heuristic state for 12 hours in the object store or there is a record without the matching participant's resource local storage info in the object store then the expiry scanner handles it. The purpose of such handling causes is the periodic recovery stops to observe the existence of such in-doubt participant and subsequently to stop complaining about the existence of the record.

Handling a record means either moving it to a different place (changing the type of the record and thus its location in the hierarchical structure) or removing it completely from the object store.

Available implementations of the expiry scanner

For the JTA transaction types, the following expiry scanners are available in Narayana
  • AtomicActionExpiryScanner : moves records representing prepared transactions (AtomicAction) to a lower place in the hierarchy named /Expired.
  • ExpiredTransactionStatusManagerScanner : removes records about the connection setup for the transaction status manager. This record is not connected with transaction processing and represents Narayana runtime data.

For the JTS transaction types, the following expiry scanners are available in Narayana
  • ExpiredToplevelScanner : removes ArjunaTransactionImple/AssumedCompleteTransaction records from the object store. The AssumedCompleteTransaction originates from the type ArjunaTransactionImple and is moved to the assumed type by the JTS periodic recovery processing.
  • ExpiredServerScanner : removes ArjunaTransactionImple/AssumedCompleteServerTransaction records from the object store. The AssumedCompleteServerTransaction originates from the type ArjunaTransactionImple/ServerTransaction/JCA and is moved to the assumed type by the JTS periodic recovery processing.
  • ExpiredContactScanner : removes the records which let the recovery manager know which Narayana instance belongs to which JVM. This record is not connected with transaction processing and represents Narayana runtime data.

Setup of expiry scanners classes

As explained elsewhere, Narayana can be set up either with system properties passed directly to the Java program or defined in the file descriptor jbossts-properties.xml. If you run the WildFly application server, the system properties can be defined on the command line with -D… when starting the application server with the standalone.sh/bat script, or they can be persistently added into the bin/standalone.conf config file.
The class names of the expiry scanners that will be active after Narayana initialization can be defined by the property com.arjuna.ats.arjuna.common.RecoveryEnvironmentBean.expiryScannerClassNames or RecoveryEnvironmentBean.expiryScannerClassNames (named differently, doing the same service). The property contains the fully qualified class names of implementations of the ExpiryScanner interface. The class names are separated by a space or a new line.
An example of such settings can be seen in the Narayana quickstarts. Or, defined directly here, it is
-DRecoveryEnvironmentBean.expiryScannerClassNames="com.arjuna.ats.internal.arjuna.recovery.ExpiredTransactionStatusManagerScanner com.arjuna.ats.internal.arjuna.recovery.AtomicActionExpiryScanner"

NOTE: when you configure the WildFly app server you are allowed to use only the shortened property name -DRecoveryEnvironmentBean.expiryScannerClassNames=…. The longer variant does not work because of the way the issue WFLY-951 was implemented.

NOTE2: when you are running the WildFly app server, the expiry scanners enabled by default can be observed by looking into the source code at ArjunaRecoveryManagerService (consider the variants for JTA and JTS modes).

Setup of expiry scanners interval

To configure the time interval after which an "orphaned" record is handled as expired, you can use the property named com.arjuna.ats.arjuna.common.RecoveryEnvironmentBean.expiryScanInterval or RecoveryEnvironmentBean.expiryScanInterval. The value can be a positive whole number; such a number defines that the records expire after that number of hours. If you define the value as a negative whole number, the first run of the expiry scanner is skipped; subsequent runs expire the records after that (positive) number of hours. If you define the value as 0, records are never handled by the expiry scanners.
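
For example, to expire the records after 24 hours instead of the default 12:

-DRecoveryEnvironmentBean.expiryScanInterval=24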


That's all in terms of this article. Feel free to ask a question here or at our forum at https://developer.jboss.org/en/jbosstm.

Monday, April 29, 2019

JTA and CDI integration

The Narayana release 5.9.5.Final comes with a few nice CDI functionality enhancements. This blogpost introduces these changes while placing them in the context of the JTA and CDI integration, particularly with a focus on Weld.

TL;DR

The fastest way to find out how to use JTA with CDI is to walk through the Narayana CDI quickstart.

JTA and CDI specifications

JTA version 1.2 was published in 2013. This version introduced the integration of JTA with CDI. The specification came with the definition of the annotations javax.transaction.Transactional and javax.transaction.TransactionScoped. Those two provide a way to define transaction boundaries and to handle application data bound to the transaction.

Narayana, as an implementation of the JTA specification, provides those capabilities in its CDI maven module.
Here are the maven coordinates:
<groupId>org.jboss.narayana.jta</groupId>
<artifactId>cdi</artifactId>

The module brings the Narayana CDI extension to the user's project. The extension installs interceptors which manage transaction boundaries for method invocations annotated with @Transactional. The extension also defines a transaction scope declared with the @TransactionScoped annotation.

On top of the functionality defined in the JTA specification, the CDI specification defines some more transaction-related features. They are the transactional observer methods and the definition of the javax.transaction.UserTransaction built-in bean.

Let's summarize what that all means in practice.

@Transactional

With the use of the @Transactional annotation, transaction boundaries can be controlled declaratively. The use of the annotation is really similar to container-managed transactions in EJB.

When the annotation is used on a bean or a method, the Narayana CDI extension (a CDI interceptor is used) verifies the existence of the transaction context when the method is called. Based on the value of the value parameter, an appropriate action is taken. The value is defined by the enumeration Transactional.TxType.
For example, when @Transactional(Transactional.TxType.REQUIRES_NEW) is used on a method, a new transaction is started at the start of its execution. If the incoming method call contains an existing transaction, it's suspended during the method execution and then resumed after the method finishes. For details about the other Transactional.TxType values, consult the javadoc documentation.

NOTE: be aware of the fact that for the CDI container to be able to intercept the method call, the CDI-managed instance has to be used. For example, when you want to use this capability for calling an inner bean, you must use an injected reference to the bean itself.

@RequestScoped
public class MyCDIBean {
  @Inject
  MyCDIBean myBean;

  @Transactional(TxType.REQUIRED)
  public void mainMethod() {
    // CDI container does not wrap the invocation,
    // no new transaction is started
    innerFunctionality();

    // CDI container starts a new transaction as the method
    // uses TxType.REQUIRES_NEW and is called on the injected CDI bean
    myBean.innerFunctionality();
  }

  // the method must not be private, otherwise the CDI container
  // cannot intercept its invocation
  @Transactional(TxType.REQUIRES_NEW)
  public void innerFunctionality() {
    // some business logic
  }
}
  

@TransactionScoped

@TransactionScoped brings an additional scope type beyond the standard built-in ones. A bean annotated with @TransactionScoped, when injected, lives in the scope of the currently active transaction. The bean remains bound to the transaction even when the transaction is suspended. On resuming the transaction, the scoped data is available again. If a user tries to access the bean outside the scope of an active transaction, a javax.enterprise.context.ContextNotActiveException is thrown.
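
A hypothetical example of such a bean (note that @TransactionScoped is a passivating scope, hence the bean implements Serializable):

@TransactionScoped
public class ShoppingCart implements Serializable {
    private final List<String> items = new ArrayList<>();

    // the list lives and dies with the currently active transaction
    public void add(String item) {
        items.add(item);
    }

    public List<String> getItems() {
        return items;
    }
}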

Built-in UserTransaction bean

The CDI specification declares that a Java EE container has to provide a built-in bean for the UserTransaction which can be @Injected. Notice that a standalone CDI container has no obligation to provide such a bean; the availability is expected only in a Java EE container. In Weld, this integration for the Java EE container is provided through the SPI interface TransactionServices.

If somebody wants to use Weld integrated with the Narayana JTA implementation in a standalone application, he needs to implement this SPI interface (see more below).

Transaction observer methods

The feature of the transactional observer methods allows defining an observer with the during parameter of the @Observes annotation. The during parameter takes a value from the TransactionPhase enumeration and defines when the event will be delivered to the observer. The event is fired during transaction processing in the business logic, but the delivery is deferred until the transaction reaches the status defined by the during parameter.
The during parameter can take the values BEFORE_COMPLETION, AFTER_COMPLETION, AFTER_FAILURE and AFTER_SUCCESS. Using the value IN_PROGRESS means the event is delivered to the observer immediately when it's fired; it behaves as if no during parameter was used.

The implementation is based on the registration of a transaction synchronization. When the event is fired, a special new synchronization is registered which is invoked by the transaction manager afterwards. The registered CDI synchronization code then arranges to launch the observer method and deliver the event.

For the during parameter to work and for the events to be deferred, Weld requires integration through the TransactionServices SPI. The interface defines a method which makes it possible for Weld to register the transaction synchronization. If the TransactionServices integration is not provided, the user can still use the during parameter in his code. But(!) no matter what TransactionPhase value is used, the event is not deferred and is immediately delivered to the observer. The behaviour is the same as when the IN_PROGRESS value is used.

Maybe it would be good to clarify who fires the event. The event is fired by the user code. For example, take a look at the example in the Weld documentation. The user code injects an event and fires it when it considers it necessary.

@Inject @Any Event<Product> productEvent;
...
public void persist(Product product) {
  em.persist(product);
  productEvent.select(new AnnotationLiteral<Created>(){}).fire(product);
}
The observer is defined in the standard way, using during for the event delivery to be deferred until the transaction finishes with success.
void addProduct(@Observes(during = AFTER_SUCCESS) @Created Product product) {
...
}

A bit more about TransactionServices: Weld and JTA integration

As said, for the integration of the Weld CDI with JTA it's necessary to implement the TransactionServices SPI interface. The interface gives Weld the chance to obtain the UserTransaction, so the built-in bean can provide it when it's @Injected. It provides the way to register a transaction synchronization so that an event can be deferred until a particular transaction status occurs. On top of that, it demands an implementation of the method isTransactionActive. The TransactionScoped context is active only when there is some active transaction; this way Weld is able to obtain the transaction activity state.
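
A minimal sketch of such an implementation, assuming the Weld SPI interface org.jboss.weld.transaction.spi.TransactionServices with the methods shown and backed by the Narayana standalone JTA singletons (error handling is simplified):

public class NarayanaTransactionServices implements TransactionServices {

    @Override
    public void registerSynchronization(Synchronization synchronizedObserver) {
        try {
            com.arjuna.ats.jta.TransactionManager.transactionManager()
                .getTransaction().registerSynchronization(synchronizedObserver);
        } catch (Exception e) {
            throw new RuntimeException("Cannot register synchronization", e);
        }
    }

    @Override
    public boolean isTransactionActive() {
        try {
            return com.arjuna.ats.jta.TransactionManager.transactionManager()
                .getStatus() == Status.STATUS_ACTIVE;
        } catch (SystemException se) {
            return false;
        }
    }

    @Override
    public UserTransaction getUserTransaction() {
        return com.arjuna.ats.jta.UserTransaction.userTransaction();
    }

    @Override
    public void cleanup() {
        // nothing to release in this sketch
    }
}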

Regarding the implementation, you can look at how the interface TransactionServices is implemented in WildFly, or in a more standalone way in SmallRye Context Propagation.

New Narayana CDI features

Narayana brings two new CDI JTA integration capabilities, on top of those described above.

The first enhancement is the addition of the transaction scope events. Up to now, Narayana did not fire the scope events for @TransactionScoped. From now on, the scope events are fired automatically by Narayana. The user can observe the initialization and the destruction of the transaction scope. The code for the observer could look like

void transactionScopeActivated(
  @Observes @Initialized(TransactionScoped.class) final Transaction event,
  final BeanManager beanManager) {
...
}
The event payload for the @Initialized event is the javax.transaction.Transaction; for @Destroyed it is just a java.lang.Object (when the transaction scope is destroyed there is no active transaction anymore).
As Narayana currently implements CDI version 1.2, no event is fired for @BeforeDestroyed. That scope event was introduced in CDI version 2.0.

The second enhancement is the addition of two built-in beans which can be @Injected in the user code. Those are the beans TransactionManager and TransactionSynchronizationRegistry.

The implementation gives priority to the JNDI binding. If a TransactionManager/TransactionSynchronizationRegistry is bound in JNDI, that instance is returned at the injection point.
If the user defines his own CDI bean or a CDI producer which provides an instance of those two classes, such an instance is grabbed for the injection.
As the last resort, the default Narayana implementations of both classes are used; you can expect the TransactionManagerImple and the TransactionSynchronizationRegistryImple.
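
With the new built-in beans the user code can then simply inject both, for example (a hypothetical bean):

@ApplicationScoped
public class TxInfrastructureClient {
    @Inject
    TransactionManager transactionManager;

    @Inject
    TransactionSynchronizationRegistry synchronizationRegistry;

    @Transactional
    public void doWork() {
        // the injected instances resolve to the JNDI-bound ones,
        // a user-provided producer, or the Narayana defaults
        synchronizationRegistry.putResource("key", "value");
    }
}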

Using the transactional CDI extension

The easiest way to see the integration in action is to run our JTA standalone quickstart. You can observe the implementation of the Weld SPI interface TransactionServices. You can check the use of the observers, both the transactional observer methods and the transaction scope observers. On top of that, you can see the use of the transaction scope and the use of the injection of the TransactionManager.

Acknowledgement

Big thanks to Laird Nelson who contributed the new CDI functionality enhancements to Narayana.
And secondly thanks to Matěj Novotný for his help in understanding the CDI topic.

Friday, October 19, 2018

Narayana integration with Agroal connection pool

Project Agroal defines itself as "The natural database connection pool". And that's what it is.

It was developed by Luis Barreiro, who works on WildFly as a performance engineer. This prefigures what you can expect: a well-performing database connection pool. As Agroal comes from the portfolio of the WildFly projects, it offers smooth integration with WildFly and with Narayana too.

In previous posts we checked other connection pools that you can use with Narayana: either the transactional driver provided by Narayana or DBCP2, which is nicely integrated to be used with Narayana in Apache Tomcat. Another option is the use of IronJacamar, which lives in a long-term brotherhood with Narayana. All those options are nicely documented in our quickstarts.

Agroal is another member of that party and you should consider checking it out, either when running a standalone application with Narayana or when you run on WildFly. Let's take a look at how you can use it in a standalone application first.

Agroal with Narayana standalone

In case you want to use the Agroal JDBC pooling capabilities with Narayana in your application, you need to configure the Agroal datasource to know

  • how to grab the instance of the Narayana transaction manager
  • where to find the synchronization registry
  • how to register resources to Narayana recovery manager

Narayana setup

First we need to obtain all the mentioned Narayana objects, which are then passed to Agroal. Agroal ensures the integration by calling the Narayana API at the appropriate moments.

// gaining the transaction manager and synchronization registry
TransactionManager transactionManager
    = com.arjuna.ats.jta.TransactionManager.transactionManager();
TransactionSynchronizationRegistry transactionSynchronizationRegistry
    = new com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionSynchronizationRegistryImple();

// initialization of the recovery manager
RecoveryManager recoveryManager
    = com.arjuna.ats.arjuna.recovery.RecoveryManager.manager();
recoveryManager.initialize();
// the recovery service provides the binding for hooking the XAResource into the recovery process
RecoveryManagerService recoveryManagerService
    = new com.arjuna.ats.jbossatx.jta.RecoveryManagerService();
recoveryManagerService.create();

Agroal integration

Now we need to pass the Narayana object instances to Agroal. With that done, we can obtain a JDBC Connection which is backed by the transaction manager.


AgroalDataSourceConfigurationSupplier configurationSupplier
    = new AgroalDataSourceConfigurationSupplier()
        .connectionPoolConfiguration(cp -> cp
            .transactionIntegration(new NarayanaTransactionIntegration(
                transactionManager, transactionSynchronizationRegistry,
                "java:/agroalds1", false, recoveryManagerService))
            .connectionFactoryConfiguration(cf -> cf
                .jdbcUrl("jdbc:h2:mem:test")
                .principal(new NamePrincipal("testuser"))
                .credential(new SimplePassword("testpass"))
                .recoveryPrincipal(new NamePrincipal("testuser"))
                .recoveryCredential(new SimplePassword("testpass"))
                .connectionProviderClassName("org.h2.jdbcx.JdbcDataSource"))
            .maxSize(10)
        );
AgroalDataSource ds1 = AgroalDataSource.from(configurationSupplier);

transactionManager.begin();

Connection conn1 = ds1.getConnection();
...

Those are the steps needed for a standalone application to use Narayana with Agroal. The working code example can be seen in the Narayana quickstart at github.com/jbosstm/quickstart#agroal - AgroalDatasource.java

Agroal XA datasource in WildFly

If you want to use the power of Narayana in a WildFly application you need XA participants that Narayana can drive. From the Agroal perspective, you need to define an xa-datasource which you use (linked via its JNDI name) in your application.

DISCLAIMER: to use the Agroal capabilities integrated with Narayana you will need to run WildFly 15 or later. Currently only WildFly 14 is available, so for testing this you need to build WildFly from the sources by yourself. The good news is that it's an easy task; see https://github.com/wildfly/wildfly/#building.

The Agroal datasource subsystem is not available by default in the standalone.xml file, so you need to enable that extension. When you run the jboss-cli commands, you do it like this

cd $JBOSS_HOME
./bin/jboss-cli.sh -c

#  jboss-cli is started, run the following commands there
/extension=org.wildfly.extension.datasources-agroal:add
/subsystem=datasources-agroal:add()
:reload

From now on you can work with the datasources-agroal subsystem. To create the xa-datasource definition you need a driver which the datasource will use. The driver has to define its XA connection provider.

NOTE: if you want to check the options for the Agroal configuration in the jboss-cli, read the resource description with the command /subsystem=datasources-agroal:read-resource-description(recursive=true)

The Agroal driver definition works only with drivers deployed as modules. You can't just copy the driver jar to the $JBOSS_HOME/standalone/deployments directory; you need to create a module under the $JBOSS_HOME/modules directory. You can either create the module.xml by yourself or, the recommended way, use the jboss-cli with the command

module add --name=org.postgresql
    --resources=/path/to/jdbc/driver.jar --dependencies=javax.api,javax.transaction.api

NOTE: The command uses the module name org.postgresql as I will demonstrate adding the xa-datasource for the PostgreSQL database.

When the module is added we can declare the Agroal driver.

/subsystem=datasources-agroal/driver=postgres:add(
    module=org.postgresql, class=org.postgresql.xa.PGXADataSource)

We've used the class org.postgresql.xa.PGXADataSource as we want to use it for an XA datasource. When the class is not defined, the standard JDBC driver for PostgreSQL (org.postgresql.Driver) is used, as declared in the META-INF/services/java.sql.Driver file.

NOTE: If you declare the driver without the XA datasource class being defined and then try to use it in an xa-datasource definition, you will get an error

/subsystem=datasources-agroal/driver=non-xa-postgres:add(module=org.postgresql)
/subsystem=datasources-agroal/xa-datasource=AgroalPostgresql:add(
    connection-factory={driver=non-xa-postgres},...)
{
    "outcome" => "failed",
    "failure-description" => {"WFLYCTL0080: Failed services" => {"org.wildfly.data-source.AgroalPostgresql"
        => "WFLYAG0108: An xa-datasource requires a javax.sql.XADataSource as connection provider. Fix the connection-provider for the driver"}
},
    "rolled-back" => true
}

When the JDBC driver module is defined we can create the Agroal XA datasource. The bare minimum of attributes you have to define is shown in the following command

/subsystem=datasources-agroal/xa-datasource=AgroalPostgresql:add(
    jndi-name=java:/AgroalPostgresql, connection-pool={max-size=10}, connection-factory={
    driver=postgres, username=test, password=test,url=jdbc:postgresql://localhost:5432/test})

NOTE: this is the simplest way of defining the credentials for the connection to the database. If you prefer a more sophisticated method than just username/password as clear strings saved in standalone.xml, take a look at the Elytron capabilities.

To check whether the WildFly Agroal datasource is able to connect to the database you can use the test-connection command

/subsystem=datasources-agroal/xa-datasource=AgroalPostgresql:test-connection()

If you are interested in how the configuration looks as an xml element in the standalone.xml configuration file, the Agroal subsystem with the PostgreSQL XA datasource definition would look like

<subsystem xmlns="urn:jboss:domain:datasources-agroal:1.0">
    <xa-datasource name="AgroalPostgresql" jndi-name="java:/AgroalPostgresql">
        <connection-factory driver="postgres" url="jdbc:postgresql://localhost:5432/test"
            username="test" password="test"/>
        <connection-pool max-size="10"/>
    </xa-datasource>
    <drivers>
        <driver name="postgres" module="org.postgresql" class="org.postgresql.xa.PGXADataSource"/>
    </drivers>
</subsystem>

If you want to use an Agroal non-xa datasource as a commit markable resource (CMR), that's possible too. You need to create a standard datasource and define it as connectable. For more information on what the commit markable resource means and how it works, check our previous blogpost about CMR.

<subsystem xmlns="urn:jboss:domain:datasources-agroal:1.0">
    <datasource name="AgroalPostgresql" connectable="true" jndi-name="java:/AgroalPostgresql"
            statistics-enabled="true">
        <connection-factory driver="postgres" url="jdbc:postgresql://localhost:5432/test"
            username="test" password="test"/>
        <connection-pool max-size="10"/>
    </datasource>
    <drivers>
        <driver name="postgres" module="org.postgresql" class="org.postgresql.Driver"/>
    </drivers>
</subsystem>

NOTE: In addition to this configuration of the Agroal datasource you need to enable CMR in the transaction subsystem too; check the blogpost for detailed info.

Summary

This blogpost showed how to configure the Agroal JDBC pooling library and how to integrate it with Narayana.
The code example is part of the Narayana quickstarts and you can check it at https://github.com/jbosstm/quickstart/tree/master/agroal

Sunday, September 9, 2018

Tips on how to evaluate STM implementations

Software Transactional Memory (STM) is a way of providing transactional behaviour for threads operating on shared memory. A transaction is an atomic and isolated set of changes to memory such that prior to commit no other thread sees the memory updates; after commit the changes appear to take effect instantaneously, so other threads never see partial updates; and on abort all of the updates are discarded.

Unlike other models such as XA, OTS, JTA, WS-AT etc, with STM there is no accepted standard for developers to program against. Consequently the various implementations of STM differ in important respects, which has consequences for how application developers build their software. I recently came upon an excellent book on Transactional Memory in which the authors James Larus and Ravi Rajwar presented a taxonomy of features and characteristics that can be used to differentiate the various STM implementations from each other. In this and subsequent blogs I will explain the taxonomy and identify where the Narayana STM solution (which was introduced in Mark Little's initial blog on the topic) fits into it. Towards the end of the series I will include some tips, best practices and advice on how you can get the most out of the Narayana implementation of STM.

In this first article I will cover isolation, nesting and exception handling. In later articles I will discuss topics such as conflict detection and resolution, transaction granularity, concurrency control etc.

By way of motivation, why would one want to use STM in favour of other transaction models and concurrency control mechanisms:
  • The STM approach of mutating data inside of a transaction has some nice features:
    • It is less error prone since the demarcation of an atomic block of code is a single primitive, whereas other synchronisation approaches are many and varied. Techniques such as locks, semaphores, signals etc are tricky to get right; for example the programmer must ensure that accesses are protected with the correct locks and in the correct order. With conventional concurrency control, imagine trying to reverse all the changes made during a computation if a problem such as deadlock or a data race is detected, whereas code changes that are protected by STM can be aborted in a single statement.
    • Transactional updates make it easier for the programmer to reason about his code (it is clear how different threads affect each other) and data (because it simplifies the sharing of state between threads).
  • The declarative approach (where the programmer simply marks which code blocks are transactional) means concurrent programming is more intuitive with no explicit locks or synchronisation to worry about.
  • Can perform much better than fine grained locking (which can lead to deadlock) and coarse grained locking (which inhibits concurrency):
    • If a thread takes a lock and is context switched or incurs cache misses or page faults etc then other threads that need the lock are stalled until the thread is rescheduled or until the needed data is retrieved.
    • With STM, updates can be batched up and speculatively committed together.
    • The runtime manages lock acquisition and release and resolves conflicts (using approaches such as timeouts and retries).
  • It is easier to compose operations using a technique called nesting (traditionally composing two operations can produce concurrency problems unless one analyses in detail the locking approach used by those operations).

Properties of a STM system

In the following I will describe the design choices available to STM systems in general and in particular I will illustrate the choices made by the Narayana STM implementation using code examples. The examples will be made available in the Narayana STM test suite so that you can also experiment with the particular properties of the implementation. Each of the examples will be using the same transactional object which is defined as follows:

    @Transactional
    public interface AtomicInt {
        int get() throws Exception;
        void set(int value) throws Exception;
    }

    public class AtomicIntImpl implements AtomicInt {
        private int state;

        @ReadLock
        public int get() throws Exception {
            return state;
        }

        @WriteLock
        public void set(int value) throws Exception {
            state = value;
        }
    }

The @Transactional annotation on the AtomicInt interface tells the system that instances of the interface are candidates to be managed by the STM system. The implementation of the interface defines a pair of methods for reading and writing the shared state (by default all state is tracked but this default can be overridden via the @NotState annotation).
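
For example, a field can be excluded from tracking with @NotState. A minimal sketch (the CachedIntImpl class and its accessCount field are illustrative, not part of the original example):

    public class CachedIntImpl implements AtomicInt {
        private int state;       // tracked by the STM system

        @NotState
        private int accessCount; // excluded from tracking: updates to it
                                 // are not undone if the transaction aborts

        @ReadLock
        public int get() throws Exception {
            accessCount++; // bookkeeping only, not transactional
            return state;
        }

        @WriteLock
        public void set(int value) throws Exception {
            state = value;
        }
    }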

Property 1: Interaction with non transactional code

If uncommitted transactional memory updates are visible to non-transactional code and vice-versa (i.e. updates made by non-transactional code are visible to running transactions) then the isolation model is said to be weak. On the other hand if non-transactional accesses are upgraded to a transactional access then the model is said to be strong.

The weak access model, although common, can lead to data races. A data race occurs when two threads T1 and T2 access the same memory, one for writing and the other for reading: the value read is then indeterminate. If, for example, T1 writes data inside a transaction and T2 reads that data, then if T1 aborts after T2 has made a decision based on the value it read, we have an incorrect program, since aborted transactions must not have side effects (recall the "all or nothing" characteristic of atomicity).

Narayana STM follows the weak isolation model. The following test updates shared memory inside a transaction and then triggers a thread to perform non-transactional reads and writes on it while the transaction is still running. The test shows that the two threads interfere with each other producing indeterminate results:
    public void testWeakIsolation() throws Exception {
        AtomicIntImpl aiImple = new AtomicIntImpl();
        // STM is managed by Containers. Enlisting the above implementation
        // with the container returns a proxy which will enforce STM semantics
        AtomicInt ai = new RecoverableContainer().enlist(aiImple);
        AtomicAction tx = new AtomicAction();

        // set up the code that will access the memory outside of a transaction
        Thread ot = new Thread(() -> {
            try {
                synchronized (tx) {
                    tx.wait(); // for the other thread to start a transaction

                    // weak isolation implies that this thread (which is running
                    // outside of a transaction) can observe transactional updates
                    assertEquals(2, aiImple.get()); // the other thread set it to 2
                    aiImple.set(10); // this update is visible to transactional code
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });

        ot.start();

        ai.set(1); // initialise the shared memory
        tx.begin(); // start a transaction
        {
            ai.set(2); // transactionally set the value to 2

            synchronized (tx) {
                tx.notify(); // trigger non-transactional code to update the memory
            }

            // weak isolation means that this transactional code may see the
            // changes made by the non transactional code
            assertEquals(10, ai.get()); // the other thread set it to 10
            tx.commit(); // commit the changes made to the shared memory
        }

        // changes made by non transactional code are still visible after commit
        assertEquals(10, ai.get());
        assertEquals(aiImple.get(), ai.get());
    }

As an aside, notice in this example that the code first had to declare the shared data using the @Transactional annotation and then had to access it via a proxy returned from a RecoverableContainer. Some systems instead introduce new keywords into the language that demarcate atomic blocks, and in such systems any memory updates made inside an atomic block are managed by the STM implementation. That type of system takes some of the burden of ensuring correctness away from the programmer but is harder to implement (for example a common technique requires compiler extensions).

Property 2: Nested transactions

A nested transaction (the child) is one that is started in the context of an outer one (the parent). The child sees the changes made by the parent. Aborting the parent will abort each child. A transaction that does not have a parent is called top level.

The effects of committing/aborting either transaction (the child or parent) and the visibility of changes depend upon which model is being used:

Flattened:

  • The parent and child transactions see each other's updates.
  • If the child aborts the parent aborts too.
  • Changes made by the child only become visible to other threads when the parent commits.
Pros - easy to implement
Cons - breaks composition (if the child aborts it causes all work done by the parent transaction to abort)

Closed Nested

  • Changes are hidden from the parent transaction (and from other transactions) until the child commits, at which time any changes made by the child become part of the parent transaction's set of updates (therefore, in contrast to open nested transactions, other transactions will not see the updates until the parent commits);
  • aborting the child does not abort the parent;
Pros - Is arguably the most natural model for application designers

Open Nested

  • When the child transaction commits, all other transactions see its updates even if the parent later aborts, which is useful if we want unrelated code to make permanent changes during the transaction regardless of the parent's outcome.
Pros - enables work to be made permanent even if the parent aborts (for example logging code made by the child)

Narayana STM follows the closed model as is demonstrated by the following test case:
    public void testIsClosedNestedCommit() throws Exception {
        AtomicInt ai = new RecoverableContainer().enlist(new AtomicIntImpl());
        AtomicAction parent = new AtomicAction();
        AtomicAction child = new AtomicAction();

        ai.set(1); // initialise the shared memory
        parent.begin(); // start a top level transaction
        {
            ai.set(2); // update the memory in the context of the parent transaction
            child.begin(); // start a child transaction
            {
                ai.set(3); // update the memory in a child transaction
                // NB the parent would still see the value as 2
                // (not shown in this test)
                child.commit();
            }
            // since the child committed the parent should see the value as 3
            assertEquals(3, ai.get());
            // NB other transactions would not see the value 3 however until
            // the parent commits (not demonstrated in this test)
        }
        parent.commit();

        assertEquals(3, ai.get());
    }
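
As a companion to the commit case, the abort behaviour of the closed model could be demonstrated with a test along these lines (a sketch using the same API, not taken from the original test suite):

    public void testIsClosedNestedAbort() throws Exception {
        AtomicInt ai = new RecoverableContainer().enlist(new AtomicIntImpl());
        AtomicAction parent = new AtomicAction();
        AtomicAction child = new AtomicAction();

        ai.set(1); // initialise the shared memory
        parent.begin(); // start a top level transaction
        {
            ai.set(2); // update the memory in the context of the parent
            child.begin(); // start a child transaction
            {
                ai.set(3); // update the memory in the child transaction
                child.abort(); // discard the child's update only
            }
            // closed nesting: the child's update is rolled back but the
            // parent remains active and still sees its own update
            assertEquals(2, ai.get());
        }
        parent.commit();

        assertEquals(2, ai.get());
    }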

Isolation amongst child transactions

The concept of isolation applies to nested transactions as well as to top level transactions. It seems most natural for siblings to use the same model as is used for isolation with respect to other transactions (i.e. transactions that are not in the ancestor hierarchy of a particular child). For example the CORBA Object Transaction Service (OTS) supports the closed model and children do not see each others updates until the parent commits.

Property 3: Exception Handling

On exception the options are either to terminate the transaction, to ignore the exception, or to use a mixture of both where the programmer tells the system which exceptions should abort the transaction and which ones should allow it to commit. The last option is similar to what the JTA 1.2 spec provides with its rollbackOn and dontRollbackOn annotation attributes.
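
For comparison, JTA 1.2 lets the programmer express that choice declaratively. A minimal sketch (the OrderService class and RetryableException are hypothetical):

    public class OrderService {
        // a hypothetical business exception that should NOT abort the
        // transaction; other runtime exceptions still mark it rollback-only
        public static class RetryableException extends RuntimeException {}

        @javax.transaction.Transactional(dontRollbackOn = RetryableException.class)
        public void placeOrder(String orderId) {
            // transactional work ...
        }
    }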

The Narayana STM implementation takes the view that the programmer is best placed to make decisions about what to do under exceptional circumstances. The following test demonstrates this behaviour:
    public void testExceptionDoesNotAbort() throws Exception {
        AtomicInt ai = new RecoverableContainer().enlist(new AtomicIntImpl());
        AtomicAction tx = new AtomicAction();

        ai.set(1);
        tx.begin();
        {
            try {
                ai.set(2);
                throw new Exception();
            } catch (Exception e) {
                assertEquals(2, ai.get());
                // the transaction should still be active
                ai.set(3);
                tx.commit();
            }
        }

        assertEquals(3, ai.get());
    }

What's Next

That's all for this week. In the next instalment I will cover conflict detection and resolution, transaction granularity and concurrency control.

Thursday, June 28, 2018

Narayana Commit Markable Resource: a faultless LRCO for JDBC datasources

CMR is a neat Narayana feature enabling full XA transaction capability for one non-XA JDBC resource. It gives you a way to engage a database resource in an XA transaction even when the JDBC driver is not fully XA capable (or you just have a design restriction on it) while transaction data consistency is kept.

Last resource commit optimization (aka. LRCO)

Maybe you will say "adding one non-XA resource to a transaction is the well-known LRCO optimization". And you are right. But only partially. The last resource commit optimization (abbreviated as LRCO) provides a way to enlist and process one non-XA datasource in the global transaction managed by the transaction manager. But LRCO contains a pitfall. When the system (or the connection) crashes at a particular point in time during two-phase commit processing, it causes data inconsistency. Namely, the LRCO could be committed while the rest of the resources are rolled back.

Let's elaborate a bit on the LRCO failure. Let's say we have a JMS resource where we send a message to a message broker and a non-XA JDBC datasource where we save information to the database.
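
As a sketch, the application code for such a scenario might look like the following (the class and the injected resource lookups are hypothetical; a container-managed JTA transaction is assumed):

import javax.annotation.Resource;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.Queue;
import javax.sql.DataSource;
import javax.transaction.Transactional;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class OrderProcessor {
    @Inject
    private JMSContext jms; // uses the default (XA capable) connection factory

    @Resource(lookup = "java:/jms/queue/orders")            // hypothetical queue
    private Queue orderQueue;

    @Resource(lookup = "java:jboss/datasources/non-xa-ds")  // hypothetical non-XA datasource
    private DataSource nonXaDs;

    @Transactional
    public void process(String orderId) throws Exception {
        // 1. update the database, enlisting the non-XA (LRCO) resource
        try (Connection c = nonXaDs.getConnection();
             PreparedStatement ps = c.prepareStatement(
                     "INSERT INTO orders (id) VALUES (?)")) {
            ps.setString(1, orderId);
            ps.executeUpdate();
        }
        // 2. send a message, enlisting the JMS XA resource
        jms.createProducer().send(orderQueue, orderId);
        // two-phase commit (steps 3 onwards below) runs when the method returns
    }
}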

NOTE: The example refers to the Narayana two-phase commit implementation.

  1. updating the database with an INSERT SQL command, enlisting the LRCO resource in the transaction
  2. sending a message to the JMS broker, enlisting the JMS resource in the transaction
  3. Narayana starts the two-phase commit processing
  4. prepare is called on the JMS XA resource, and the transaction log is stored at the JMS broker side
  5. the prepare phase for the LRCO means calling commit on the non-XA datasource. That call makes the data changes visible to the outside world.
  6. the Narayana JVM crashes before Narayana can record the commit decision in its transaction log store
  7. after Narayana restarts there is no record that any transaction existed, thus the prepared JMS resource is rolled back during transaction recovery

Note: the rollback of the JMS resource is caused by the presumed abort strategy applied in Narayana. If the transaction manager did not apply presumed abort then, in the best case, you would end up no better than in the heuristic transaction state.

LRCO processing works by ordering the LRCO resource last during the transaction manager's 2PC prepare phase. Where the transaction manager would normally call prepare on an XAResource, it instead calls commit on the LRCO's underlying non-XA resource.
Then, during the transaction manager's commit phase, nothing is called for the LRCO.

Commit markable resource (aka. CMR)

The Commit Markable Resource, abbreviated as CMR, is an enhancement of the last resource commit optimization applicable to JDBC resources. The CMR approach achieves capabilities similar to XA by requiring a special database table (normally named xids) that the transaction manager can read and write via the configured CMR datasource.

Let's demonstrate the CMR behaviour with an example (reusing the setup from the previous one).

  1. updating the database with an INSERT SQL command, enlisting the CMR resource in the transaction
  2. sending a message to the JMS broker, enlisting the JMS resource in the transaction
  3. Narayana starts the two-phase commit processing
  4. prepare on the CMR saves a record of the prepare into the xids table
  5. prepare is called on the JMS XA resource, and the transaction log is stored at the JMS broker side
  6. commit on the CMR means calling commit on the underlying non-XA datasource
  7. commit on the JMS XA resource makes the message visible in the queue, and the corresponding transaction log is removed at the JMS broker side
  8. the Narayana two-phase commit processing ends

As you can see, the difference from the LRCO example is that the CMR resource is not ordered last in the resource processing but first. The CMR prepare does not commit the work, as in the case of LRCO; instead it saves the information that the CMR is considered prepared into the database xids table.
As the CMR is ordered first for processing, it is taken first during the commit phase too. The commit call then means calling commit on the underlying database connection. The xids table is not cleaned at that phase; it is normally the responsibility of the CommitMarkableResourceRecordRecoveryModule to garbage collect records in the xids table (see more below).

The main fact to understand is that the CMR resource is considered fully prepared only after the commit is processed (meaning commit on the underlying non-XA JDBC datasource). Until that time the transaction is considered not prepared and will be rolled back by transaction recovery.
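
Conceptually, the CMR branch behaves roughly like the following during 2PC (a simplified sketch of the idea, not the actual Narayana implementation):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Simplified sketch of the CMR idea; NOT the actual Narayana code.
class CmrBranchSketch {
    // prepare: write the marker row using the SAME connection that carries
    // the application's SQL, so the marker and the business updates belong
    // to one local database transaction
    void prepare(Connection cmrConnection, byte[] xid, String tmId,
                 byte[] actionUid) throws SQLException {
        try (PreparedStatement ps = cmrConnection.prepareStatement(
                "INSERT INTO xids (xid, transactionManagerID, actionuid) VALUES (?, ?, ?)")) {
            ps.setBytes(1, xid);
            ps.setString(2, tmId);
            ps.setBytes(3, actionUid);
            ps.executeUpdate();
        }
    }

    // commit: a single local commit atomically makes both the application's
    // updates and the marker row durable; recovery later reads the xids
    // table to decide whether this branch committed
    void commit(Connection cmrConnection) throws SQLException {
        cmrConnection.commit();
    }
}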

NOTE: the term fully prepared refers to the standard XA two-phase commit processing. Once the transaction manager finishes the prepare phase, i.e. prepare has been called on all transaction participants, the transaction counts as prepared and commit is expected to be called on each participant.

It's important to note that the correct processing of failures in transactions which contain CMR resources is the responsibility of a special periodic recovery module, CommitMarkableResourceRecordRecoveryModule. It has to be configured as the first in the recovery module list, as it needs to check and eventually process all the XA resources belonging to the transaction which contains the CMR resource (the recovery modules are processed in the order they were configured). You can check here how this is set up in WildFly.
The CMR recovery module knows about the existence of the CMR resource from the record saved in the xids table. From that it is able to pair all the resources belonging to the same transaction in which the CMR was involved.

xids: database table to save CMR processing data

As said, Narayana needs a special database table (usually named xids) to record that the CMR was prepared. You may wonder what the content of that table is.
The table consists of three columns.

  • xid : id of the transaction branch belonging to the CMR resource
  • transactionManagerID : the id of the transaction manager; this serves to distinguish multiple transaction managers (WildFly servers) working with the same database. There is a strict rule that each transaction manager must be defined with a unique node identifier (see the description of the node-identifier).
  • actionuid : the global transaction id which unites all the resources belonging to one particular transaction

LRCO failure case with CMR

In the example we presented as problematic for LRCO, the container crashed just before the prepare phase finished. In such a case, the CMR has not been committed yet. The other transaction participants are then rolled back, as the transaction was not fully prepared. The CMR thus brings a consistent rollback outcome for all the resources.

Commit markable resource configured in WildFly

We have sketched the principle of CMR and now it's time to check how to configure it for your application running on the WildFly application server.
The configuration consists of three steps.

  1. The JDBC datasource needs to be marked as connectable
  2. The database that the connectable datasource points to has to be enriched with the xids table where Narayana can save the data about CMR processing
  3. Transaction subsystem needs to be configured to be aware of the CMR capable resource

In our example, I use the H2 database as it's good for the showcase. You can find it in the quickstart I prepared too. Check out https://github.com/jbosstm/quickstart/tree/master/wildfly/commit-markable-resource.

Mark JDBC datasource as connectable

You mark the resource as connectable by using the attribute connectable="true" in your datasource declaration in the standalone*.xml configuration file. When you use the jboss cli for the app server configuration you can use the commands

/subsystem=datasources/data-source=jdbc-cmr:write-attribute(name=connectable, value=true)
:reload

The whole datasource configuration then looks like

<datasource jndi-name="java:jboss/datasources/jdbc-cmr" pool-name="jdbc-cmr-datasource"
          enabled="true" use-java-context="true" connectable="true">
  <connection-url>jdbc:h2:mem:cmrdatasource</connection-url>
  <driver>h2</driver>
  <security>
      <user-name>sa</user-name>
      <password>sa</password>
  </security>
</datasource>

When the datasource is marked as connectable, IronJacamar (the JCA layer of WildFly) creates the datasource instance as implementing org.jboss.tm.ConnectableResource (defined in the jboss-transaction-spi project). This interface declares that the class provides the method getConnection() throws Throwable. That is how the transaction manager obtains a connection to the database and works with the xids table inside it.
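
For reference, the SPI contract is roughly the following shape (a sketch; see the jboss-transaction-spi project for the authoritative definition, the Object return type here is an assumption):

package org.jboss.tm;

// Sketch of the interface shape; the exact signature lives in the
// jboss-transaction-spi project (the Object return type is assumed)
public interface ConnectableResource {
    // gives the transaction manager direct access to the underlying
    // database connection so it can read and write the xids table
    Object getConnection() throws Throwable;
}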

Xids database table creation

The database configured to be connectable has to ensure the existence of the xids table before the transaction manager starts. As described above, the xids table allows saving the crucial information about the non-XA datasource during prepare. The shape of the SQL command depends on the SQL syntax of the database you use. Examples of the table creation commands are (see more commands under this link)

-- Oracle
CREATE TABLE xids (
  xid RAW(144), transactionManagerID VARCHAR(64), actionuid RAW(28)
);
CREATE UNIQUE INDEX index_xid ON xids (xid);

-- PostgreSQL
CREATE TABLE xids (
  xid bytea, transactionManagerID varchar(64), actionuid bytea
);
CREATE UNIQUE INDEX index_xid ON xids (xid);

-- H2
CREATE TABLE xids (
  xid VARBINARY(144), transactionManagerID VARCHAR(64), actionuid VARBINARY(28)
);
CREATE UNIQUE INDEX index_xid ON xids (xid);

I addressed the need for the table definition in the CMR quickstart by adding a JPA schema generation create script which contains the SQL to initialize the database.
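
For illustration, the standard JPA 2.1 schema-generation properties that can load such a create script look roughly like this (the script file name is hypothetical):

<properties>
    <property name="javax.persistence.schema-generation.database.action" value="create"/>
    <property name="javax.persistence.schema-generation.create-source" value="metadata-then-script"/>
    <property name="javax.persistence.schema-generation.create-script-source" value="META-INF/xids-create.sql"/>
</properties>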

Transaction manager CMR configuration

The last part is to configure the CMR for the transaction subsystem. The declaration puts the datasource in the list JTAEnvironmentBean#commitMarkableResourceJNDINames, which is then used in the code of TransactionImple#createResource.
The xml element used in the transaction subsystem and the jboss cli commands look like

<commit-markable-resources>
  <commit-markable-resource jndi-name="java:jboss/datasources/jdbc-cmr"/>
</commit-markable-resources>
/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr":add()
:reload

CMR configuration options

In addition to this simple CMR declaration, the CMR can be configured with the following parameters

  • jndi-name : as seen above, the jndi-name points to the datasource which we mark as CMR ready
  • name : defines the name of the table used for storing the CMR state during prepare and read back during recovery.
    The default value (and we've referred to it in this way above) is xids
  • immediate-cleanup : if set to true, a synchronization is registered which removes the corresponding value from the xids table immediately after the transaction is committed.
    When the synchronization is not set up, the clean-up of the xids table is the responsibility of the recovery code in CommitMarkableResourceRecordRecoveryModule. It checks for finished xids and removes those which are eligible for garbage collection.
    The default value is false (using only recovery garbage collection).
  • batch-size : this parameter influences the garbage collection process described above (see the sketch below). The garbage collection takes the finished xids and runs a DELETE SQL command. The DELETE contains a WHERE xid IN (...) clause with a maximum of batch-size entries. When some finished xids are still left after the deletion, another SQL command is assembled, again with at most batch-size entries.
    The default value is 100.
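
A small sketch of how such a batched DELETE can be assembled (illustrative only, not the actual Narayana cleanup code):

// Illustrative only, not the actual Narayana cleanup code: build a DELETE
// statement with at most batchSize placeholders per round of cleanup.
static String buildBatchedDelete(int batchSize) {
    StringBuilder sql = new StringBuilder("DELETE FROM xids WHERE xid IN (");
    for (int i = 0; i < batchSize; i++) {
        sql.append(i == 0 ? "?" : ", ?");
    }
    return sql.append(")").toString();
}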

The commit-markable-resource xml element configured with all the parameters looks like

<subsystem xmlns="urn:jboss:domain:transactions:4.0">
  <core-environment>
      <process-id>
          <uuid/>
      </process-id>
  </core-environment>
  <recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/>
  <object-store path="tx-object-store" relative-to="jboss.server.data.dir"/>
  <commit-markable-resources>
      <commit-markable-resource jndi-name="java:jboss/datasources/jdbc-cmr">
          <xid-location name="myxidstable" batch-size="10" immediate-cleanup="true"/>
      </commit-markable-resource>
  </commit-markable-resources>
</subsystem>

And the jboss cli commands for the same are

/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr"\
  :write-attribute(name=name, value=myxidstable)
/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr"\
  :write-attribute(name=immediate-cleanup, value=true)
/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr"\
  :write-attribute(name=batch-size, value=10)
:reload

NOTE: the JBoss EAP documentation about the CMR resource configuration can be found in the section About the LRCO Optimization for Single-phase Commit (1PC)

Conclusion

This article explained what the Narayana Commit Markable Resource (CMR) is, compared it with LRCO and presented its advantages. The latter part of the article showed how to configure the CMR resource in your application deployed on the WildFly application server.
If you would like to run an application using the commit markable resource feature, check our Narayana quickstart at https://github.com/jbosstm/quickstart/tree/master/wildfly/commit-markable-resource.