Let's talk about transaction recovery, with details specific to Narayana.
This blog post covers JTA transactions. If you configure recovery for JTS,
you can still find relevant information here, but you will also need to consult
the Narayana documentation.
What is transaction recovery
Transaction recovery is the process needed when an active transaction fails
for some reason. The transaction manager process (the JVM) could crash,
the connection to a resource (a database) could fail, or the failure could have any other cause.
The failure can happen at various points of the transaction
lifetime, and that point defines the state in which the in-progress transaction is left.
The state could be just an in-memory state that is left behind, in which case the transaction
manager relies on the resource transaction timeout to release it. But it could also be
the transaction state after prepare was called (with a successful prepare call
a 2PC participant confirms that it is capable of finishing the transaction and, moreover,
promises to finish it with commit).
There has to be a process which finishes such transaction remainders, and that process is transaction recovery.
Let's review three variants of failure which lead to three different transaction states.
Their outcomes and the way they need to be terminated will guide us through the work of the transaction recovery process.
The transaction manager runs a global transaction which includes several
transaction branches (to use the term from the XA specification).
In our article about 2PC we used (not precisely) the term resource-located transaction
instead of transaction branch.
Let's say we have a global transaction containing a data insertion to a database
plus sending a message to a message broker queue.
We will examine a crash of the transaction manager (the JVM process),
where each point below represents one of the three example cases. The examples show the
timeline of actions and differ in the time when the crash happens.
- The global transaction was started and the insertion to the database happened; now the JVM crashes
  (no message was sent to the queue). In this situation, all the transaction metadata
  is saved only in memory. After the JVM is restarted, the transaction manager has no notion
  that the global transaction ever existed.
  But the insertion to the database already happened and the database has some work in progress. There was no promise by a prepare call to end with commit and everything was stored only as in-memory state, thus the transaction manager relies on the database to abort the work itself. Normally that happens when the transaction timeout expires.
- The global transaction was started, data was inserted into the database and the message was
  sent to the queue. The global transaction is asked to commit. The two-phase commit begins:
  prepare is called on the database (the resource-located transaction). Now the transaction manager (the JVM) crashes.
  If interleaving data manipulation were permitted, the 2PC commit could fail. But the call of prepare
  means a promise of a successful end. Thus the call of prepare causes locks to be taken to prevent other transactions from interleaving.
  When the transaction manager is restarted, again no notion of the transaction can be found, as all the in-memory state was cleared. And nothing had been saved in the Narayana transaction log so far.
  But the database transaction is in the prepared state and holds locks. On top of that, a transaction in the prepared state can't be rolled back by the transaction timeout and has to wait for some other party to finish it.
- The global transaction was started, data was inserted into the database and the message was
  sent to the queue. The transaction was asked to commit. The two-phase commit
  begins: prepare is called on the database and on the message queue too. A record of the success of the prepare phase is saved to the Narayana transaction log too. Now the transaction manager (the JVM) crashes.
  After the transaction manager is restarted there is no in-memory state, but we can observe the record in the Narayana transaction log, and the database and JMS queue resource-located transactions are in the prepared state with locks.
In the latter two cases the transaction state survived the JVM crash:
in the second case only as locked records on the database side, in the third case as a record in the transaction log too.
In the first case only an in-memory transaction representation was used,
and the transaction manager is not responsible for finishing it.
The work of finishing the unfinished transactions belongs to the recovery manager.
The purpose of the recovery manager is to periodically check the state of the Narayana transaction log
and of the resource transaction logs (unfinished resource-located transactions);
for the latter it runs the JTA API call XAResource.recover().
If an in-doubt transaction is found,
the recovery manager either rolls it back, as in the second case,
or commits it because the whole prepare phase originally finished with success, as in the third case.
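To make the three cases easier to compare, here is a minimal sketch that maps "what survived the crash" to "who finishes the transaction and how". The class, enum and method names are hypothetical and are not part of the Narayana API; the logic only restates the three cases described above.

```java
// Illustrative only: these names are hypothetical and are not Narayana classes.
final class CrashOutcomeSketch {

    enum Outcome {
        LEFT_TO_RESOURCE_TIMEOUT, // case 1: only in-memory state existed
        ROLLED_BACK_AS_ORPHAN,    // case 2: prepared at the resource, nothing in the log
        COMMITTED_BY_RECOVERY     // case 3: prepared at the resources and recorded in the log
    }

    // preparedAtResource: the resource reports the branch as in-doubt (prepared)
    // recordedInTxLog:    the transaction manager log holds a record of the prepare phase
    static Outcome decide(boolean preparedAtResource, boolean recordedInTxLog) {
        if (recordedInTxLog && !preparedAtResource) {
            // This combination is not one of the three cases discussed in the text.
            throw new IllegalArgumentException("combination not covered by this sketch");
        }
        if (!preparedAtResource) {
            return Outcome.LEFT_TO_RESOURCE_TIMEOUT;
        }
        return recordedInTxLog ? Outcome.COMMITTED_BY_RECOVERY : Outcome.ROLLED_BACK_AS_ORPHAN;
    }

    public static void main(String[] args) {
        System.out.println("case 1: " + decide(false, false));
        System.out.println("case 2: " + decide(true, false));
        System.out.println("case 3: " + decide(true, true));
    }
}
```

Only the second and third outcomes involve the recovery manager; the first one is left to the resource itself.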
Narayana periodic recovery in detail
The periodic recovery is a configurable process. That brings flexibility of usage but makes it necessary to use proper settings if you want to run it. We recommend checking the Narayana documentation, chapter Failure Recovery.
The recovery runs periodically (by default every two minutes; the period can be changed by setting the system property RecoveryEnvironmentBean.periodicRecoveryPeriod). When launched, it iterates over all registered recovery modules (see com.arjuna.ats.arjuna.recovery.RecoveryModule in the Narayana codebase) and runs the following sequence:
- calling the method periodicWorkFirstPass on all recovery modules,
- waiting for the time defined by RecoveryEnvironmentBean.recoveryBackoffPeriod,
- calling the method periodicWorkSecondPass on all recovery modules.
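The shape of one recovery cycle can be sketched as follows. This is not Narayana code: the Module interface is a local stand-in that only mirrors the two method names of RecoveryModule (periodicWorkFirstPass and periodicWorkSecondPass), and the back-off is passed in as a plain millisecond value for the sake of the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Module is a hypothetical stand-in for Narayana's RecoveryModule.
final class TwoPassRecoverySketch {

    interface Module {
        void periodicWorkFirstPass();
        void periodicWorkSecondPass();
    }

    // One cycle: first pass on every module, back-off wait, second pass on every module.
    static void runOneCycle(List<? extends Module> modules, long backoffMillis) {
        for (Module module : modules) {
            module.periodicWorkFirstPass();
        }
        try {
            Thread.sleep(backoffMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return; // give up on the second pass when interrupted
        }
        for (Module module : modules) {
            module.periodicWorkSecondPass();
        }
    }

    // Runs one cycle over two hypothetical modules and records the call order.
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        List<Module> modules = new ArrayList<>();
        for (String name : new String[] {"A", "B"}) {
            modules.add(new Module() {
                @Override public void periodicWorkFirstPass() { calls.add(name + ":first"); }
                @Override public void periodicWorkSecondPass() { calls.add(name + ":second"); }
            });
        }
        runOneCycle(modules, 10);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point to notice is the ordering: every module finishes its first pass before any module starts its second pass, with the back-off period in between.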
When you want to run standard JTA XA transactions (JTS differs, you can check the config example in the Narayana code base) then you need to configure the XARecoveryModule. The XARecoveryModule in turn brings into play the need to configure XAResourceOrphanFilters, which manage finishing in-doubt transactions that are known only at the resource side (the second case represents such a scenario).
Narayana periodic recovery configuration
You may ask how all this is configured, right? The Narayana configuration is held in "beans". The beans contain properties which are retrieved by getter method calls all over the Narayana code. So configuring Narayana's behaviour means redefining the values of the relevant bean properties.
Let's check which beans are relevant for setting up transaction management and recovery for XA transactions. We will use the jdbc transactional driver quickstart as an example.
The relevant beans are the following:
- CoreEnvironmentBean
- CoordinatorEnvironmentBean
- JTAEnvironmentBean
- RecoveryEnvironmentBean
- JDBCEnvironmentBean
To configure the values of the properties you need to define them in one of the following ways:
- via a system property: see the example in the quickstart pom.xml. We can see that the property is passed on the JVM argument line.
- via the descriptor file jbossts-properties.xml.
  This is usually the main source of configuration in standalone applications using Narayana. You can see the example jbossts-properties.xml and observe that the quickstart, being a standalone application, is no exception.
  The descriptor has to be on the classpath for Narayana to be able to access it.
- via calls of bean setter methods.
  This is the programmatic approach and is normally used mainly in managed environments such as application servers like WildFly (WildFly itself is configured with jboss-cli).
The following precedence rules apply:
- A system property takes precedence over the descriptor jbossts-properties.xml.
- A programmatic call of a setter method takes precedence over system properties.
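The precedence rules can be modelled in a few lines. The sketch below is only a model of the ordering stated above, not the way Narayana resolves its configuration internally; the class name, the resolve method and the use of a Map for values set through setters are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

// Model of the stated precedence only; none of these names come from Narayana.
final class ConfigPrecedenceSketch {

    // Order: setter call > system property > descriptor file > built-in default.
    static String resolve(String key,
                          Map<String, String> setterValues,
                          Properties systemProps,
                          Properties descriptor,
                          String builtInDefault) {
        if (setterValues.containsKey(key)) {
            return setterValues.get(key);
        }
        String fromSystem = systemProps.getProperty(key);
        if (fromSystem != null) {
            return fromSystem;
        }
        return descriptor.getProperty(key, builtInDefault);
    }

    public static void main(String[] args) {
        String key = "RecoveryEnvironmentBean.periodicRecoveryPeriod";
        Properties descriptor = new Properties();
        descriptor.setProperty(key, "30");   // stands for an entry in jbossts-properties.xml
        Properties system = new Properties();
        system.setProperty(key, "60");       // stands for a -D argument on the JVM line
        Map<String, String> setters = Collections.singletonMap(key, "5");

        System.out.println(resolve(key, Collections.emptyMap(), new Properties(), descriptor, "120"));
        System.out.println(resolve(key, Collections.emptyMap(), system, descriptor, "120"));
        System.out.println(resolve(key, setters, system, descriptor, "120"));
    }
}
```

Each call adds one more configuration source on top of the previous one, and the printed value follows the source with the highest precedence.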
The default settings for the narayana-idlj-jts.jar artifact used here can be seen at
https://github.com/jbosstm/narayana/blob/master/ArjunaJTS/narayana-jts-idlj/src/main/resources/jbossts-properties.xml
Those are (in combination with the settings inside the particular beans) the default values used when you don't have any properties file defined.
For more details on configuration check the Narayana.io documentation (section Development Guide -> Configuration options).
If you want to use the programmatic approach and call the bean setters, you first need to get the bean instance. That is normally done by calling a static method of a PropertyManager class. There are several of them,
depending on what you want to configure. The ones relevant for us are:
- arjPropertyManager for CoreEnvironmentBean etc.
- recoveryPropertyManager for RecoveryEnvironmentBean
- jdbcPropertyManager for JDBCEnvironmentBean
We will examine the programmatic approach in the example of the jdbc transactional driver quickstart, inside the recovery utility class, where the property controlling the values of XAResourceRecovery is reset in the code.
If you are trying to work out the exact name of a system property or of an entry in jbossts-properties.xml,
the rule of thumb is to take the short class name of the bean, add a dot,
and append the name of the property.
For example, let's say you want to redefine the time period of
the periodic recovery cycle.
Then you need to visit the RecoveryEnvironmentBean and
find the name of the variable, which is periodicRecoveryPeriod.
By using the rule of thumb you
will use the name RecoveryEnvironmentBean.periodicRecoveryPeriod
to redefine the
default 2 minutes value.
Some beans use the annotation @PropertyPrefix,
which offers another way of naming the property when setting it up. In the case of periodicRecoveryPeriod
we can use the system property named com.arjuna.ats.arjuna.recovery.periodicRecoveryPeriod
to reset it in the same way.
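The rule of thumb is simple enough to write down as code. In the sketch below the nested RecoveryEnvironmentBean class is a stand-in declared only for this example (it is not the real Narayana bean), and the value 120 assumes that the two-minute default is expressed in seconds.

```java
// Sketch only: the nested bean is a hypothetical stand-in for the Narayana class of the same name.
final class PropertyNameSketch {

    static final class RecoveryEnvironmentBean {
        int periodicRecoveryPeriod = 120; // assumption: two minutes expressed in seconds
    }

    // Rule of thumb: short class name of the bean + "." + name of the property.
    static String propertyName(Class<?> beanClass, String property) {
        return beanClass.getSimpleName() + "." + property;
    }

    public static void main(String[] args) {
        String name = propertyName(RecoveryEnvironmentBean.class, "periodicRecoveryPeriod");
        System.out.println(name);

        // Reads the JVM system property when it is set, otherwise falls back to the default.
        int period = Integer.getInteger(name, new RecoveryEnvironmentBean().periodicRecoveryPeriod);
        System.out.println(period);
    }
}
```

Starting the JVM with an argument such as -DRecoveryEnvironmentBean.periodicRecoveryPeriod=60 would make the second printed value 60 instead of the default.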
Thinking about XA recovery
I hope you now have a better picture of the recovery setup and how it works.
The XARecoveryModule has the responsibility for handling recovery of 2PC XA transactions.
The module is responsible for committing unfinished transactions and for handling orphans
by running the registered XAResourceOrphanFilter instances.
As you could see, we configured two RecoveryModules, XARecoveryModule and AtomicActionRecoveryModule,
in the jbossts-properties.xml descriptor.
The AtomicActionRecoveryModule is responsible
for loading the resource from the object store; if the resource is serializable and was saved as a whole
in the Narayana transaction log, it can be deserialized and used immediately during recovery.
This is often not the case. When the XAResource is not serializable (serializability is hard to achieve,
for example, for a database, where we need
a connection to do any work),
Narayana offers resource-initiated recovery. That requires a class (code and settings)
which can provide XAResources for recovery purposes. To get the resource
we need a connection (to a database, to a JMS broker...).
The XARecoveryModule uses objects of two interfaces to get such information (to get the XAResources for recovery).
Those interfaces are XAResourceRecoveryHelper and XAResourceRecovery.
Both interfaces contain a method to retrieve the resources (XAResourceRecoveryHelper.getXAResources(), XAResourceRecovery.getXAResource()).
The XARecoveryModule then asks all the received XAResources
to find in-doubt transactions (aka resource-located transactions) by calling XAResource.recover().
The in-doubt transactions found are then paired with the transactions in the Narayana transaction log store.
If a match is found, XAResource.commit() can be called.
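The scan-and-pair step can be illustrated with the standard javax.transaction.xa types. The sketch below is a simplification under stated assumptions and not the XARecoveryModule implementation: SimpleXid and FakeResource are hypothetical in-memory stand-ins, the "transaction log" is just a set of identifiers, and an unmatched in-doubt transaction is rolled back immediately, whereas Narayana leaves that decision to the registered orphan filters.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch only: SimpleXid, FakeResource and the Set-based log are hypothetical stand-ins.
final class XaRecoveryScanSketch {

    static final class SimpleXid implements Xid {
        private final byte[] gtrid;
        SimpleXid(String id) { gtrid = id.getBytes(StandardCharsets.UTF_8); }
        @Override public int getFormatId() { return 1; }
        @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
        @Override public byte[] getBranchQualifier() { return new byte[0]; }
    }

    // In-memory fake playing the role of a database or JMS XAResource.
    static final class FakeResource implements XAResource {
        final List<Xid> prepared = new ArrayList<>();
        final List<String> committed = new ArrayList<>();
        final List<String> rolledBack = new ArrayList<>();

        FakeResource(String... preparedIds) {
            for (String id : preparedIds) prepared.add(new SimpleXid(id));
        }
        @Override public Xid[] recover(int flag) { return prepared.toArray(new Xid[0]); }
        @Override public void commit(Xid xid, boolean onePhase) { prepared.remove(xid); committed.add(key(xid)); }
        @Override public void rollback(Xid xid) { prepared.remove(xid); rolledBack.add(key(xid)); }
        // The remaining XAResource methods are not exercised by a recovery scan.
        @Override public void start(Xid xid, int flags) { }
        @Override public void end(Xid xid, int flags) { }
        @Override public int prepare(Xid xid) { return XA_OK; }
        @Override public void forget(Xid xid) { }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    static String key(Xid xid) {
        return new String(xid.getGlobalTransactionId(), StandardCharsets.UTF_8);
    }

    // Ask the resource for in-doubt branches and pair them with the transaction log.
    static void scan(XAResource resource, Set<String> txLog) throws XAException {
        Xid[] inDoubt = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        for (Xid xid : inDoubt) {
            if (txLog.contains(key(xid))) {
                resource.commit(xid, false);  // match found: prepare phase was recorded
            } else {
                resource.rollback(xid);       // simplification: treat as an orphan right away
            }
        }
    }

    static FakeResource demo() {
        FakeResource resource = new FakeResource("tx-logged", "tx-orphan");
        Set<String> txLog = new HashSet<>();
        txLog.add("tx-logged");
        try {
            scan(resource, txLog);
        } catch (XAException e) {
            throw new IllegalStateException(e);
        }
        return resource;
    }

    public static void main(String[] args) {
        FakeResource r = demo();
        System.out.println("committed=" + r.committed + " rolledBack=" + r.rolledBack);
    }
}
```

In terms of the three cases from the beginning of the post, "tx-logged" corresponds to the third case and "tx-orphan" to the second one.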
Maybe you wonder why both interfaces are mostly the same. That is kind of true, but their use differs.
The XAResourceRecoveryHelper is designed (and only available) to be used
in a programmatic way. To add a helper among the other ones you need to call
XARecoveryModule.addXAResourceRecoveryHelper().
You can even deregister the helper with the method call XARecoveryModule.removeXAResourceRecoveryHelper.
The XAResourceRecovery is configured not directly in the code but via the property
com.arjuna.ats.jta.recovery.XAResourceRecovery. This is not viable for dynamic changes,
as under normal circumstances it's not possible to reset it, even when you try to change the values by calling
JTAEnvironmentBean.setXaResourceRecoveryClassNames().
Running the recovery manager
We have explained how to configure the recovery properties but we haven't pointed out
one important fact: in a standalone application there is no automatic launch
of the recovery manager. You need to start it manually.
The good news is that it's quite easy (if you don't use ORB), and it's enough
to call just
RecoveryManager manager = RecoveryManager.manager();
manager.initialize();
This runs an indirect recovery manager (RecoveryManager.INDIRECT_MANAGEMENT) which spawns
a thread that runs the recovery process periodically. If you feel that you need to run the
recovery at times of your choosing (the periodic timeout value is then not used), you can use direct management
and trigger it manually
RecoveryManager manager = RecoveryManager.manager(RecoveryManager.DIRECT_MANAGEMENT);
manager.initialize();
manager.scan();
To stop the recovery manager, use the terminate call.
manager.terminate();
Summary
This blog post tried to introduce the process of transaction recovery in Narayana.
The goal was to present the settings that are necessary for the recovery to work in the expected way for XA transactions, and to show how to start the recovery manager in your application.