Wednesday, June 18, 2025

Narayana is now in Commonhaus Foundation!

Hi,

Further to the earlier post about Narayana in a foundation (https://jbossts.blogspot.com/2025/03/request-for-input-narayana-in-foundation.html) I am pleased to share here that the Narayana project has successfully transferred to Commonhaus Foundation (as part of a move with WildFly - please see the WildFly announcement here https://www.wildfly.org/news/2025/04/30/WildFly-Commonhaus/).

You can find us now listed at: https://www.commonhaus.org/.

Please feel free to reach out on Zulip to chat about this: https://narayana.zulipchat.com/

Congratulations to the project on this new home!

Tom

Monday, March 31, 2025

Request for input: Narayana in a foundation?

The Narayana project has been very successful for many, many years and we are very grateful for the contributions that our upstream community give to us through discussing, reporting issues, and providing code contributions to help the project thrive.

I would like to share an update with our community that we are considering to move the Narayana project to a vendor-neutral software foundation. By doing this we can hopefully further expand our community and continue to improve openness and transparency in the project.

Important Considerations

Moving to a foundation is not a trivial task, so it’s critical that the choice we make is a net benefit to our community. To help ensure this, there are a number of key factors we’re looking at when evaluating what foundation would be the best fit:

Flexibility to continue shipping third-party components using a wide array of Open Source Initiative (OSI)-approved Open Source licences.
Maintain as much as possible our current release processes.
Retain independence in decision making, particularly on technical matters.
Ensure Narayana remains visible and recognizable within a foundation’s potentially larger portfolio of projects.
Ensure Narayana can make decisions based on technical merits, not foundation-imposed options.
Provide flexibility in using Open Source Initiative (OSI)-approved Open Source licences for Narayana.

Support and Alignment with Red Hat Values

Red Hat business leaders are fully behind this move. Red Hat is dedicated to participating in and supporting vendor-neutral collaboration projects, such as the Linux kernel, Kubernetes, and OpenJDK. We seek the same for Narayana.

Please let us know what you think

We invite members of our Narayana community, inside and outside of Red Hat to join the discussion in the community either on the blog, in our users forum (https://groups.google.com/g/narayana-users) or over on Zulip (https://narayana.zulipchat.com/).

Monday, March 17, 2025

Narayana and its relationship to Red Hat middleware strategy

Hi everyone,

You might have already seen that Red Hat announced significant changes to its middleware strategy last month (if not, please do check out the relevant “Red Hat Blog” article: Evolving our middleware strategy [1]) and so I want to speak a little to the change and its relevance to our Narayana project.

As you may know, Narayana is a part of a number of Red Hat products, in particular Red Hat’s JBoss Enterprise Application Platform product and so this makes the strategic decision relevant to the Narayana project. That said, a key point in that article from the “Red Hat Blog” with regards to our Narayana project is that all transitioning Red Hat technology will remain open source and continue to follow an upstream-first development model. So as well as the technology still relying on being able to upstream-first (in projects like Narayana), it’s also that this upstream should remain open source (you can find what open source means at Red Hat over here [2]). Not only is the Narayana source code open source, but moreover its project operates in an open source manner, exhibiting the principles of open source and gratefully benefits from a healthy community of users and contributors. This will help us to keep innovating in the area of transactions as we move forwards.

I will also take this opportunity to add a “Thank you” for being part of our Narayana community - I am excited to see the results of what we achieve together next!

Tom Jenkinson

[1] https://www.redhat.com/en/blog/evolving-our-middleware-strategy

[2] https://www.redhat.com/en/about/open-source

Monday, January 6, 2025

Managing the availability of LRA participants

This post is a continuation of a series of jbosts blogs that discuss the MicroProfile LRA specification.

Services manage their workloads by providing endpoints to an LRA coordinator which in turn uses those endpoints to drive the LRA protocol forward thereby enabling the construction of reliable services. These endpoints may need to be modified over the long run so it ought to be possible to replace them with different ones in response to changes to the environment in which the service executes. Although the specification does not discuss how the endpoints can be replaced, the Narayana LRA REST API for the coordinator includes Microprofile OpenAPI documentation for replacing endpoints.

There are various administrative and management reasons for why the capability can be useful, such as controlling where termination handling is to take place, or to facilitate service replacement, etc. It may also be desirable for work completion, compensation, status reporting and clean up activities to be handled on different endpoints and at different times and this goal is facilitated via annotations including @Compensate, @Complete, @Status, @Forget and @AfterLRA.

When a participant does work in the context of a long running action, a “recovery URL” is created which services may use to associate their work with various management actions such as changing the participant endpoints as the action proceeds, after all a long running action can be of arbitrary duration and the needs of a service may change as the action evolves. The example I created for this post halts the JVM during “complete”, asks the user to send a curl request to the LRA coordinator to provide it with a new participant completion endpoint, restarts the participant on the new endpoint and waits for recovery to resend the completion callback to the new endpoint.

By leveraging the feature admins may proactively react to changing conditions (connectivity, throughput, functionality updates, etc) and be able to tune and or reconfigure the environment accordingly, perhaps bringing up a more reliability aware service that more intelligently operates within the more limited environment.

Build and start a coordinator on port 8080

Use the quarkus-maven-plugin to create a project for the coordinator, adding a dependency on maven artifact org.jboss.narayana.lra:lra-coordinator-jar:0.0.10.Final to the resulting pom. Also specify that the build should produce an uber jar so that the coordinator can run standalone:

    mvn io.quarkus:quarkus-maven-plugin:3.3.1:create -DprojectGroupId=org.acme -DprojectArtifactId=narayana-lra-coordinator -Dextensions="rest-jackson,rest-client"
    cd narayana-lra-coordinator
    rm -rf src/test src/main/java # the sources created by the example aren't required
    echo "quarkus.package.jar.type=uber-jar" > src/main/resources/application.properties
    # don't forget to add a dependency on maven artifact: org.jboss.narayana.lra:lra-coordinator-jar:0.0.10.Final
    ./mvnw clean package

and then start it on port 8080 by running the resulting jar

   java -jar target/narayana-lra-coordinator-1.0.0-SNAPSHOT-runner.jar &

Build and start a participant on port 8081 and run an LRA but halt the JVM before closing it

The service will be quite basic:

@Path("/halt")
public class MigratableResource {
    private static final AtomicBoolean halt = new AtomicBoolean(false);

    @LRA(value = LRA.Type.REQUIRED)
    @PUT
    public void doInTransaction() {
        halt.set(true); // halt when compensate or complete are called
        // when the business method finishes the LRA is closed and the complete endpoint will be called
    }

    @PUT
    @Path("/compensate")
    @Compensate
    public Response compensate() {
        return Response.ok().build();
    }

    @PUT
    @Path("/complete")
    @Complete
    public Response complete(@HeaderParam(LRA_HTTP_RECOVERY_HEADER) String recoveryUrl) {
        if (halt.get()) {
            int port = 8082;
            String completionUrl = String.format("http://localhost:%d/halt/complete", port);

            System.out.printf("Ask the coordinator to send the completion notification on a new endpoint using:%n");
            System.out.printf("curl -X PUT %s -d '<%s>; rel=complete'%n", recoveryUrl, completionUrl);
            Runtime.getRuntime().halt(1);
        }
        System.out.printf("completed%n");
        return Response.ok().build();
    }
}

The interesting part happens during completion where the JVM is halted. Notice that the curl command for migrating the completion endpoint is printed prior to halting.

Now build and run the participant on port 8081 - the maven project is available from the narayana artifacts maven repository.

cd <participant directory>
mvn clean package
java -Dquarkus.http.port=8081 -jar target/quarkus-app/quarkus-run.jar &

and then call the service method using the curl utility, or otherwise:

curl -X PUT -I http://localhost:8081/halt

The service method is annotated with just @LRA(value = LRA.Type.REQUIRED) so when it finishes the completion callback will be invoked by the coordinator. Make a note of the curl request printed by the completion callback just before it halts the JVM. An example is (the Uids will change on each run):

curl -X PUT http://localhost:8080/lra-coordinator/recovery/0_ffffc0a801c7_9d57_677ad0a4_2/0_ffffc0a801c7_9d57_677ad0a4_5 \
  -d '<http://localhost:8082/halt/complete>; rel=complete'

Notice that the payload of the HTTP PUT request includes the specification of the new completion callback, namely <http://localhost:8082/halt/complete>; rel=complete.

The new endpoint will be used on the next recovery pass which is every two minutes by default.

Finally restart the service on the new endpoint (port 8082):

java -Dquarkus.http.port=8082 -jar target/quarkus-app/quarkus-run.jar &

When the coordinator next runs a recovery scan it should use the new endpoint and the service will report that it has completed its' service work by printing the text “completed” when the completion endpoint is by the coordinator.

Monday, September 30, 2024

Coping with Failures during Long Running Actions

In this brief note I want to draw attention to some of the features in the LRA protocol that can help service writers manage failures. LRA is a transaction protocol that provides certain desirable properties for building reliable systems such as Atomicity, (eventual) Consistency and Durability. Providing this level of assurance is non trivial but the protocol provides a simple model that can help participants to easily play their part in enabling such systems.

LRA is not just for orchestrating services, it is as equally as important for managing failures. Apart from the specification I have not seen many posts, articles etc covering this important topic, and it is this deficit that I’d like to address in some posts. I had wanted to kick off with an article and demonstration of participant failover but I hit an issue while writing the demo which we need to release the fix for before I can showcase that. So instead, in this post I’ll just bring to the readers attention one or two, but by no means all, of the main features that service writers can use to help them to create more reliable microservices, a preview if you like, before going into more depth in a subsequent post.

Some remarkable items to consider include:

Failing participants must be restarted. There is an option to change the callbacks on restart, any of the endpoints can be changed, even passing over responsibility for, say, the compensation to some other microservice. Likewise, failing coordinators must be restarted if progress of LRAs is to be made.
There is an @Status annotation on participants that the coordinator can use to monitor participant progress and to enable participants to fully participate in the recovery protocol, in particular there is support for non-idempotent compensate endpoints; if there is an @Status endpoint and the compensate endpoint has previously returned a 202 Accepted HTTP status code, then it will periodically poll the status endpoint until the participant reports that it has reached an end state. The @Forget annotation is used by the coordinator to inform the participant that it is free to clean up.
There are state transitions which participants use to notify the coordinator of failures (FailedToCompensate and FailedToComplete) and of transitory states (Compensating and Completing).
Managing timeouts, although the actions supported by the protocol are long running careful choice of time limits for actions can bound failure windows and reduce the need for complicated recover procedures.
And of course there is support for nested Long Running Actions which is a jewel in the toolkit for building reliable distributed systems.

That’s all for now - I’ve deliberately kept the ideas brief and high level so that they can be explored in greater depth later.

Tuesday, June 25, 2024

Some experiments in migrating transaction logs

Transaction stores

Some time ago I prototyped a Redis based implementation of the SlotStore backend suitable for installations where nodes hosting the storage can come and go making it well suited for cloud based deployments of the Recovery Manager.

In the context of the CAP theorem of distributed computing, the recovery store needs to behave as a CP system, ie it needs to be able to tolerate network partitions and yet continue to provide Strong Consistency. Redis can provide the strong consistency guarantee if the RedisRaft module is used with Redis running as a cluster. RedisRaft achieves consistency and partition tolerance by ensuring that:

acknowledged writes are guaranteed to be committed and never lost,
reads will always return the most up-to-date committed write,
the cluster is sized correctly: a RedisRaft cluster of 3 nodes can tolerate a single node failure and a cluster of 5 can tolerate 2 node failures, … ie if the cluster is to tolerate losing N nodes then the cluster size must be at least 2*N+1, thus the minimum cluster size is 3 and the reason for having an odd number of nodes in the cluster is to avoid “split brain” scenarios during network partitions; an odd number guarantees that one side of the split will be in the majority.

During network splits the cluster will become unavailable for a while, ie the cluster is designed to survive failures of a few nodes in the cluster, but it is not a suitable solution for applications that require availability in the event of large net splits, however transaction systems favour Consistency over Availability.

A key motivator for this new SlotStore backend is to address a common problem with using the Narayana transaction stores on cloud platforms when scaling down a node that has in doubt transactions which can leave them unmanaged. Most cloud platforms can detect crashed nodes and restart them but this must be carefully managed to ensure that the restarted node is identically configured (same node identifier, same transaction store and same resource adapters). The current cloud solution, when running on Openshift, is to use a ReplicaSet and to veto scale down until all transactions are completed which can take an indeterminate amount of time, but if we can ask another member of the deployment to finish these in doubt transactions then all but the last node can be safely shutdown even with in doubt transactions. The resulting increase in availability in the presence of node or network failures is a significant benefit for transactional applications which, after all, is key reason why businesses are embracing cloud based deployments.

Remark: Redis is offered as a managed service on a majority of cloud platforms which can help customers to get started with this solution. But note that standard Redis excludes the RedisRaft module which is a requirment for use as a transaction store.

A Redis backed store

Redis is a key value store. Keys are stored in hash slots and hash slots are shared evenly amongst the shards (keys -> hash slots -> shards), the redis cluster specication contains the details. Re-sharding involves moving hash slots to other nodes, impacting performance. Thus, if we can control which hash slots the [slot store] keys map onto then we can improve performance under both normal and failure conditions. This periodic rebalancing of the cluster can be optimised if keys belonging to the same recovery manager are stored in the same hash slot, additionally having the keys, for a particular recovery node, colocated on a single cluster node is good for the general performance of the transaction store.

Also noteworthy is that the keys mapped to a particular hash slot can operated upon transactionally, which is not the case for keys in different slots, meaning that no inter-node hand-shaking is required. This feature opens up the possibility, perhaps, of allowing concurrent access to recovery logs by different recovery managers - but that’s something for a future iteration of the design, but if the logs in a store are shared then be aware that some Narayana recovery modules cache records so those implementations would need to re-evaluated, noting in particular that Redis has support for optimistic concurrency using the watch API which clients can use to observe updates to key values by other recovery managers.

Key space design

A recovery manager has a unique node identifier. We’d like to be able to form “recovery groups” such that any recovery manager in the group can manage transactions created by the others, but not at the same time. To this end we assign a “failoverGroupId” to each recovery manager and use that as the Redis key prefix. This will force all keys created by members of the failover group into the same hash slot, a cloud example of this idea is that the pods in a deployment would all share the same failoverGroupId so any pod in the deployement can take over when the deployment is scaled down.

Failover

Failover involves detecting when a member of the “recovery group” is removed from the cluster and to then migrate the keys to another member of the group. I added an example to the LRA recovery coordinator and used the jedis redis API rename command to “migrate” the keys which is an atomic operation.

Issues

The performance of Redis Raft in this implementation of the SlotStore backend is poor (more than 4 times slower than the default store); I have not invested any effort on improving it but may follow up with another post to discuss throughput since it is a general issue that needs to be solved for any cloud based object store, examples of topics to investigate include pipelining redis commands (similar to how we batch writes to the Journal Store), using Virtual Threads, etc.

We are therefore investigating other alternatives, including an Infinispan based slot store backend - Infinispan now supports partition tolerance so it has become a suitable candidate for a transaction store. Such a store will produce many of the benefits of a Redis store although its performance will be a key implementation constraint.

This design for the key space may not be suitable for transactions with subordinates or nested transactions or for ones that require participant logs to be distinct from the transaction log, such as JTS or XTS. I say may since a modification to the design should accomodate these models.

Assumptions

The cloud platform administrator is responsible for:

detecting and restarting failed nodes;
issuing the migrate command on one of the remaining nodes
for detecting when the deployment is scaled down to zero with pending transactions (including orphans) and emitting a warning accordingly

Example of how to migrate logs

The demo presents the use case of migrating logs between LRA coordinators. To run the demo you will need to:

Clone and build a narayana git branch with support for a Redis backed store.
Start a 3-node cluster of Redis nodes running RedisRaft.
Build and start two LRA coordinators with distinct node id’s, node1 and node2.
Start an LRA on the first coordinator and then halt it to simulate a failure.
View the redis keys using the Redis CLI noticing that the keys embed the node id of the owning coordinator.
Ask the second coordinator to migrate the keys from node1 to node2.
Coordinators maintain a cache of LRAs, but since this is just a PoC I haven’t implemented refreshing internal caches so you will need to simulate that by restarting the second coordinator.
When the first periodic recovery cycle runs (the default is every 2 minutes) the migrated LRAs will be detected which you can verify (curl http://localhost:50001/lra-coordinator/|jq).

Please refer to the demonstrator instructions for the full details.

Notes

The JIRA issue and branch is JBTM-3762.

the implementation is is in the slot store directory
the tests can be ran using mvn test -Predis-store -Dtest=RedisStoreTest#test1 -f ArjunaCore/arjuna/pom.xml and assume that a redis cluster is running on the test machine
the demo is work in progress and is strictly a PoC
redis raft implementation: git clone https://github.com/RedisLabs/redisraft.git and build it with: cmake and make

Wednesday, September 6, 2023

A Review of Recent Narayana Releases

The last four releases of Narayana have brought some noteworthy changes, closing 86 issues in the process, which I’d like to summarise in this brief post. The contributions have come from both the broader community and the core Narayana team, thank you for that. The changes include bug fixes, dependency upgrades and tasks and features.

Community

Improve inclusiveness by building a community of users

We reviewed our existing guidance, adding clarifying text to the contributing guide and added a SECURITY.md file. The latest snapshot adds an email address for reporting security issues.

Conscious Language

We also reviewed our materials to ensure that we use welcoming language, free from offensive, othering, or otherwise problematic communication styles.

New Additions/Features

All maven modules were migrated from Java EE to Jakarta EE (which included the main narayana repo plus the quickstart, jboss-transaction-spi and performance repos).

There is now a BOM for narayana (JBTM-3735). To depend on the correct versions in your projects just include the following dependency:

      <dependency>
        <groupId>org.jboss.narayana</groupId>
        <artifactId>narayana-bom</artifactId>
        <version>latest version</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>

The new license for Narayana is Apache License 2.0, it replaces LGPL and provides consumers with more flexibility when releasing their own software products that incorporate Narayana (JBTM-3764).

Issue JBTM-3734 was resolved by a community contributor, it introduced support for JEP-444: Virtual Threads. Virtual threads “dramatically reduce the effort of writing, maintaining, and observing high-throughput concurrent applications”. The change replaced many occurrences of the synchronized java keyword with ReentrantLock which in most usages, but not all, should be semantically equivalent. The change is an API breaking change so we released the update in a major version, 7.0.0.Final.

Removal of features

All modules have been migrated to Jakarta EE and Java EE is not supported.

Release 6.0.0.Final removed the transformed Jakarta maven modules (ones that ended in “-jakarta”).

The OSGi module is no longer available, please refer to the issue for the reason why this decision was made.

Quickstarts showing integration of Spring and Tomcat with Narayana have been temporarily disabled because at the time of the Jakarta migration, Tomcat and Spring had not yet added Jakarta support to their offerings. Issue JBTM-3803 was created for them to be re-enabled when Jakarta variants become available.

Long Running Actions for MicroProfile (LRA)

Release 6.0.0.Final was certified against MicroProfile LRA 2.0.

We added a Narayana specific feature to allow LRA participants to store data with the coordinator (3rd section) during the registration phase. The feature is configurable, using the MicroProfile Config approach, because some users may prefer not to entrust their business data with the coordinator.

The bug fix for JBTM-3749 facilitated the integration of LRA into WildFly, LRA support in WildFly was added with issue WFLY-14869 by Martin Stefanko, an active contributor to LRA. JBTM-3749 provided a partial fix for JBTM-3552 (Do not rely on thread locals for propagating LRA context) and it also included a doc update recommending that users explicitly set the LRA context when JAX-RS resource methods perform outgoing JAX-RS invocations.

The latest snapshot of narayana includes documentation about configuring the concurrency of the LRA coordinator start method, the details are in issue JBTM-3753.

Transaction Logging

Transaction managers log data in order to provide the Durability property of a transactions. Narayana supports a variety of persistence stores, including logging to a database which we call the JDBCStore. JBTM-3724 included a quickstart for this store and JBTM-3754 introduced an option to supply the DataSource for connecting to the store at runtime for use with the Quarkus extension for JTA transactions.