Wednesday, July 14, 2021

Narayana LRA Update

Introduction

This is another in a series of blogs about the compensation based approach to transactions that the team have been producing over the years. The latest such model is LRA (Long Running Actions), originally based on the 2006 OASIS LRA spec, which was recently been accepted by the Eclipse Foundation. The specification and the javadoc for the annotation API are also available.

Although LRA is a simple protocol it has a number of interesting features and one blog won’t do it justice. In this, the first one, I will describe how to create a simple microservice that executes in the context of an LRA. Section 1 explains how to create and run an LRA coordinator, section 2 describes how to create and run a participant, and in the final section there is a short review of the many blogs on the subject that the team have created during the past decade. These blogs are an excellent source of wisdom so I will try to avoid repeating old ground and refer the reader to those blogs for the details of the general approach (of which MP-LRA is just the latest incarnation).

In follow up blogs the team and I plan to cover, in no particular order:

  • how to participate in failure recovery (including participant, coordinator and network failures)

  • writing participants in languages other than Java

  • nesting LRA’s to structure business flows (into hierarchies)

  • various methods of triggering the cancellation of an LRA (resulting in the reliable invocation all compensation activities)

  • leaving LRA’s early

  • inspecting the progress of participants

  • inspecting failed participants (i.e. ones which have finished in a failed state)

  • restarting crashed participants on different endpoints

  • show services interacting with each other in different JVMs

  • show services running on OpenShift

  • leveraging quarkus and WildFly features to simplify the development process (using extensions and galleon feature packs)

  • addressing the demands that cloud infrastructures, such as OpenShift, place on LRA’s

  • investigate some best practices and future plans for managing the availability of coordinators and participants (including participant storage, different storage types such as databases and journals, and scaling of coordinators)

  • strategies for writing compensation logic

  • and I’m sure my colleagues will have plenty of other topics to add to this list.

The Example

A Long Running Action is an interaction between microservices such that all parties (called LRA participants) are guaranteed to be notified when the interaction finishes (in either a successful closing state or an unsuccessful cancelling state). A JAX-RS resource participates in an interaction by marking one or more of its methods with the @LRA annotation and by marking another of its methods with an @Compensate annotation. When a method marked with @LRA is invoked the resource is enlisted in the LRA. Enlisting with an LRA means that if the associated LRA is cancelled then the method annotated with @Compensate is invoked reliably (i.e. it will continue to be called until it is definite that the method executed successfully and that the coordinator received the response). The resource may also request that it be reliably notified if the LRA is closed by marking one of its methods with an @Complete annotation. Note that the LRA id is available to all annotated methods so that all parties know which context is Active.

In order to implement the guarantees stated in the previous paragraph, the narayana implementation requires that there are one or more LRA coordinators running in the system. A coordinator runs on behalf of many services and is responsible for starting and ending LRA’s and for managing the participant membership in the LRA, in other words it must be available for an interaction to progress (start, enlist and end). Similarly, participant resources must be available during the end phase of the LRA so they too must be restarted if they fail.

Note that the developer is normally unconcerned with the coordinator and a typical installation will run them as part of the platform, freeing up the developer to concentrate on the business of creating microservices. However, for the purposes of the blog, first I’ll indicate how you can create one. Note that there is a similar example in our quickstart repo. Later on we will make the blog examples available in the same repo.

Starting a coordinator

Here we show how to build and run a REST based coordinator from scratch as a java executable using the quarkus framework.

The Narayana LRA coordinator is a JAX-RS resource so it needs the quarkus resteasy extension and it needs to depend on the Narayana LRA coordinator implementation.

First generate a quarkus application using the quarkus-maven-plugin, specifying the resteasy-jackson and rest-client extensions which pull in everything we need for JAX-RS:

mvn io.quarkus:quarkus-maven-plugin:2.0.1.Final:create \
    -DprojectGroupId=org.acme \
    -DprojectArtifactId=narayana-lra-coordinator \
    -Dextensions="resteasy-jackson,rest-client"
cd narayana-lra-coordinator

To obtain coordinator support add the org.jboss.narayana.rts:lra-coordinator-jar:5.12.0.Final maven dependency to the dependencies section of the generated pom.xml file as follows:

    <dependency>
        <groupId>org.jboss.narayana.rts</groupId>
        <artifactId>lra-coordinator-jar</artifactId>
        <version>5.12.0.Final</version>
    </dependency>

Here I have chosen the latest release (5.12.0.Final) of the Narayana LRA coordinator. Because we just need the quarkus framework for running the coordinator, remove the generated example: rm -rf src.

Now build and start the coordinator on port 8080:

rm -rf src
mvn clean package
java -Dquarkus.http.port=8080 -jar target/quarkus-app/quarkus-run.jar &

If you want to check that the coordinator is running try listing the active LRA’s:

curl http://localhost:8080/lra-coordinator

By default the coordinator stores records in the filesystem in a directory called ObjectStore in the user directory (i.e. the value of the system property user.dir). You can change the location by adding a file called src/main/resources/jbossts-properties.xml with content:

<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <!-- unique id of an LRA coordinator -->
    <entry key="CoreEnvironmentBean.nodeIdentifier">1</entry>
    <!-- location of the LRA logs -->
    <entry key="ObjectStoreEnvironmentBean.objectStoreDir">target/lra-logs</entry>
    <!-- location of the communications store -->
    <entry key="ObjectStoreEnvironmentBean.communicationStore.objectStoreDir">target/lra-logs</entry>
</properties>

You can test the coordinator is operating correctly by trying to create an LRA using curl, for example.

curl -XPOST http://localhost:8080/lra-coordinator/start
http://localhost:8080/lra-coordinator/0_ffffc0a8000e_9471_60ed85da_a

Note the id of the new LRA in the output.

Now try closing the LRA (include the uid part of the LRA id followed by /close):

curl -XPUT http://localhost:8080/lra-coordinator/0_ffffc0a8000e_9471_60ed85da_a/close
Closed

You may verify that the coordinator no longer has a record of the LRA:

curl http://localhost:8080/lra-coordinator
[]

The output will be a json array ([]) of the LRA’s that the coordinator is managing. Check that the array does not contain the id of the LRA that you have just successfully closed.

Writing and running an LRA participant

We will generate and run a microservice that participates in an LRA using quarkus. A participant should be a JAX-RS resource so we will use the quarkus-maven-plugin, specifying the resteasy-jackson and rest-client extensions (the reason we need rest-client is that the narayana-lra participant support is implemented via a JAX-RS filter which will intercept business requests and needs to invoke the coordinator via JAX-RS calls):

cd ..
mvn io.quarkus:quarkus-maven-plugin:2.0.1.Final:create \
    -DprojectGroupId=org.acme \
    -DprojectArtifactId=narayana-lra-quickstart \
    -Dextensions="resteasy-jackson,rest-client"
cd narayana-lra-quickstart

There is an outstanding pull request for a narayana-lra quarkus extension (io.quarkus:quarkus-narayana-lra) which includes the necessary support for LRA. Since that isn’t available yet you need to manually do what the extension will do (which, fortunately, is neither difficult nor complex):

Include the following maven dependencies in the generated pom:

    <dependency>
      <groupId>org.eclipse.microprofile.lra</groupId>
      <artifactId>microprofile-lra-api</artifactId>
      <version>1.0</version>
    </dependency>
   <dependency>
      <groupId>org.jboss.narayana.rts</groupId>
      <artifactId>narayana-lra</artifactId>
      <version>5.12.0.Final</version>
    </dependency>

These two dependencies pull in support for the MicroProfile LRA annotations and the Narayana LRA implementation of the behaviour implied by these annotations.

We also need to tell quarkus (via the application.properties config file) to exclude some types from its CDI processing (these types are pulled in by the narayana dependency):

echo "quarkus.arc.exclude-types=io.narayana.lra.client.internal.proxy.nonjaxrs.LRAParticipantRegistry,io.narayana.lra.filter.ServerLRAFilter,io.narayana.lra.client.internal.proxy.nonjaxrs.LRAParticipantResource" >> src/main/resources/application.properties

And finally, we just need to update the generated Java JAX-RS resource source code to make use of Long Running Actions (which is the most interesting part for developers):

Open the file src/main/java/org/acme/GreetingResource.java in an editor and annotate the hello method with an @LRA annotation. In addition add two callback methods which will be called when the LRA is closed or cancelled.

// import annotation definitions
import org.eclipse.microprofile.lra.annotation.ws.rs.LRA;
import org.eclipse.microprofile.lra.annotation.Compensate;
import org.eclipse.microprofile.lra.annotation.Complete;
// import the definition of the LRA context header
import static org.eclipse.microprofile.lra.annotation.ws.rs.LRA.LRA_HTTP_CONTEXT_HEADER;

// import some JAX-RS types
import javax.ws.rs.PUT;
import javax.ws.rs.core.Response;
import javax.ws.rs.HeaderParam;
...

    // annotate the hello method so that it will run in an LRA:
    @GET
    @Produces(MediaType.TEXT_PLAIN)
    @LRA(LRA.Type.REQUIRED) // an LRA will be started before method execution and ended after method execution
    public String hello(@HeaderParam(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        return "Hello RESTEasy";
    }

    // ask to be notified if the LRA closes:
    @PUT // must be PUT
    @Path("/complete")
    @Complete
    public Response completeWork(@HeaderParam(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        return Response.ok().build();
    }

    // ask to be notified if the LRA cancels:
    @PUT // must be PUT
    @Path("/compensate")
    @Compensate
    public Response compensateWork(@HeaderParam(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        return Response.ok().build();
    }

Now build and start the application:

mvn clean package -DskipTests
java -Dquarkus.http.port=8081 -jar target/quarkus-app/quarkus-run.jar &

Ensure that the application and the coordinator are running on different ports, here I use 8081 for the application with the coordinator listening on port 8080 which is the default (I will show in a later blog how to change the default location of the coordinator).

Make a REST request to the hello method:

curl http://localhost:8081/hello

Just before the hello method is invoked an LRA will be started and the participant resource will be enlisted with the LRA. After the method finishes the LRA will be ended automatically (which is the default behaviour of the @LRA annotation). Ending the LRA triggers the termination phase in which the coordinator will invoke the @Complete method (called completeWork in the example) or the @Compensate method (called compensateWork in the example) of each enlisted participant depending on whether the LRA is closing or cancelling. If you want to verify that things are working as expected try updating the resource example to print the value of the HTTP header called Long-Running-Action (see the Java constant LRA_HTTP_CONTEXT_HEADER) which gets injected as a method parameter to each of the annotated methods. Alternatively run the participant in a debugger, for example if you break point inside the hello method and inspect the lraId method parameter and then compare it with what the coordinator knows (curl http://localhost:8080/lra-coordinator) then you should notice that the LRA is in the Active state. Then release the debugger and check back with the coordinator (the LRA will be gone since it should have completed). Note also that the lraId parameter should be the same as the one passed to the @Complete method so setting a break point in that method may also be illuminating.

Recap of what we’ve said before about compensations

12/2017 Narayana LRA: implementation of saga transactions

In this blog Ondra Chaloupka provided an overview of the Saga pattern and then identified those features of LRA that implement the pattern. His article also provided links to the Narayana code and quickstarts, and in particular introduced a worked example of how to run it in a cloud based environment using Minishift (OpenShift on your laptop).

12/2017 Saga implementations comparison

Another interesting article contributed by Ondra where he takes a different approach to explaining the concepts and mechanics of LRA’s. In this essay he compares and contrasts the Narayana LRA implementation with two popular Saga implementations: Axon framework and Eventuate.io. This approach is particularly useful for users already familiar with these other frameworks to get a rapid understanding of what LRA is offering.

11/2017 A comparison of Long Running Actions with a recent WSO paper

Tom Jenkinson proves "a high-level comparison of the approach taken by the LRA framework with a paper released to the 2017 IEEE 24th International Conference on Web Services - “WSO: Developer-Oriented Transactional Orchestration of Web-Services”." describing the various concepts introduced in both approaches: ordering compensations, idempotency, structure, ease of use, locking and orchestration and nesting of activities.

Tom describes LRA thus: "This specification is tailored to addressing needs of applications which are running in highly concurrent environments and have the need to ensure updates to multiple resources have atomic outcomes, but where locking of the resource manager has an unacceptable impact on the overall throughput of the system. LRA has been developed using a cloud first philosophy and achieves its goal by providing an extended transaction model based on Sagas. It provides a set of APIs and components designed to work well in typical microservice architectures."

06/2017 Sagas and how they differ from two-phase commit

Yet another article provided by Ondra, a busy year for him. Each of Ondra’s 2017 articles, though focused on communicating what LRA is and when, where and why it can be useful, do an excellent job at covering different facets of the compensation based approach to achieving distributed consistency. Ondra provides an extensive overview of the various concepts involved in these two transaction models [sagas and LRA’s].

Of particular interest is the extensive set of references provided at the end of the blog.

10/2016 Achieving Consistency in a Microservices Architecture

This article provides an overview of the problem domain that LRA addresses and draws attention to some of the difficulties that naive approaches run into when attempting their resolution. Although the article, written in 2016, predates the Narayana LRA implementation it does outline the basic LRA protocol.

07/2013 Compensating Transactions: When ACID is too much

An epic four part series contributed by Paul Robinson covering many aspects of the compensation based approach to transactions:

Part 2: Non-transactional Work. This part will cover situations where you need to coordinate multiple non-transactional resources, such as sending an email or invoking a third party service.

Part 3: Cross-domain Distributed Transactions: This part covers a scenario where the transaction is distributed, and potentially crosses multiple business domains.

Part 4: Long-lived Transactions. This part covers transactions that span long periods of time and shows how it’s possible to continue the transaction even if some work fails.

05/2015 XA and microservices, 05/2014 Transactions and Microservices and 04/2015 Microservices and transactions - an update

Three posts in which Mark allays a number of fears, concerns and fallacies that developers may have with combining transactions with microservices.

03/2011 Slightly alkaline transactions if you please …

Mark introduces his short post with: "Given that the traditional ACID transaction model is not appropriate for long running/loosely coupled interactions, let’s pose the question, “what type of model or protocol is appropriate?”", and then he goes on to answer the question he poses. Along the way we find definitions and links to papers that define various extensions to the traditional model giving us the "lay of land", so to speak, to enable us to navigate our way to an understanding of alternate models.

03/2011 When ACID is too strong

Another short post in which Mark presents the motivation for long-running activities.

03/2011 REST, Cloud and transactions

Mark motivates the case for REST based transaction protocols as a complement to WS-Transactions. LRA is such a REST based protocol and his post provides important background material on why LRA exists alongside WS-BA.

10/2011 nested transactions 101

Some useful background information on nested transactions (partially motivates nested LRA’s)


1 comment:

Michael Musgrove said...


As promised (in the article) I uploaded the materials to our github repo (https://github.com/jbosstm/artifacts/tree/master/jbossts.blogspot/14-07-2021).