Thursday, April 21, 2022

Narayana on the Cloud - Part 1

In the last few months, I have been working on how distributed transactions are recovered in WildFly when this Application Server (AS) is deployed in Kubernetes. This blog post is a reflection on how Narayana performs on the cloud and the features it is still missing for it to evolve into a native cloud transaction suite.

Some (very brief) context

Narayana started its journey more than 30 years ago! ArjunaCore was developed in the late 1980s. Even though the theoretical concept of cloud computing was introduced by John McCarthy in 1961 [1][2], at the time of ArjunaCore’s development it was still considered only a theoretical possibility. In the past two decades, however, the adoption of cloud computing has grown rapidly, dramatically changing the world of technology. As a consequence, Narayana (and its ArjunaCore) needs to step up its game to become a cloud-native transaction suite that can be used in different cloud environments. This is an ongoing conversation that the Narayana team started a long time ago (for a detailed summary of Narayana's Cloud Strategy see [3]).

Narayana was introduced to the cloud through WildFly (note 1) on Kubernetes (K8s). I recently worked on WildFly and its K8s operator [4], and I think that the integration between Narayana and WildFly works very smoothly on K8s [5]. On the other hand, when the pod hosting WildFly needs to scale down, the ephemeral nature of K8s does not get along with Narayana very well. In fact, ArjunaCore/Narayana needs to have a stable ground to perform its magic (with or without WildFly). In particular, Narayana needs to have:

  • A stable and durable Object Store where objects’ states are held
  • A stable node identifier to uniquely mark transactions (which are initialised by the Transaction Manager (TM) with the same node identifier) and ensure that the Recovery Manager will only recover those transactions
  • A stable communication channel to allow participants of transactions to communicate with the TM

In all points above, “stable” indicates the ability to survive whatever happens to the host where Narayana is running (e.g., crashes). On the other hand, K8s is an ephemeral environment where pods do not need stable storage and/or particular configurations that survive multiple reboots. To overcome this “incompatibility”, K8s provides StatefulSet [6], through which applications can leverage a stable realm. Particularly in relation to Narayana, the employment of StatefulSet and the addition of a transaction recovery module to the WildFly K8s Operator [7] enable this AS to fully support transactions on K8s. Unfortunately, this solution is tailor-made for K8s and cannot be easily ported to other cloud environments. Our target, though, is to evolve Narayana to become a cloud transaction suite, which means that Narayana should also support other cloud computing infrastructures.

Our take on this

The Narayana team thoroughly discussed the above limitations that prevent Narayana from becoming a native cloud application. A brief summary is presented here:

  • A stable and durable Object Store where objects’ states are held
    Narayana is able to use different kinds of object stores; in particular, it is possible to use a (SQL) database to create the object store [8]. Relational databases are widely available in cloud environments: these solutions already cover our stability needs, providing a reliable storage solution that supports replication and is able to scale up on demand. Moreover, using a “centralised” relational database would ease the management of multiple Narayana instances, which can be connected to the same database. This might also become incredibly useful in the future when it comes to evolving Narayana to work with multiple instances behind a load balancer (i.e. in case of replication).
     
  • A stable communication channel to allow participants of transactions to communicate with the TM
    Most cloud providers (and platforms) already offer two options to tackle this problem: a stable IP address and a stable DNS name. Although both methods still need some tweaking for each cloud provider, these solutions should provide a stable endpoint to communicate with Narayana’s TM over multiple reboots.
     
  • A stable node identifier to uniquely mark transactions (which are initialised by the Transaction Manager (TM) with the same node identifier) and ensure that the Recovery Manager will only recover those transactions
    This is the actual sticking point this blog post is about. Although it seems straightforward to assign a unique node identifier to the TM, it is indeed the first real logical challenge to solve on the path to turning Narayana into a cloud transaction manager.

We discussed different possible solutions to this last point but we are still trying to figure out how to address this issue. The main problem is that Narayana needs stable storage to save the node identifier and reload it after a reboot. As already said, cloud environments do not provide this option very easily, as their ephemeral nature is more inclined to a stateless approach.

Our first idea to solve this problem was, “why do we not store the node identifier in the object store? Narayana still needs a stable object store (and this constraint cannot be dropped) and relational databases on the cloud already provide a base to start from”. However, the node identifier is a property of the transaction manager that gets initialised when Narayana/ArjunaCore starts (together with all the other properties). As a consequence, it is not possible to save the node identifier in the object store, as the preferences for the object store are also loaded during the same initialisation process! In other words, if the node identifier is stored in the object store, how can Narayana/ArjunaCore know where the object store is without loading all properties? Which came first: the chicken or the egg? Introducing an order in which properties are loaded might help in this regard (i.e. we force the egg to exist before the chicken).

Even so, there is still a problem: what happens if the object store is shared between different instances of Narayana/ArjunaCore? For example, it is quite likely that a Narayana administrator configures multiple Narayana instances to create their object stores in the same database. In this case, every Narayana instance would need a unique identifier to tell which node identifier in the object store is its own. Recursive problems are fun :-)

Even if we solve all these problems, the assignment of the node identifier should not be possible outside of Narayana (e.g. using system properties) and it should become an exclusive (internal) operation of Narayana. Fortunately, this is easier than solving our previous “chicken and egg” problem as there are solutions to generate an (almost) unique distributed identifier locally [9]. As things stand, we should find an alternative solution to port the node identifier to the cloud.
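To make the idea of generating an (almost) unique identifier locally more concrete, here is a minimal sketch in Java. It is only an illustration and not necessarily the approach discussed in [9]: it uses a random UUID, which is one common way to obtain an identifier without coordinating with other instances. The class name, the method name and the 28-byte upper bound are assumptions of this sketch; check the limit that applies to your Narayana version.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper, not part of Narayana.
final class NodeIdentifierSketch {

    // Assumed maximum identifier length in bytes (an assumption of this sketch).
    static final int MAX_NODE_ID_BYTES = 28;

    // Generates an identifier locally, without contacting any other instance.
    static String newNodeIdentifier() {
        // A random (version 4) UUID carries 122 random bits, so collisions are extremely unlikely.
        UUID uuid = UUID.randomUUID();
        byte[] raw = ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
        // URL-safe Base64 without padding encodes 16 bytes as 22 ASCII characters.
        String id = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        if (id.getBytes(StandardCharsets.UTF_8).length > MAX_NODE_ID_BYTES) {
            throw new IllegalStateException("identifier too long: " + id);
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(newNodeIdentifier());
    }
}
```

Note that this only addresses uniqueness. The generated value would still have to be kept somewhere stable so that the same instance reloads the same identifier after a reboot, which is exactly the open problem described above.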

Looking at this problem from a different point of view, I wonder if there are more recent solutions to replace and/or remove the node identifier from Narayana. With this in mind, the first question I ask myself is “Why do we need a node identifier?”. Under the hood, Narayana uses a recovery manager to try to recover transactions that have not completed their lifecycle. This comes with a caveat though: it is essential that two different recovery managers do not try to recover the same in-doubt transaction at the same time. That is where the node identifier comes in handy! In fact, thanks to the unique node identifier (that gets embedded in every global transaction identifier), the recovery manager can recognise if it is responsible for the recovery of an in-doubt transaction stored in a remote resource (note 2).

This concept is best illustrated by an example. Let’s consider two different Narayana instances that initiate two different transactions that enlist the same resource. In this scenario, both transaction managers store a record in the shared resource. Let’s assume that the first Narayana instance starts its transaction before the second instance. Once the first transaction has sent prepare() to its enlisted resources, but before it has completed, the recovery manager of the second Narayana instance may query the shared resource for in-doubt records. If Narayana’s recovery manager were not forced to recover only transactions initiated by the same Narayana instance’s TM, this hypothetical scenario would end with an error: the recovery manager of the second Narayana instance would roll back the transaction initiated by the first Narayana instance, assuming that it was one of its own in-doubt transactions!
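The scenario above can be sketched in a few lines of Java. This is a deliberately simplified model and not Narayana's actual recovery code: the InDoubtRecord class, the recoverableBy method and the identifiers "narayana-1" and "narayana-2" are all hypothetical names introduced for illustration. The only point it shows is that a recovery manager keeps the in-doubt records whose embedded node identifier it owns and ignores the rest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; not Narayana's recovery implementation.
final class RecoveryFilterSketch {

    // Simplified view of an in-doubt record reported by a shared resource.
    static final class InDoubtRecord {
        final String transactionId;
        final String nodeIdentifier; // node identifier embedded in the global transaction identifier

        InDoubtRecord(String transactionId, String nodeIdentifier) {
            this.transactionId = transactionId;
            this.nodeIdentifier = nodeIdentifier;
        }
    }

    // Returns only the records created under one of the node identifiers this recovery manager owns.
    static List<InDoubtRecord> recoverableBy(Set<String> ownedNodeIdentifiers,
                                             List<InDoubtRecord> inDoubt) {
        List<InDoubtRecord> mine = new ArrayList<>();
        for (InDoubtRecord entry : inDoubt) {
            if (ownedNodeIdentifiers.contains(entry.nodeIdentifier)) {
                mine.add(entry);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Both instances left a record in the same shared resource.
        List<InDoubtRecord> shared = List.of(
                new InDoubtRecord("tx-1", "narayana-1"),
                new InDoubtRecord("tx-2", "narayana-2"));
        // The second instance's recovery manager must leave tx-1 alone.
        for (InDoubtRecord entry : recoverableBy(Set.of("narayana-2"), shared)) {
            System.out.println(entry.transactionId); // prints "tx-2"
        }
    }
}
```

The filter takes a set of identifiers rather than a single one because a recovery manager can be made responsible for more than one node identifier, as long as no two recovery managers work on the same in-doubt transaction at the same time.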

Cloud environments are encouraging (all of) us to come up with an innovative solution to reduce the footprint of Narayana/ArjunaCore. In particular, the node identifier is the challenge we are currently facing and the first real step to push Narayana onto the cloud. I will share any updates the Narayana team comes up with… and in the meantime, feel free to reach out to the team through our public channels (for example Gitter or our Google group narayana-users) to propose your ideas or discuss your take on this fundamental issue with us.

Notes

  1. WildFly supports transactions thanks to the integration with Narayana
  2. It is possible to tell the Recovery Manager that it will be responsible for the recovery of in-doubt transactions initiated by different transaction managers (which are identified with different node identifiers). The only caveat here is that two Recovery Managers should not recover the same in-doubt transaction at the same time. To assign the responsibility of multiple node identifiers to the same Recovery Manager, the property xaRecoveryNodes [10] in Narayana’s JTAEnvironmentBean should be used.

Bibliography

[1] J. Surbiryala and C. Rong, "Cloud Computing: History and Overview," 2019 IEEE Cloud Summit, 2019, pp. 1-7, doi: 10.1109/CloudSummit47114.2019.00007.

[2] Garfinkel, Simson L. and Harold Abelson. “Architects of the Information Society: 35 Years of the Laboratory for Computer Science at MIT.” (1999).

[3] https://jbossts.blogspot.com/2022/03/narayana-community-priorities.html

[4] https://github.com/wildfly/wildfly-operator

[5] https://issues.redhat.com/browse/EAP7-1394

[6] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

[7] https://github.com/wildfly/wildfly-operator/

[8] https://www.narayana.io/docs/project/index.html#d0e459

[9] https://groups.google.com/g/narayana-users/c/ttSff9HvXdA

[10] https://www.narayana.io//docs/product/index.html#d0e1032