Saturday, April 11, 2015

Microservices and transactions - an update

It's almost a year since I wrote my first thoughts on how transactions fit into the world of microservices and it's time for an update. I've had the pleasure of working in the field of fault tolerance and distributed systems for almost 30 years. In that time I've worked with some great friends and colleagues from within the same companies or across different companies on transactions, both traditional atomic (ACID) transactions and extended transactions. Back when I was doing my PhD on transactions and replication, weak consistency replication was in its infancy but there was already a range of extended transaction protocols.

Over the years we've seen these transaction protocols move from research into standards and industrial usage, with efforts such as the OMG's Additional Structuring Mechanisms for the OTS and WS-Transactions from OASIS. Although not as pervasive as ACID transactions, these additions to a developer's repertoire have seen some uptake. Now I'm not someone who believes transactions of any form should be used in all situations, but neither do I believe that they are so bad as to be completely useless. Yet throughout the work we did for both Web Services and REST there were some groups that vehemently fought against transactions, often stating that applications should ensure that any transactional changes to state should be isolated within a service and not span multiple services, i.e., only local transactions should be supported.

As I said earlier, I don't believe that transactions, or even distributed transactions, are necessarily right for every application, or even some applications that use them today. Transactions (let's assume ACID for now) provide a nice and simple model for building applications, particularly if the implementation you use supports nested transactions. It's only natural for a developer who finds this structuring mechanism useful to expand it across objects, services and even machines. In a closely coupled environment when transactions last a few seconds this continues to be a useful approach. However, as we've seen and discussed many times before, they become problematical in loosely coupled environments. Hence the development of certain extended transaction models.

Structuring your applications so that all of the state changes which occur do so within a single object or service is often a lot easier said than done. Especially if you are building from components (services or objects) that have been developed over time by different groups or companies. It's easier to do if you build a Big Ball of Mud, which hopefully is not what you want to accomplish by going down the microservices route! Whether your stateful services interact directly with each other via RPC, say, or through a reliable, yet asynchronous messaging bus with queues and topics, such as JMS, it is fairly inevitable that your applications will have state updates which need to occur as some unit of work (note I didn't say "atomic" there). Some of these units of work will need to be atomic (though not necessarily ACID). Some will be fine with relaxed constraints, such as using forward compensation based approaches. Yes, I'm sure we'll hear people suggesting that atomic transactions aren't useful at all in these environments due to performance problems, but if they spent the time understanding the kinds of optimisations that mature transaction implementations have had in place for decades, then perhaps they'd realise that whereas there may be some overhead it's not as black-and-white as they may believe, or want you to believe. And please realise that XA is just one specific standard for transactions - it has its pros and cons, but any downsides you may have with XA shouldn't be assumed to carry over to the plethora of other transaction models and standards out there!

Let's return to the original topic: microservices and transactions. Where do (or should) the two come together? What I really don't want to repeat with microservices is the anti-transaction arguments we had for SOA. Get over it! Some applications will find them useful, whether atomic (ACID) or extended. Therefore, let's just assume that point for the rest of this discussion. As you develop your microservice(s) and hopefully take the approach of making each "do one thing well" as well as "be as simple as possible yet no simpler", you'll want to string them together; you'll want to have an invocation on one service trigger an invocation on another, or even more than one; you'll want to update the state of a number of services together. (I'll talk about weak consistency in a separate article.) You'll need to determine whether or not these updates have to occur atomically - just recognise the trade-offs this may mean to your application and services. As I've mentioned already, atomic transactions (local or distributed/global) aren't your only option though and one of these additional protocols could be better suited to your services and the way in which they have been constructed. And of course you can mix-and-match: just because some groupings of services may be better suited to a compensation-based model does not preclude you from using atomic transactions elsewhere - or even with the same services for different operations.

In short, what I hope anyone developing microservices will get from this is an understanding that transactions, both local and global, are not anathema to SOA/microservices. They may not be the default mechanism for you to choose when building your services, but they most certainly should be part of a good developer's palette. Having to implement equivalent capabilities in your infrastructure or the services themselves (consistency in the presence of arbitrary failures, opaque recovery for services, modular structuring mechanisms, spanning different communication patterns etc.) is something you shouldn't have to do because it's a monumental effort in its own right. A transaction manager microservice is something that should be available in many enterprise environments!

Friday, March 20, 2015

How we evaluate performance improvements for Narayana

I would like to highlight a piece of work that our team (and special thanks to Mike) have been working on for a while now which is to provide a framework and methodology for us to use when developing performance improvements for Narayana.

The article itself is on our developers wiki over here:
https://developer.jboss.org/wiki/PerformanceGatesForAcceptingPerformanceFixesInNarayana

Mike has also started a discussion on our forum. Once you have read the article, that is the place to ask questions about the approach we have taken and to share any comments or suggestions on how we could refine it further:
https://developer.jboss.org/message/922334?et=watches.email.thread#922334

It's worth pointing out that this is all fully integrated into our existing checks for pull requests on the Narayana repo. So not only will we execute about 12 hours of unit and integration tests on your modifications - we will now also run a set of tests to check the performance impact of functional changes or evaluate the effectiveness of improvements targeted at this area!

The article links to various tests that are executed so I won't expand here. What I can say is that although the suite is not exhaustive, if you propose a change that needs a specific style of test to verify its impact we will be really pleased to accept those changes too.

All of this performance testing is open source. You can see the tests we execute, the configurations we use and the results we have obtained on our hardware. If you want to run the tests yourselves it should be a case of "git clone" and a few simple steps as documented in our performance repo.

We have created this framework for anyone that wants to work on our project and we hope you find it useful! As a reminder - if you do wish to contribute to Narayana we recommend you take a look at this article to get started: https://developer.jboss.org/wiki/GetInvolvedWithNarayana

Thanks!

Tuesday, December 23, 2014

Announcing the release of Narayana 5.0.4

I am very pleased to share with you the news that the release of Narayana 5.0.4 is now officially available from all our usual outlets and just in time for Christmas!

This release of Narayana contains a phenomenal amount of work from the team. As well as the usual improvements to performance and usability that can be expected of our Narayana 5.x series, it also wraps up several new features that you might have heard us talking about here on the blog or over in the community forums over the last year or so: the Narayana Transaction Analyser (NTA), Compensations, STM and running the transaction manager in interesting places such as Docker, Android etc being chief among them. The full release notes for Narayana are over here. The release notes for 5.0.4 itself are over here.

You can get it:
  • As a distribution format from our website
  • Within the latest nightly builds of WildFly
  • From the JBoss.org Maven repository

All of our contact details are available on the community page and we look forward to hearing what you think of our project over on IRC or our forums.

Have a great 2015!
Tom

Wednesday, May 28, 2014

Bringing Transactional Guarantees to MongoDB: Part 1

In this blog post I'll present some recent work we've been doing to bring stronger transactional guarantees to MongoDB. In part 2 I'll present a code example that shows this in action in WildFly 8.

What requirements are we fulfilling? 

1) Updating multiple MongoDB documents in a single transaction
2) Support for sharded environments, without harming scalability
3) Support for global transactions spanning other datastores and traditional relational databases.
4) A middleware solution that's simple for developers to use.

This post covers the background and explains why a compensating transaction (vs an ACID transaction) could be the best fit to meet the above requirements. Part two in this series is more implementation focused. It presents a code example, showing you how you can use the technology, whilst omitting a lot of the theory (that is covered in this post).

Background

NoSQL datastores were originally built as bespoke, in-house solutions, to meet scalability requirements that it was felt relational databases couldn't meet. The general thinking was that ACID transactions would harm scalability and that it was better to work around that requirement. However, as NoSQL adoption spread beyond its in-house roots, it became clear that many applications do indeed need a level of reliability that transactions can bring.

Typically a NoSQL datastore offers atomic updates to single items, such as a document or key-value (more generally, an aggregate). Therefore, structuring data into aggregates can mean that the application never needs to update more than one document at a time within the same transaction. Mostly, this could be true. However, there are cases in which it's not possible to structure the data in this way. Take the classic example of moving funds from one user's account to another. It doesn't make sense to store all users in the same aggregate as it will create a lot of contention and a very large aggregate! Therefore the only option is to deal with each user's data in separate atomic operations. Without a transaction spanning these operations, the application runs the risk of becoming inconsistent in the event of failure. Another example is when the application needs to make updates to a NoSQL datastore in the same transaction as an RDBMS or JMS interaction. Typically NoSQL datastores don't support this.
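To make the funds-transfer risk concrete, here is a minimal sketch in Python. The in-memory dictionary, the `atomic_add` helper and the `CrashError` exception are all assumptions made up for this illustration; they stand in for a datastore that only guarantees atomic single-document updates and are not the MongoDB API.

```python
# Hypothetical in-memory stand-in for a document store (not the MongoDB API).
accounts = {"alice": {"balance": 100}, "bob": {"balance": 50}}


class CrashError(Exception):
    """Simulates the application failing between two updates."""


def atomic_add(store, doc_id, amount):
    # Atomic for one document only, like a single-document update.
    store[doc_id]["balance"] += amount


def transfer(store, source, target, amount, crash_between=False):
    atomic_add(store, source, -amount)   # first atomic operation (debit)
    if crash_between:
        raise CrashError("failed after the debit, before the credit")
    atomic_add(store, target, amount)    # second atomic operation (credit)


def total(store):
    return sum(doc["balance"] for doc in store.values())


before = total(accounts)
try:
    transfer(accounts, "alice", "bob", 30, crash_between=True)
except CrashError:
    pass  # nothing spans the two updates, so nothing undoes the debit
after = total(accounts)
```

Each update was atomic on its own, yet the money debited from the first account has vanished: `after` is lower than `before` by the transferred amount.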

MongoDB and other NoSQL datastores scale through a combination of sharding and replica-sets. I won't go into the specifics here on how this works. However, the key point is that the data becomes distributed over several nodes. Updating multiple data items atomically requires a distributed transaction. As well as being complex to implement, under certain workloads, a distributed ACID transaction can limit scalability. I suspect it is for these reasons that very few NoSQL datastores support ACID transactions in a sharded environment.

The blocking nature of an ACID transaction is the key property that limits scalability. For the duration of the transaction, external readers and writers are blocked until the transaction completes. For contended data, this can result in lots of waiting. The longer the transaction takes to run, the worse the problem. As well as delay introduced by applications, distributing data over a cluster or multiple databases can also result in longer running transactions. However, for data with low-contention, it's possible that an ACID transaction doesn't harm your scalability, in which case you should consider using them as they are a lot simpler to deal with.

Compensating transactions offer an alternative to ACID transactions. They remove the blocking property by relaxing Isolation and Consistency. Despite offering fewer guarantees than ACID transactions, they offer significantly more guarantees than forgoing transactions altogether. Furthermore, in many applications these guarantees are enough, and any more are superfluous. ACID vs Compensating transactions are discussed in more detail in my blog series "Compensating Transactions: when ACID is too much". Here I also show a pattern for working around the relaxed properties.
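The basic idea can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not the Narayana API: the `CompensatingTransaction` class and the `adjust` and `transfer` helpers are hypothetical names. Each unit of work takes effect immediately and registers a handler that undoes it; if a later step fails, the handlers run in reverse order.

```python
class CompensatingTransaction:
    """Toy coordinator: remembers how to undo each completed unit of work."""

    def __init__(self):
        self._compensations = []

    def do(self, work, compensation):
        work()  # takes effect immediately and is visible to other readers
        self._compensations.append(compensation)

    def complete(self):
        self._compensations.clear()  # nothing left to undo

    def cancel(self):
        while self._compensations:  # undo completed work in reverse order
            self._compensations.pop()()


def adjust(balances, account, amount):
    if balances[account] + amount < 0:
        raise ValueError(f"insufficient funds in {account}")
    balances[account] += amount


def transfer(balances, source, target, amount):
    tx = CompensatingTransaction()
    try:
        # Credit first so that a failing debit has something to compensate.
        tx.do(lambda: adjust(balances, target, amount),
              lambda: adjust(balances, target, -amount))
        tx.do(lambda: adjust(balances, source, -amount),
              lambda: adjust(balances, source, amount))
    except ValueError:
        tx.cancel()
        return False
    tx.complete()
    return True


balances = {"alice": 20, "bob": 50}
failed = transfer(balances, "alice", "bob", 30)    # debit fails, credit is compensated
after_failed = dict(balances)
succeeded = transfer(balances, "bob", "alice", 30)
```

Note what is relaxed here: between the credit and its compensation, other readers could have observed the temporarily inflated balance. Nothing was blocked, but the intermediate state was not isolated.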

Using Compensating-transactions with MongoDB

Through Narayana and WildFly's compensating transactions feature, we can fulfil the requirements stated at the start of this blog as follows:

1) Multiple document updates. 
A compensation handler is logged with each document update. In the case of failure, or if the application elects to cancel the transaction, any partially completed work is compensated, resulting in an atomic outcome. Narayana can build on the atomic update mechanism provided by MongoDB, by logging a reference to the compensating handler in the updated document, in the same atomic update as the business-logic's update. This ensures that either i) both the business-logic update, and compensating handler are persisted; or ii) neither is persisted. The handler can be removed at the end of the protocol. It is this construct that the rest of the protocol is built on, allowing recovery to be achieved regardless of what stage the protocol is in during failure.


2) Sharded environments. 
This approach builds on the atomic-update primitive offered by MongoDB. As this feature works in a sharded environment, so does the compensating transaction that is built upon it. Furthermore, scalability is maintained because i) the units of work that make up the transaction complete relatively quickly; and ii) external readers and writers are not blocked while the transaction is in progress. This holds true regardless of the duration of the transaction.

3) Support for global transactions. 
The transaction is coordinated by an external transaction manager which means multiple datastores or databases can be enlisted. As this is a general approach, it should be possible to mix the databases and datastore types. For example, an RDBMS and/or a JMS resource can also be enlisted in the global compensating-transaction as well as multiple NoSQL datastores. Furthermore, not all participants need to enlist as compensating resources. Those that are more traditionally used with ACID transactions can enlist as ACID resources, using traditional XA. Here the ACID (XA) resources would experience the full ACID properties, with the compensating resources experiencing the relaxed-ACID properties of a compensating transaction.

4) A middleware solution that's simple for developers to use. 
Narayana offers an annotation-based API very similar to JTA 1.2, for using compensating transactions in your application. Furthermore, it comes pre-installed in WildFly 8, so you don't need to worry about complex setup. This API is discussed in more detail in this blog series. Part 2 in this series will show how this API can be used to update two MongoDB documents within a compensating transaction.
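The construct described under requirement 1, writing the handler reference in the same atomic update as the business change, can be sketched as follows. This is a toy Python model under my own assumptions and not the Narayana implementation: the `DocumentStore` class, the `_pending_compensation` field name and the handler registry are hypothetical, and the caller decides which transaction needs compensating.

```python
import copy

PENDING = "_pending_compensation"  # hypothetical field name, chosen for this sketch


class DocumentStore:
    """Toy in-memory store; its only guarantee is an atomic single-document update."""

    def __init__(self, docs):
        self._docs = copy.deepcopy(docs)

    def get(self, doc_id):
        return dict(self._docs[doc_id])

    def ids(self):
        return list(self._docs)

    def atomic_update(self, doc_id, changes):
        # Every field in `changes` is applied in one step, or not at all.
        self._docs[doc_id] = {**self._docs[doc_id], **changes}


def add_to_balance(store, doc_id, args):
    # Compensation handler: applies the inverse amount and clears its own
    # reference in the same atomic update.
    balance = store.get(doc_id)["balance"]
    store.atomic_update(doc_id, {"balance": balance + args["amount"], PENDING: None})


HANDLERS = {"add_to_balance": add_to_balance}


def update_balance(store, tx_id, doc_id, amount):
    # The business update and the handler reference are written together.
    balance = store.get(doc_id)["balance"]
    marker = {"tx": tx_id, "handler": "add_to_balance", "args": {"amount": -amount}}
    store.atomic_update(doc_id, {"balance": balance + amount, PENDING: marker})


def pending(store, tx_id):
    # Documents that still carry a handler reference for this transaction.
    found = []
    for doc_id in store.ids():
        marker = store.get(doc_id).get(PENDING)
        if marker and marker["tx"] == tx_id:
            found.append((doc_id, marker))
    return found


def complete(store, tx_id):
    # End of the protocol: the handler references are no longer needed.
    for doc_id, _ in pending(store, tx_id):
        store.atomic_update(doc_id, {PENDING: None})


def compensate(store, tx_id):
    # Recovery: run the handler recorded in each document that still has one.
    for doc_id, marker in pending(store, tx_id):
        HANDLERS[marker["handler"]](store, doc_id, marker["args"])


store = DocumentStore({"alice": {"balance": 100}, "bob": {"balance": 50}})

# tx1 "crashes" after the first update; recovery finds the reference and compensates.
update_balance(store, "tx1", "alice", -30)
after_crash = store.get("alice")
compensate(store, "tx1")
recovered = store.get("alice")

# tx2 runs to the end, so the handler references are simply removed.
update_balance(store, "tx2", "alice", -30)
update_balance(store, "tx2", "bob", 30)
complete(store, "tx2")
```

Because the reference travels with the data it protects, a recovery pass never has to guess whether a document was touched: either the update and its handler reference are both there, or neither is. A real datastore would use an atomic increment instead of the read-then-write shown here.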

Can this be done already with MongoDB?

The MongoDB documentation proposes a pattern for updating multiple documents in a relaxed-ACID transaction. This approach is similar to the Narayana approach in that they are both based on Sagas and result in similar interactions with the datastore. However, where the Narayana approach differs is that it provides a middleware-solution and so doesn't need to be developed within the application. Also, this approach is driven by a transaction manager, making it simpler for the transaction to span multiple resources.


Tuesday, May 27, 2014

Research Worth Knowing on CAP & ACID

The last couple of blog posts on NoSQL/SOA/large-scale and transactions got me thinking that maybe I hadn't mentioned another interesting research effort that also attempts to show how ACID transactions are possible in environments where some believe they aren't applicable. The work is being done under the banner of HA Transactions (HAT), and there's a more recent paper on the topic from VLDB 2014; they also talk about the use of different transaction models such as Sagas, which were part of the input to WS-TX and our REST transactions work. And of course there's always HPTS over the years!

Monday, May 26, 2014

Transactions and Microservices

I've written elsewhere that I think the term Microservices is really just referring to good SOA principles (why we need another term I really don't quite understand). But for whatever reason, articles and blog entries on Microservices seem to be in vogue at the moment. A recent entry on InfoQ goes into some depth on various aspects of what they are and how to use them. Unfortunately the author talks about transactions in the context of Microservices (aka SOA) and has this to say:

"One solution, of course, is to use distributed transactions. For example, when updating a customer’s credit limit, the CustomerService could use a distributed transaction to update both its credit limit and the corresponding credit limit maintained by the OrderService. Using distributed transactions would ensure that the data is always consistent. The downside of using them is that it reduces system availability since all participants must be available in order for the transaction to commit. Moreover, distributed transactions really have fallen out of favor and are generally not supported by modern software stacks, e.g. REST, NoSQL databases, etc."

Huh? This paragraph is wrong on so many levels that I really don't know where to start! For a start "generally not supported by modern software stacks"? Seriously?! Others have spoken about REST and transactions for years, but we've done our own work for over a decade! You also don't have to look too far on this blog for references to NoSQL and transactions (extended transactions or ACID). And of course there's Google's Spanner! ACID transaction support is a key part of this!

Over the years during the initial SOA, WS-* and REST debates kicking transactions out of the picture was a convenient thing for many people to do. Fortunately sanity and better understanding of where they can and should be used has seen them and their variants returning to these environments. I had hoped that those days were over, but it seems that with Microservices we're turning back the clock yet again. Oh well, time to dust off those old papers, blog posts etc. as it seems there's life left in them thanks to Microservices!

Friday, April 25, 2014

What's that we've been saying about transactions and NoSQL ..?

Well, we've been saying for years that transactions are important and whilst not every use case needs them, some pretty important ones really do! Our very own Paul Robinson had something to say about this at DevNation the other week and his presentation will be online very soon.