Sunday, February 19, 2012

Optimistic STM

It's been a while since I last posted anything on the Software Transactional Memory work that's been going on in JBossTS (aka Narayana). At that time we had everything in place for you to write transactional applications that have all of the ACID properties or, as is more common for STM, all of them except the D. Per-object concurrency control is done through locks, and type-specific concurrency control is available. You can define locks on a per-object and per-method basis, and combined with nested transactions this provides a flexible way of structuring applications that would typically not block threads unless contention is really high:

@Transactional
public class SampleLockable implements Sample
{
    public SampleLockable (int init)
    {
        _isState = init;
    }

    @ReadLock
    public int value ()
    {
        return _isState;
    }

    @WriteLock
    public void increment ()
    {
        _isState++;
    }

    @WriteLock
    public void decrement ()
    {
        _isState--;
    }

    @State
    private int _isState;
}


If you recall from previous articles, all but the @Transactional annotation are optional, with sensible defaults taken for everything else including locks and state. And you create and use instances fairly simply:


RecoverableContainer<Sample> theContainer = new RecoverableContainer<Sample>();
Sample obj1 = theContainer.enlist(new SampleLockable(10));


However, the locking strategy we had originally was pessimistic. As I mentioned separately:

"Most transaction systems utilize what is commonly referred to as pessimistic concurrency control mechanisms: in essence, whenever a data structure or other transactional resource is accessed, a lock is obtained on it as described earlier. This lock will remain held on that resource for the duration of the transaction and the benefit of this is that other users will not be able to modify (and possibly not even observe) the resource until the holding transaction has terminated. There are a number of disadvantages of this style: (i) the overhead of acquiring and maintaining concurrency control information in an environment where conflict or data sharing is not high, (ii) deadlocks may occur, where one user waits for another to release a lock not realizing that that user is waiting for the release of a lock held by the first."

And the obvious alternative to this approach is optimistic:

"Therefore, optimistic concurrency control assumes that conflicts are not high and tries to ensure locks are held only for brief periods of time: essentially locks are only acquired at the end of the transaction when it is about to terminate. This kind of concurrency control requires a means to detect if an update to a resource does conflict with any updates that may have occurred in the interim and how to recover from such conflicts. Typically detection will happen using timestamps, whereby the system takes a snapshot of the timestamps associated with resources it is about to use or modify and compares them with the timestamps available when the transaction commits."

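The timestamp check described in that quote can be illustrated with a minimal plain-Java sketch. The names here (`OptimisticSketch`, `Cell`, `Snapshot`) are hypothetical and are not part of the JBossTS/Narayana API; a version counter stands in for the timestamp, and the lock is held only for the brief read and commit steps.

```java
public class OptimisticSketch {
    /** What a transaction saw when it read: the value and the version stamp. */
    static final class Snapshot {
        final int value;
        final long version;
        Snapshot(int value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Hypothetical shared state guarded by a version counter (assumption, not Narayana code). */
    static final class Cell {
        private int value;
        private long version;

        Cell(int initial) {
            this.value = initial;
        }

        /** Take a snapshot without holding a lock beyond this call. */
        synchronized Snapshot read() {
            return new Snapshot(value, version);
        }

        /** Commit only if nobody else has committed since the snapshot was taken. */
        synchronized boolean commit(Snapshot seen, int newValue) {
            if (seen.version != version) {
                return false; // conflict detected: caller must re-read and retry
            }
            value = newValue;
            version++;
            return true;
        }
    }

    public static void main(String[] args) {
        Cell cell = new Cell(10);
        Snapshot first = cell.read();
        Snapshot second = cell.read();                              // both start from version 0
        System.out.println(cell.commit(first, first.value + 1));    // true
        System.out.println(cell.commit(second, second.value - 1));  // false, stale snapshot
        Snapshot retry = cell.read();                                // recover by re-reading
        System.out.println(cell.commit(retry, retry.value - 1));    // true
        System.out.println(cell.read().value);                       // 10
    }
}
```

The second commit fails because its snapshot predates the first commit, which is the conflict detection step; the retry is one simple way to recover from it.
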
So you can probably guess what's just been added to our STM implementation. Yes, that's right ... optimistic concurrency control! In line with the annotation-based approach we've used so far, there are now two new annotations: @Optimistic and @Pessimistic, with @Pessimistic being the default. These are defined on a per-interface basis and define the type of concurrency control implementation that is used whenever locks are needed:

@Transactional
@Optimistic
public class SampleLockable implements Sample
{
    public SampleLockable (int init)
    {
        _isState = init;
    }

    @ReadLock
    public int value ()
    {
        return _isState;
    }

    @WriteLock
    public void increment ()
    {
        _isState++;
    }

    @WriteLock
    public void decrement ()
    {
        _isState--;
    }

    @State
    private int _isState;
}


And that's it. No other changes are needed to the interface or to the implementation. However, at present there is a subtle change in the way in which you create your objects. Recall how that was done previously and then compare it with the style necessary when using optimistic concurrency control:

RecoverableContainer<Sample> theContainer = new RecoverableContainer<Sample>();
Sample obj1 = theContainer.enlist(new SampleLockable(10));
Sample obj2 = theContainer.enlist(new SampleLockable(10),
                                  theContainer.getUidForHandle(obj1));


In the original pessimistic approach the instance obj1 can be shared between any number of threads and the STM implementation, along with JBossTS, will ensure that the state is manipulated consistently and safely. However, with optimistic concurrency we need to have one instance of the state per thread. So in the above code we first create the object (obj1) and then we create a copy of it (obj2), passing a reference to the original to the container.

It is likely that the API for this will change soon in order to unify optimistic and pessimistic: the aim is to make it as opaque as possible to the application so you only have to modify an annotation when you want to change the implementation. One other thing that we're considering is changing the implementation dynamically, perhaps based on monitored metrics for contention. But for now that's it and it's all available in the Narayana git repository.

Thursday, January 12, 2012

connecting the dots

Hmm, looks like it's 2012 already, which means we're overdue for another rant. Today's lecture topic is the perils of trying to replace or do without bits of code that you don't understand.

Programming in the large is all about abstraction, loose coupling and separation of concerns. Java EE promotes the ability to use complex functionality without understanding how it is implemented. As long as you follow the instructions this works surprisingly well. With an app server at your disposal you require only the sketchiest understanding of distributed transactions to write code that consumes a message and updates a database in an ACID fashion. Which on the whole is a good thing, since not all that many people actually want to understand all the plumbing that makes it work. There is a snag though: this apparent simplicity can cause some users to think the Java EE container is not doing all that much work and can easily be replaced or done without. Likewise it makes it difficult to tell the difference between a full fledged container that will support the functionality you need and a cut down framework or lightweight container that may not.

Let's debunk a few myths...

Spring is not a JTA.

It's an abstraction layer on top of a transaction manager, not an implementation of one. For transactions involving only a single resource, it simply delegates the hard bits to that resource manager - so called 'native transactions'. Simple, fast and mostly adequate if you need only a single resource. You don't get transaction lifecycle events, but you're probably using an ORM that provides the equivalent of the beforeCompletion hook for its cache anyhow.

For transactions with more than one resource, you need to wire in a real transaction manager to Spring. You can do this with bitronix, atomikos or JBossTS, usually just by specifying the right TransactionManager and UserTransaction implementation bean classes. But your problems don't end there, because...

Spring is not a JCA.

The roles, responsibilities and relationship between the JTA and JCA components of an app server are critical considerations when you're trying to do without one. The JTA manages transaction lifecycle - begin/commit/rollback. Most importantly, it makes the appropriate calls on any enlisted XAResources as the transaction progresses. But here is the bit that most users don't pay attention to: a JTA does not magically know what resources you want to participate in the transaction. Telling it that is the JCA's job.

In a full-on app server, the JCA manages connections to resource managers such as databases and message queues. If you deploy those drivers/connectors in a manner that identifies them as XA enabled, the JCA ensures that they are correctly associated with the transaction. Application code simply looks up, for example, the JNDI name of a connection pool and calls getConnection(). The JCA intercepts the call, gets the XAResource for the connection and passes it to the transaction manager.

In some cases you don't need a full JCA. You can often make do with a transaction manager aware XA connection pool, which is essentially a subset of the JCA functionality. But you can't get away with only an XA aware driver or a non-XA connection pool. Trying to do that leads to some interesting behaviour: your app will deploy and run, but you have a transaction and a connection that know nothing about one another. Committing or rolling back the transaction won't commit or roll back the work in the database. Oops.
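To make that failure mode concrete, here is a toy sketch in plain Java. Every class in it (`ToyConnection`, `ToyTransaction`, `EnlistingPool`, `PlainPool`) is hypothetical and invented for illustration; none of it is JTA, JCA or JBossTS code. A real transaction manager associates the transaction with the calling thread rather than taking it as a parameter, which is simplified away here.

```java
import java.util.ArrayList;
import java.util.List;

public class EnlistmentSketch {
    /** Toy stand-in for a database connection with uncommitted work. */
    static final class ToyConnection {
        private final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        void execute(String statement) { pending.add(statement); }
        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
    }

    /** Toy transaction: it can only drive the connections it was told about. */
    static final class ToyTransaction {
        private final List<ToyConnection> enlisted = new ArrayList<>();

        void enlist(ToyConnection connection) { enlisted.add(connection); }
        void commit() { for (ToyConnection c : enlisted) { c.commit(); } }
        void rollback() { for (ToyConnection c : enlisted) { c.rollback(); } }
    }

    /** Transaction-aware pool: enlists whatever it hands out (the JCA's role). */
    static final class EnlistingPool {
        ToyConnection getConnection(ToyTransaction tx) {
            ToyConnection connection = new ToyConnection();
            tx.enlist(connection);
            return connection;
        }
    }

    /** Plain pool: the transaction never hears about its connections. */
    static final class PlainPool {
        ToyConnection getConnection() { return new ToyConnection(); }
    }

    public static void main(String[] args) {
        ToyTransaction tx = new ToyTransaction();
        ToyConnection managed = new EnlistingPool().getConnection(tx);
        ToyConnection orphan = new PlainPool().getConnection();

        managed.execute("UPDATE accounts ...");
        orphan.execute("UPDATE audit ...");
        tx.commit();

        System.out.println(managed.committed.size()); // 1: work was committed
        System.out.println(orphan.committed.size());  // 0: commit never reached it
    }
}
```

The application code looks identical in both cases; the only difference is whether something on the connection path performed the enlistment.
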

So, you also need to wire in a JCA or suitable connection pooling implementation. Most 'standalone' JTA implementations ship with a simple connection management solution that is suitable for light use. The one in JBossTS is called the TransactionalDriver. For serious deployments you want IronJacamar or some other JCA that has robust and fast connection management.

So now you have wired up your JTA and JCA in Spring, but you are still not done because...

The standard contract between JTA and JCA does not include recovery management setup. Wiring up resources for crash recovery requires a proprietary solution that differs for each transaction manager. The connection manager that ships with the transaction manager may do this more or less automatically, but third party JCAs or XA aware connection pools probably won't. So, go read the transaction manager documentation and write a few test cases.

Phew, that was a lot of work, wasn't it? Spring is good at what it does, but it's not an out-of-the-box replacement for all the transactional plumbing in a Java EE app server. Nor is Tomcat - you'll have much the same steps to perform there. In both cases it is possible, but not as simple as unzipping and starting JBossAS. Do you really want to make your life harder than it has to be?

Sunday, January 1, 2012

Transactional Android coming soon!

I spent some time this Christmas porting JBossTS to run on Android. It's pretty much done, with the exception of a few workarounds that I need to fix properly over the next few weeks, when I find the time. But once I check this code into the repository, you'll be able to write your own transactional Android applications. Unfortunately I can't guarantee which version this will be in at the moment, but if it is going to take me too long to do the merges then I may create a branch in svn that interested people can just pull from directly, with the usual caveats. If time allows, I may say something about this at JUDCon India too!

Friday, December 23, 2011

Transactions making a comeback? They were never away!

Over the years that I've been involved with transaction processing theory and practice (way too many years for me to admit!) I've seen transactions used, abused and ignored in a wide variety of applications and situations. I've discussed many of these scars in articles here and elsewhere for decades (ouch! it pains me to even use such a timeframe!). But every new software wave, whether it's Web Services, object-orientation, or even the Web itself, has come to the inevitable conclusion that transactions are needed in some way and in one form or another. Maybe not ACID transactions; maybe it's compensation based, or some other variation on the extended transaction theme.

So it is nice to see that being repeated in the past couple of years with Cloud and NoSQL. If you read this blog enough you'll have heard from myself and others on the fact that although some believe that you need to ditch transactions in order to achieve scalability, others aren't quite convinced that it is always necessary to do so - and I count us amongst the latter group. Complete reliance on transactions is sometimes too much. However, completely ignoring them and pushing consistency and recovery up to the application is sometimes too much as well. This is something that appears to be dawning on a number of transaction-less implementations and hopefully it's a trend that will continue.

Anyone who knows me or has followed things I've written or presented over the years will also know that I think transactions are a great structuring technique for concurrent programming. Of course in distributed systems we tend to take this for granted: you've inherently got multiple clients/users manipulating your data, typically at the same time, so transactions with their ACID properties allow developers to concentrate on the functional aspects of the application or service, whilst letting the transaction system do the hard work on ensuring isolation and consistency in the presence of these concurrent users (and possibly failures).

But distribution isn't the only place where you get concurrency, particularly these days with multi-core processors. Back in the last century (ouch!) some of us in academia and industry had access to multi-processor machines, which weren't as common as you might expect (they were expensive!). But when you had them it quickly became apparent that transactions could help with developing applications that had no distribution in them but were (massively) parallel. From this early work techniques such as software transactional memory were born. Back then it was more of an edge case and people found it hard to understand why you'd need transactions without distribution or even a database involved. Well obviously the advances in hardware have silenced most of those critics and we're seeing more and more vendors, open source projects, languages etc. looking at transactions as a fundamental component and not just an add-on for some scenarios.

So what does all of this mean? Well first of all I think it's great to see transactions continuing to have an impact in these new waves. Fundamental requirements like fault tolerance, concurrency control, security etc. are fundamental for a reason. Secondly I think it's fair to say that as with previous waves, we'll see transaction theory and implementations adapt and change to better address some of the new concerns and requirements that are bound to arise. I'm excited by all of this, as I am whenever there's something new to apply transactions to. I'm also excited by the fact that JBossTS (implementation and team) is well placed to help drive some of this as we've done for ... well ... let's just say for quite a long time and leave it at that!

Monday, December 19, 2011

my last commit

Today I created the tag for the JBossTS 4.16.0.Final release, thereby making my last commit to the JBossTS repository as team lead.

It is the end of an era in more ways than one, as 4.16 is planned to be the last feature release for the 4.x line and the last for the JBossTS brand.

Starting in the new year, the project gets a new name, Narayana, a new major version, 5.0, and a new development team lead, Tom Jenkinson. So, exciting times ahead for the project in 2012 and beyond, with fresh blood and fresh ideas.

As for myself, I start 2012 with a shiny new JBoss role looking at Big Data and NoSQL. Who knows, there may even be a new blog for you to subscribe to. But first there is the small matter of a seasonal vacation...

Merry Christmas

Jonathan.

Friday, November 11, 2011

HPTS 2011

I've uploaded most of the presentations and poster sessions for HPTS 2011 to the website now. If you're interested in transactions, NoSQL, eventual consistency, data provenance and a whole raft of topical subjects, then you really should check them out!

Thursday, November 10, 2011

unnecessary

Hot on the heels of thinking about information quality, I came across this little gem regarding Spring and JPA configuration:

"Using multiple session factories in a logical transaction imposes challenging issues concerning transaction management... We quickly abandoned [JTA transaction management] due to the potential cost and complexity of an XA protocol with two-phase commit. Since most of the modules of an application share the same data source, the imposed cost and complexity are definitively unnecessary."

The authors then go on to describe how they built a custom solution that causes the same database connection to be used by the modules, removing the need for transaction coordination across multiple connections.

"Extending Spring’s transaction management with SessionFactory swapping and the Shared Transaction Resource pattern was a very challenging task."

That last bit should probably have read "challenging and unnecessary task".

A good JCA will automatically track connections enlisted with a JTA transaction and will reuse the already enlisted connection to satisfy a further dataSource.getConnection() request. Further, even if it does enlist multiple connections, the JTA will use isSameRM to detect that they relate to the same resource manager and thus still maintain the one phase commit optimisation. All of these challenging tasks are taken care of for you by the application server.
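The isSameRM check can be sketched roughly as follows. This is a toy model in plain Java, not JBossTS internals: `ToyResource` and `ToyTransaction` are hypothetical names, and only the `isSameRM` idea is borrowed from the real `XAResource` interface. The assumption is that a transaction ending up with a single distinct resource manager can skip the prepare phase.

```java
import java.util.ArrayList;
import java.util.List;

public class SameRmSketch {
    /** Toy resource; isSameRM mirrors the idea behind XAResource.isSameRM. */
    static final class ToyResource {
        final String resourceManager;

        ToyResource(String resourceManager) {
            this.resourceManager = resourceManager;
        }

        boolean isSameRM(ToyResource other) {
            return resourceManager.equals(other.resourceManager);
        }
    }

    /** Toy transaction that keeps one branch per distinct resource manager. */
    static final class ToyTransaction {
        private final List<ToyResource> branches = new ArrayList<>();

        /** Returns true if a new branch was created, false if an existing one was joined. */
        boolean enlist(ToyResource resource) {
            for (ToyResource existing : branches) {
                if (existing.isSameRM(resource)) {
                    return false; // same resource manager: no extra branch needed
                }
            }
            branches.add(resource);
            return true;
        }

        /** With at most one branch there is nothing to coordinate, so skip prepare. */
        String commitProtocol() {
            return branches.size() <= 1 ? "one-phase" : "two-phase";
        }
    }

    public static void main(String[] args) {
        ToyTransaction tx = new ToyTransaction();
        tx.enlist(new ToyResource("orders-db"));
        tx.enlist(new ToyResource("orders-db"));   // second connection, same database
        System.out.println(tx.commitProtocol());   // one-phase

        tx.enlist(new ToyResource("jms-broker"));  // a genuinely different resource
        System.out.println(tx.commitProtocol());   // two-phase
    }
}
```

Two connections to the same database collapse into one branch, so modules that share a data source pay no two-phase commit cost without any custom connection sharing.
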

You probably should not bother to invent a better mousetrap until you've determined that current mousetraps don't catch your mice. The imposed cost and complexity are definitively unnecessary.