Friday, January 10, 2014

Introducing the "Narayana Transaction Analyser"

In this post I'll interview Paul Robinson, who is the project lead of the new "Narayana Transaction Analyser" project. In the interview we'll aim to provide an introduction to the tool and help you get a feel for how it could help you as an application developer.

Tom: Can you provide some background for the tool?

Paul: Back when I was a JBoss Transactions consultant, my clients and I often found it difficult to discover the cause of failing transactions. Since then I've felt that it would be great to have tooling that tells me everything about all the transactions run in my application server. With this information I would be able to see exactly why my transactions are not behaving as I would like. I could also use this information to understand more about my architecture, for example by viewing a high-level topology of the servers and transactional resources involved in a transaction.

Last summer we hired Alex Creasy as an intern to work on a prototype of this tool. From his excellent work the "Narayana Transaction Analyser" was born. Since Alex completed his prototype, we have promoted it to a project under the 'Narayana' umbrella and produced our first release. Today, I'd like to focus on providing an overview of what we aim to achieve with this tool. I'll follow up with a subsequent post focusing on the features in the recent 1.0.0.Alpha1 release and how to get started.

Tom: What are the goals of this tool?


Paul: The main requirement of the Transaction Analyser is to make it significantly simpler to diagnose transaction-related problems. As well as providing detailed information on every transaction, the tool can also be loaded with a suite of plugins that diagnose common issues. It should also be possible to export this data. This exported data can then be uploaded alongside a support ticket or forum posting, giving the person providing the assistance more data to work with. 


Tom: When would I use this tool?


Paul: In general, the tool should be enabled when you are experiencing transaction-related issues and you require more information. You need to be mindful of when this tool is enabled, as gathering this data does impose an overhead on the system. Think of this tool as being similar to a performance profiler, like JProfiler: you only enable it when you detect an issue that requires more investigation.

Tom: Sounds interesting, can you give me some examples of what I could use this for?


Paul: The following list should give you a feel for what type of issues the tool can investigate.

Many of your transactions are rolling back, and you don't know why.

This often occurs when a timeout is triggered due to business logic taking too long to complete. The tool lists all transactions that were rolled back due to timeout. The tool may also be able to provide details on what the business logic was doing, making it easier to track down the root cause. For example, any JPA queries run within the transaction could be displayed.

You have a distributed transaction crossing many servers, and you're finding it difficult to correlate the many log files. 

The tool is distributed-transaction aware and groups together all the data from a single transaction that spans multiple servers. Currently the focus is on supporting JTS, but gathering data on Web Service and REST transactions is possible.

You have a heuristic transaction, but you don't know which resource misbehaved. 

The tool shows all resources enlisted in the transaction and provides details on how they behaved in the transaction. It is relatively simple to see which resource didn't behave as instructed.

A transaction appears to have 'hung', but you don't know why. 

This often occurs due to deadlock when trying to obtain a lock on some resource. This is the type of issue that often requires expert knowledge to track down and a good understanding of the log output. However, this process can be automated by this tool, which notifies the user when the problem is detected. As well as notifying the user, the tool can also link to some useful documentation explaining how to fix the problem.

Someone is assisting you with an issue and they would like some more details. 

A dump from this tool will give this person a wealth of information, hopefully making it easier for them to assist you. After fixing the problem, a plugin could be developed so that the issue is automatically detected whenever anyone else experiences it in the future.
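
To make the correlation and heuristic scenarios above a little more concrete, here is a minimal sketch of the underlying idea: gather events from several servers, group them by transaction ID, and then look for any resource whose outcome differs from what it was instructed to do. This is not the Transaction Analyser's code or API. The class name TxCorrelator, the TxEvent type, its fields, and the outcome strings are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these names are hypothetical and are NOT part of
// the Narayana Transaction Analyser.
final class TxCorrelator {

    /** One event observed on one server for one resource in one transaction. */
    static final class TxEvent {
        final String txId;      // global transaction ID shared across servers
        final String server;    // server that reported the event
        final String resource;  // enlisted resource, e.g. a database or queue
        final String outcome;   // what the resource actually did

        TxEvent(String txId, String server, String resource, String outcome) {
            this.txId = txId;
            this.server = server;
            this.resource = resource;
            this.outcome = outcome;
        }
    }

    /** Groups events from all servers by transaction ID, keeping arrival order. */
    static Map<String, List<TxEvent>> groupByTransaction(List<TxEvent> events) {
        Map<String, List<TxEvent>> grouped = new LinkedHashMap<>();
        for (TxEvent event : events) {
            grouped.computeIfAbsent(event.txId, id -> new ArrayList<>()).add(event);
        }
        return grouped;
    }

    /** Lists "resource@server" for every resource that did not do as instructed. */
    static List<String> misbehavingResources(List<TxEvent> txEvents, String instructed) {
        List<String> misbehaving = new ArrayList<>();
        for (TxEvent event : txEvents) {
            if (!event.outcome.equals(instructed)) {
                misbehaving.add(event.resource + "@" + event.server);
            }
        }
        return misbehaving;
    }
}
```

Calling groupByTransaction on the combined events from every server yields one list per transaction, and passing one of those lists to misbehavingResources along with the instructed outcome picks out the resources worth a closer look.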

Tom: How do I try it?


Paul: We recently released NTA 1.0.0.Alpha1. In this blog post I walk through the current feature set and how to get the tool up-and-running.
