« Developer note: thinking about Java interfaces | Main | Tracing Java execution using a Dynamic Proxy »

Nov 29, 2007

Five ways for tracing Java execution

I'm often faced with a situation where I need to dig into code that I didn't write. Most of it is poorly documented and, in some cases, I only have some of the sources, which means I need to decompile the rest. It's a tedious exploration. There are mainly two reasons for that: maintaining a code or using it. The later refers to 3rd party frameworks and class libraries I need to build upon. For example I may implement an interface of an event handler and I may be uncertain regarding which event fires when and what are the exact arguments I'm getting.

I usually find that it is faster to trace the code at runtime. Especially when it comes to non-trivial links between classes, like an interface with multiple implementations which may be picked at runtime. The smörgåsbord of design time tools is just insufficient.

In this post, I will summarize the common methods (I know of) for tracing the runtime execution. The tips and examples are assuming you're using Eclipse, but most of them can be achieved with other IDEs as well. I will provide some more detailed information regarding some of the techniques mentioned below in future posts.


The Basic: Breakpoints and Step-by-step Execution

Start with the simplest way: set up breakpoints and start tracing your execution.

Pros Cons
  • Very simple to use and setup. Practically, all you need is a running debugger.
  • Provides access to ALL the information, including method attributes, local variables, etc.
  • Does not modify the code.
  • Selective: you can decide where to break.
  • Stepping through code just takes too long. Moreover, you need constant attention. For each step you should decide whether you're stepping in, over or out. This makes this method impractical.
  • When debugging event handlers one must setup breakpoints on all methods, since you're not sure which method will be called.
  • Impossible to set breakpoints inside method when you don't have the source code.

It is best when: You need a quick and simple solution, you have all the code and you know where you want to stop. You need elaborate information in a given point (arguments, local variables, etc.).

My tip: Even if you don't have all the sources, it's possible to break when entering methods or classes by setting-up method breakpoints or class loading breakpoints.

  • Class loading breakpoints are simple to set up: from Eclipse Run menu, select "Add Class Load Breakpoint...".
  • In order to set method breakpoints, you will need to decompile the class file first. Use the JadClipse plug-in for that. After you decompile, you may select the method and choose "Toggle Method Breakpoint" from the "Run" menu (or by right clicking the left bar in the file next to the method header). Note that you may set regular breakpoints in decompiled sources as well, but these will not work.

The Primal: Debug messages

We continue by setting up debug messages. The simplest way is to use System.out.println statements to print out messages to the console. The next step would be to use a logging mechanism: the basic JDK logging, Apache Commons Logging or Log4j.

Pros Cons
  • Very selective.
  • Speedy execution (in case you didn't put too many).
  • Allows printing the exact information you're interested in.
  • Requires code modification. This makes this impractical when you don't own the sources.
  • Takes time to set up. In case you're not sure what you're looking for, you'll need to introduce debug messages to many methods.
  • Makes the code dirty and introduces run-time overhead if you don't take the debug messages out afterwards.

It is best when: You own the code and you have a good idea of what you're looking for. Very good solution for event handlers. It's high performance makes it practical for understanding which event is fired when throughout the execution of complex flows.

My tip: It's easy to set up one generic method that prints out the method currently executing. The best way I'm familiar with is to get the stack trace. The last entry on the call stack will be the debug message method. The entry before that will be the method invoking it. How to obtain a stack trace:

  • Since JDK 5.0: Use Thread.currentThread().getStackTrace().
  • Before (JDK 1.4): create a new exception instance and use getStackTrace to obtain your call stack from the new instance (no need to "throw" it).

The Hot Shot: Dynamic Proxy

An improvement over simple debug messages. Dynamic Proxy is a special Java feature which enables the developer to introduce a proxy class, sitting in front of a given class and intercepting all the calls through a given interface. Once you intercept the calls, you may print out appropriate debug messages with all the information you require.

Pros Cons
  • No need to infest the code with debug messages. The debug messages can be done in one central location.
  • Simple to remove once you're done, thus, has no affect on performance.
  • Only works through interfaces. Useless for non-public methods and public methods which are not defined in an interface.
  • Requires code modification: you need to own the code.
  • Non selective messages printed.

It is best when: It's a great solution for event handlers. You can set up a dummy event handler with a generic proxy in seconds and see the sequence of events. This is simplest and quickest method when it comes to understanding event handlers.

My tip: You can find a code for a generic dynamic proxy on Sun's site. It's OK, but it has one small minus: it only implements the immediate interfaces implemented by the class of the instance being wrapped by the proxy. It does not implement any interface of its' ancestors. This can be easily fixed by traversing the list of ancestors and picking all the interfaces along the way. In one of the future posts I'll give a better version.


The Brute Force: Run-time Profiler

Profilers are powerful tools that trace all the calls in the system through special JVM hooks. However, it's like using a 10 pound hammer on a half inch nail.

Pros Cons
  • No code modification.
  • Can trace all the execution, even without source code.
  • Most tools provide the ability to filter data which is less interesting (like JDK calls, getters and setters, etc).
  • Most good tools are expensive.
  • Most tools take time to setup and master.
  • Profiling will greatly degrade the system performance. It's impractical to have a long execution with the profiler collecting information.
  • Most tools are geared towards performance measurement and neglect showing an execution path according to the actual time line.
  • Most tools provide the calls but not the arguments.

It is best when: You want a complete picture for a very specific operation (i.e. very short execution flow).

My tip: The good tools cost around $500 per seat. Most free tools are very basic. IMHO, The best free open-source tool for the job is Eclipse Testing and Performance Tools Platform (TPTP). I've had very good experience with the tool and I highly recommend it. It can  provide a sequence diagram which is one of the best ways for understanding execution. TPTP used to be my favorite before I switched to the Mac and found it's incompatible (maybe I'll try to fix it if I'll have some spare time).


The New Age: Aspects

Aspect Oriented Programming (AOP) is a non-trivial idea. Without going into the concept of Aspects, I'm just looking at the bottom line: it's a quick and easy way of intercepting the execution of your code. You can selectively set hooks around methods, constructors, field access, etc., without modifying the original code. In these hooks you can print debug messages.

Pros Cons
  • Very quick and easy to setup.
  • Selective: you can trace just the classes or method you require.
  • Provides all the information, including arguments.
  • No modification to the original code. Very easy to remove once you're done.
  • Good performance. May be used in long executions.
  • Requires some basic understanding of AOP. Especially for setting up more complex filters.
  • Requires rebuilding your code. This can be a problem when you need to trace a code found in a packaged JAR.

It is best when: You want to trace the execution of a code you can rebuild.

My tip: Use it. AOP is still far from the mainstream. It is uncertain whether it will ever get there. Without debating its' qualities and features, most developers can be up and running with AOP based logging in about an hour. I'll show the exact steps in one of the future posts. For Eclipse users, I recommend the Aspect Java Development Toots (AJDT).


Conclusion

What method should you choose? I'll divide this into two different cases: tracing the complete execution of a short sequence and tracing events to/from a given class or tier.

When tracing the complete execution:

  1. A profiler is the simplest way. IMHO, every developer should be able to operate a profiler and have one readily available in the arsenal.
  2. The next choice would be AOP. It's the second choice only because setting up a profiler is simpler when it comes packaged JARs. However, AOP gives more control. For example, I can see the method arguments and print them.
  3. Use debug messages.

When tracing events:

  1. Use AOP. Selectively capture just the interesting calls.
  2. Use a Dynamic Proxy. It's like using debug messages, but a more elegant solution. AOP may be more complex first, but in the long run I would go for AOP.
  3. Use debug messages.
  4. Use breakpoints.

As you can see, I'm not a big fan of breakpoints in this context. Breakpoints are great when debugging your code, but not for tracing execution. There are far more efficient methods of doing that.

Update: as some readers pointed out, AspectJ can be used for weaving packaged JARs. I will explore this option and AspectJ in a future post.

TrackBack

TrackBack URL for this entry:
http://www.typepad.com/t/trackback/2460116/23741862

Listed below are links to weblogs that reference Five ways for tracing Java execution:

» Home from Confluence: Java? JEE
?? ?? ????? [Read More]

» [번역] 자바 실행코드를 추적하는 5가지 방법 from 생각하고 나누고 공감하기...
Zviki Cohen씨의 Five ways for tracing Java execution을 번역했습니다. ---------------------- 난 종종 내가 만들지 않은 코드의 속을 파고 들어야 하는 상황에 직면하곤 한다. 이들 대부분은 문서가 형편 없고 어떤 경우는 소스의 일부만을 가지고 있는데 이는 나머지 부분을 디컴파일 해야 하는 것을 의미한다. 이건 지겨운 탐험이다. 이런 일을 해야 할 이유 중 큰 두가지는 코드를 유지보..... [Read More]

» Fünf Wege der Fehlersuche in Java from theserverside.de
Oft steht man vor dem Problem, sich in fremden Sourcecode einarbeiten zu müssen. Meistens ist dieser auch noch spärlich dokumentiert und nicht selten hat man nur einen Teil davon vorliegen und muss den Rest dekompilieren. Diesem anstrengenden Prozess... [Read More]

Comments

Hi,

I think you have forgotten a few other options. Many frameworks and middleware technologies offer interceptor like callbacks. For example CORBA offers a portable interceptor that can enable tracing across a complete distributed environment. There are other Java examples. ServletFilter!

It would be rather naive to build a solution based on one single approach ignoring the obvious built in support for tracing within some frameworks especially when it can be contextual rather than code based and span multiple processes.

By the way you really should separate the instrumentation technique for the actual tracing and data collection system.

At the end tracing is relatively straightforward what is not is the creation of a supporting runtime that has low overhead but collects a large amount of contextual data and performs automatic analysis.

Kind regards,

William Louth
http://blog.jinspired.com

Using AOP does not necessarily require rebuilding the code (compile-time weaving), it can also be done dynamically at class loading (load-time weaving), and there are many tools available (commercial, free and open-source) that utilize the technology.

Charlie Weiblen
www.performanceengineer.com

The easiest and best solution may well be to use AOP in a Spring application. 0 existing code change (you add some + configuration).

As William Louth said, ServletFilters are powerful indeed! See for example my own monitoring solution (shameless plug :-)) MessAdmin http://messadmin.sourceforge.net for the power they enable.

Nice post! Provides some good info for people evaluating different approaches.

In the proxy part, you might want to mention that it does introduce a little overhead. It's usually small compared to the benefit that you get from it, but still significant. i wrote a small article about that once on our blog: http://blog.interface21.com/main/2007/07/19/debunking-myths-proxies-impact-performance/

cheers,
Alef

Sorry Cedrik but your example does clearly point to the truth in second part of my posting.

Instrumenting is relatively straight forward with current frameworks (Spring, Guice) and technologies (Aspectj, Dynamic) but doing this with low overhead whilst collecting and recording a rich performance model that spans much more than a one technology layer (servlets) is a different and difficult dragon to slay. Not to mention tooling that works across clusters that can scale from measuring some code snippet to a large grid computing platform.

William

Guys, thank you for the great comments.

I will check out the options of weaving existing JARs. I'm doing Eclipse plug-in development and I'm just wondering how this is going to work considering that dependent plug-ins are not added as JARs to the build path, but rather in the Manifest as a required bundle. I'll update my post. I'm going to do another post with a short tutorial on using Aspects for logging in the near future.

The final goal of the techniques described is to understand the running code. I did not want to go into performance measurement, which is a different problem.

One of the reasons I'm writing such posts is to learn what other people are thinking. Again, thanks for all the comments.

The combination of a Java 6 agent, on-the-fly bytecode instrumentation, and a scripting language, can be considered as a different category in your classification of execution tracing methods.
The tool I wrote (http://www.adaptj.com ) does exactly that. Look at the intercept and the breakpoint commands.
It allows intercepting methods, changing their arguments, return value, and exceptions.
There is no need to start your application's JVM with any special flags or options. It works without any code changes.

Todor

If you are going to start changing arguments you might as well fire up a debugger.

No one is going to allow such a tool be used in a production environment and to be honest it would be impractical in a concurrent environment. So what's the problem with using a debugger within a IDE or simply just looking at the code and see the paths which is what I have had to learn to do for sometime as prior to Java 5 you could not load/run a profiler and debugger at the same time.

Now coming back to tracing. The problem with a large percentage of the tools I see being built again and again is that the focus is too much on the code itself rather than the actual contextual data (JNDI names, component names/alias, SQL commands, EJBQL/JCR queries, URL....). Attaching context is to a trace is extremely important and is flattening execution paths via local loop and global loop pattern compression. Another issue is there is no timeline analysis other than primitive log analysis - events and composite events (think CEP) help make the understanding possible.

An operations person is hardly likely to allow an -javaagent to be loaded into a JVM unless it has already being tested and sanctioned for a release of particular application. At that stage you might as well have added the agent to the command line especially as most agents can be disabled initially.

William

Well, just try it yourself, and see whether or not it works for your application.
Start the web start application. Attach a BeanShell console to your application.
Copy/paste the following class definition:
class Advice {
enter(m) { print(m.name); }
}

Intercept all methods in some class:

intercept("my.package.MyClass", ".*", new Advice());

It is that simple.
Feel free to send any comments to support@adaptj.com

Todor

Hi Todor,

It is not a question of whether it works. Its about the fact that it is a untested change which in the real-world of enterprise development is rarely allowed unless it is completely hidden from operations via a backdown or there is no management process in place. In my world getting an configuration setting changed involves communication, agreement, and co-ordination across a number of teams & departments. You would be quicker firing up the debugger if you did not have a diagnostics solution already in place.

Regards,

William

What about LimpidLog?
http://www.acelet.com/limpidlog/
"You do not need to hard code log statements. At runtime, you can register a class of interest and the system will log all important information about the class. When you are done, you can de-register the class and the class will not generate any log information."

For this purpose, I use DTrace. It gives unparalled access to information inside the Java stack and also below in the system calls. Unfortunately, DTrace is only available on Solaris, Mac OS-X Leopard, FreeBSD and OpenBSD. I have a laptop with Solaris just for this purpose.

Hi Steven,

Does this "unparalleled access" include the ability to relate database operations (SQL) statements with EJBQL/JPQL queries within a (distributed) transaction context possibly spanning multiple threads and processes. Tracing should be contextual providing a model that merges the code flow model with higher level tasks/activities/processes.

DTrace does have very interesting data collection capabilities but I think that the level is not practical in the context of understanding an application. It is more focused on the system execution model and not the software execution model.

William

Steven, Could you narrow down the platform short list you provided in relation to the Java support also stating which version of Java is actually supported.

William

Hey, so you haven't heard of MaintainJ(http://maintainj.com). MaintainJ uses AspectJ to trace either Java code or jar files to generate sequence and class diagrams. No coding or code changes are required - it is very quick and easy to generate the diagrams.

MaintainJ generated sequence diagrams are what a developer needs to understand the call flow - loop and recursive calls are removed, diagrams can be restricted to certain packages. One can explore the sequence diagrams with expand / collapse call options. MaintainJ also supports multi-threaded application. Check the home page and video demos for more info.

Before William jumps in, yes, it is very helpful to have context attached to the runtime sequence diagram and that is missing in MaintainJ at the moment. And the runtime weaving used by MaintainJ adds some runtime overhead, though it is very little.

Choudary Kothapalli.
http://MaintainJ.com

Hi Zviki,
(Just came across this post)

Sometimes I find placing conditional breakpoints with a println() statement very useful when I need to trace the execution but can't modify the source code.

(Very similar to .NET's tracepoints)

Genady

Post a comment

If you have a TypeKey or TypePad account, please Sign In