« October 2007 | Main | December 2007 »

November 2007

Nov 29, 2007

Five ways for tracing Java execution

I'm often faced with a situation where I need to dig into code that I didn't write. Most of it is poorly documented and, in some cases, I only have some of the sources, which means I need to decompile the rest. It's a tedious exploration. There are mainly two reasons for that: maintaining a code or using it. The later refers to 3rd party frameworks and class libraries I need to build upon. For example I may implement an interface of an event handler and I may be uncertain regarding which event fires when and what are the exact arguments I'm getting.

I usually find that it is faster to trace the code at runtime. Especially when it comes to non-trivial links between classes, like an interface with multiple implementations which may be picked at runtime. The smörgåsbord of design time tools is just insufficient.

In this post, I will summarize the common methods (I know of) for tracing the runtime execution. The tips and examples are assuming you're using Eclipse, but most of them can be achieved with other IDEs as well. I will provide some more detailed information regarding some of the techniques mentioned below in future posts.


The Basic: Breakpoints and Step-by-step Execution

Start with the simplest way: set up breakpoints and start tracing your execution.

Pros Cons
  • Very simple to use and setup. Practically, all you need is a running debugger.
  • Provides access to ALL the information, including method attributes, local variables, etc.
  • Does not modify the code.
  • Selective: you can decide where to break.
  • Stepping through code just takes too long. Moreover, you need constant attention. For each step you should decide whether you're stepping in, over or out. This makes this method impractical.
  • When debugging event handlers one must setup breakpoints on all methods, since you're not sure which method will be called.
  • Impossible to set breakpoints inside method when you don't have the source code.

It is best when: You need a quick and simple solution, you have all the code and you know where you want to stop. You need elaborate information in a given point (arguments, local variables, etc.).

My tip: Even if you don't have all the sources, it's possible to break when entering methods or classes by setting-up method breakpoints or class loading breakpoints.

  • Class loading breakpoints are simple to set up: from Eclipse Run menu, select "Add Class Load Breakpoint...".
  • In order to set method breakpoints, you will need to decompile the class file first. Use the JadClipse plug-in for that. After you decompile, you may select the method and choose "Toggle Method Breakpoint" from the "Run" menu (or by right clicking the left bar in the file next to the method header). Note that you may set regular breakpoints in decompiled sources as well, but these will not work.

The Primal: Debug messages

We continue by setting up debug messages. The simplest way is to use System.out.println statements to print out messages to the console. The next step would be to use a logging mechanism: the basic JDK logging, Apache Commons Logging or Log4j.

Pros Cons
  • Very selective.
  • Speedy execution (in case you didn't put too many).
  • Allows printing the exact information you're interested in.
  • Requires code modification. This makes this impractical when you don't own the sources.
  • Takes time to set up. In case you're not sure what you're looking for, you'll need to introduce debug messages to many methods.
  • Makes the code dirty and introduces run-time overhead if you don't take the debug messages out afterwards.

It is best when: You own the code and you have a good idea of what you're looking for. Very good solution for event handlers. It's high performance makes it practical for understanding which event is fired when throughout the execution of complex flows.

My tip: It's easy to set up one generic method that prints out the method currently executing. The best way I'm familiar with is to get the stack trace. The last entry on the call stack will be the debug message method. The entry before that will be the method invoking it. How to obtain a stack trace:

  • Since JDK 5.0: Use Thread.currentThread().getStackTrace().
  • Before (JDK 1.4): create a new exception instance and use getStackTrace to obtain your call stack from the new instance (no need to "throw" it).

The Hot Shot: Dynamic Proxy

An improvement over simple debug messages. Dynamic Proxy is a special Java feature which enables the developer to introduce a proxy class, sitting in front of a given class and intercepting all the calls through a given interface. Once you intercept the calls, you may print out appropriate debug messages with all the information you require.

Pros Cons
  • No need to infest the code with debug messages. The debug messages can be done in one central location.
  • Simple to remove once you're done, thus, has no affect on performance.
  • Only works through interfaces. Useless for non-public methods and public methods which are not defined in an interface.
  • Requires code modification: you need to own the code.
  • Non selective messages printed.

It is best when: It's a great solution for event handlers. You can set up a dummy event handler with a generic proxy in seconds and see the sequence of events. This is simplest and quickest method when it comes to understanding event handlers.

My tip: You can find a code for a generic dynamic proxy on Sun's site. It's OK, but it has one small minus: it only implements the immediate interfaces implemented by the class of the instance being wrapped by the proxy. It does not implement any interface of its' ancestors. This can be easily fixed by traversing the list of ancestors and picking all the interfaces along the way. In one of the future posts I'll give a better version.


The Brute Force: Run-time Profiler

Profilers are powerful tools that trace all the calls in the system through special JVM hooks. However, it's like using a 10 pound hammer on a half inch nail.

Pros Cons
  • No code modification.
  • Can trace all the execution, even without source code.
  • Most tools provide the ability to filter data which is less interesting (like JDK calls, getters and setters, etc).
  • Most good tools are expensive.
  • Most tools take time to setup and master.
  • Profiling will greatly degrade the system performance. It's impractical to have a long execution with the profiler collecting information.
  • Most tools are geared towards performance measurement and neglect showing an execution path according to the actual time line.
  • Most tools provide the calls but not the arguments.

It is best when: You want a complete picture for a very specific operation (i.e. very short execution flow).

My tip: The good tools cost around $500 per seat. Most free tools are very basic. IMHO, The best free open-source tool for the job is Eclipse Testing and Performance Tools Platform (TPTP). I've had very good experience with the tool and I highly recommend it. It can  provide a sequence diagram which is one of the best ways for understanding execution. TPTP used to be my favorite before I switched to the Mac and found it's incompatible (maybe I'll try to fix it if I'll have some spare time).


The New Age: Aspects

Aspect Oriented Programming (AOP) is a non-trivial idea. Without going into the concept of Aspects, I'm just looking at the bottom line: it's a quick and easy way of intercepting the execution of your code. You can selectively set hooks around methods, constructors, field access, etc., without modifying the original code. In these hooks you can print debug messages.

Pros Cons
  • Very quick and easy to setup.
  • Selective: you can trace just the classes or method you require.
  • Provides all the information, including arguments.
  • No modification to the original code. Very easy to remove once you're done.
  • Good performance. May be used in long executions.
  • Requires some basic understanding of AOP. Especially for setting up more complex filters.
  • Requires rebuilding your code. This can be a problem when you need to trace a code found in a packaged JAR.

It is best when: You want to trace the execution of a code you can rebuild.

My tip: Use it. AOP is still far from the mainstream. It is uncertain whether it will ever get there. Without debating its' qualities and features, most developers can be up and running with AOP based logging in about an hour. I'll show the exact steps in one of the future posts. For Eclipse users, I recommend the Aspect Java Development Toots (AJDT).


Conclusion

What method should you choose? I'll divide this into two different cases: tracing the complete execution of a short sequence and tracing events to/from a given class or tier.

When tracing the complete execution:

  1. A profiler is the simplest way. IMHO, every developer should be able to operate a profiler and have one readily available in the arsenal.
  2. The next choice would be AOP. It's the second choice only because setting up a profiler is simpler when it comes packaged JARs. However, AOP gives more control. For example, I can see the method arguments and print them.
  3. Use debug messages.

When tracing events:

  1. Use AOP. Selectively capture just the interesting calls.
  2. Use a Dynamic Proxy. It's like using debug messages, but a more elegant solution. AOP may be more complex first, but in the long run I would go for AOP.
  3. Use debug messages.
  4. Use breakpoints.

As you can see, I'm not a big fan of breakpoints in this context. Breakpoints are great when debugging your code, but not for tracing execution. There are far more efficient methods of doing that.

Update: as some readers pointed out, AspectJ can be used for weaving packaged JARs. I will explore this option and AspectJ in a future post.

Nov 26, 2007

Developer note: thinking about Java interfaces

Last week I gave another training. The group consisted of mostly fresh developers. One of the characteristics of such developers, is misunderstanding the Interface concept. When I say "interface" I'm referring to Java interfaces (interfaces were not part of the original OOP concept). In my opinion, understanding and using interfaces correctly is a positive sign of a mature and experienced enterprise developer.

Interfaces are the path which leads to the decoupling heaven. A developer working on his/her own or in a very small team cannot fully comprehend the importance of true decoupling of components. For me, I think it was COM which made the difference. COM, Microsoft's Component Object Model, defines a platform and a protocol for connecting binary components. Binding binary components means that the components are compiled to a binary form, hence, can be written in virtually any language. The icon representing an object (or component) in COM was a box with "handles" which represent the interface. An object can only be handled by the interface.

COM Object Icon

Obviously, this is not the case in Java. Java allows the tight coupling of objects. This leaves the responsibility to the developer.

Three thumb rules to achieve implementation independence:

  • Return interfaces from your methods - do not return an actual implementation. This is especially important when dealing with public methods.
  • When casting - cast to an interface rather than an implementation.
  • All public methods should be defined in an interface. All other "implementation" methods should be kept private or protected.

This can bring you to the decoupling utopia... but it can also drive you crazy. Because:

  • You will need to write and manage more code.
  • Program structure becomes more complex.
  • It's difficult to trace the program execution path when statically browsing the code.

The last point is a tough one. Modern IDEs can easily show you a "Call Hierarchy" view or simply jump to the method body. However, when working with interfaces this feature is disabled, since the execution refers to the interface and not the implementation. Here's an example of a Call Hierarchy view from Eclipse:

Call Hierarchy

The solution is probably in a middle way. The developer needs to decide where it is necessary to decouple and where this is just a waste of time. It is difficult to give a well defined set of rules for the decision, hence, different people in the same team may come to different conclusions.

Final note: if the tools were good enough to work with interfaces intuitively, would you be using them to their full extent?  I know I would.

Nov 19, 2007

OmniFocus: the next best thing for task management

I admit, I have a tendency to forget stuff. My attention can easily jump from one task to another and there may be casualties in the process. That's why I like to organize my tasks in lists. However, I have yet to find a task manager which fits the way I think. Until now.

My main problem with existing task managers is their inability to create outlines, a multi-leveled structure. I was trained to think in outlines. I write in bullets which have sub-bullets. Making me think of tasks as a flat list, all at the same level, confuses me. The span of items I can see and contain is limited. Thus, if the list is too long I just get stuck. I must break it into a hierarchy which I can visualize and comprehend. It works the same way with code, by the way.

That's exactly what makes OmniFocus different and a clear winner, from my point of view. It has an incredible flexibility when jotting down the tasks outline. Creating tasks is as easy as writing in bullets with a word processor. On top of that, OmniFocus adds very useful views for planning and execution. You can focus on a project or a subtask, or focus on your next actions. It also complies to the GTD method (Getting Things Done), which seems to be very trendy nowadays.

OmniFocus is the latest application from The Omni Group. It's a small software company which makes software solely for the Mac platform. I got OmniOutliner bundled with my Mac and it quickly became one of my essential tools. It is just plain useful. I can baldly say that the Omni products are a strong reason to switch over to the Mac. Surprisingly, you will not find parallel tools in Windows.

OmniFocus is currently in private alpha and can be pre-ordered for 50% discount. If you use a Mac and you're swamped with tasks, this is something you need to check out.

Nov 09, 2007

This is guaranteed to make a developer laugh

I have neglected my blog in the passing two weeks or so. I have some ideas for posts, but I'm still working on it (this was an explanation ;-).

In the mean while, something different. One of the most effective stress relieving techniques is just a good laugh.  As a software developer, there's this one site which is guaranteed to make me laugh. Better than reruns of Seinfeld. It's called Worse Than Failure (or WTF, yeah).

They have all kind of computer programs mishaps. It mainly falls into these categories (although there are some others as well):

  • CodeSOD - interesting and usually incorrect code snippets taken from actual production code in a commercial and/or open source software projects.
  • Tales from the Interview - Job interview stories.
  • Error'd - fun error messages and other visual oddities from the world of IT.

The last one is a true jewel. It features UI bloopers that never cease to amaze me. Goofy dialog boxes with cryptic error messages, web sites with overflown numbers and what not. I highly recommend adding it to your RSS reading list for those moments you just need a break.