Sometimes, in large enterprise apps, I really want to work with the data in a more interactive and exploratory way – similar to the way Smalltalk or Lisp/Clojure programmers do. The last few apps I’ve worked on have involved complicated business rules – thousands of lines of Groovy or Java – and sometimes I wish I could see what the data looks like in the middle of all that processing.
When the number of objects is large and the object hierarchy is complicated, using an interactive debugger in Eclipse doesn’t help much, and parsing log files is even worse. That’s when I usually create a quick and dirty GUI to help during development. These apps tend to live on as production support tools. Groovy’s SwingBuilder and GroovyFX have both worked out great for this purpose. However, I’ve found the Groovy Console is the right tool for interactively working with a large collection of complicated objects.
For example, consider a system where a risk assessment is being performed on a large number of individuals. The system computes a risk score for each person in the group using sets of rules. When the system is all done processing, the only piece of business data that drives further processing (e.g. invoicing) is the final score. It’s not hard to imagine that going back after the fact to determine how the final score was computed might be difficult. This is where being able to interactively work with the data as it looked at certain points in the processing is useful.
For systems that involve complex processing like I’ve described, I will insert code to serialize the Java objects to a file at certain places in the processing – often right before or right after a component API is called. When there is a production support problem, I can read the exact state of the Java objects at the time of the problem. It’s much more valuable than parsing log files. In my experience, recent versions of Java are fast enough that saving this data doesn’t add much overhead to the production system.
Below is an example of a simple utility that can be used to serialize a collection of Java objects to a file. It’s written in Groovy, but the only thing “Groovy” about it is the lack of try/catch blocks; you could write this in Java if Groovy isn’t available on your production system.
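A minimal sketch of such a utility might look like the following. The `Assessment` class, the `SnapshotUtil` name, and the `assessments.ser` path are hypothetical, made up for illustration; the only real requirement is that everything in the collection implements `Serializable`.

```groovy
// Hypothetical domain class, standing in for a real production class.
// Anything written to a snapshot must implement Serializable.
class Assessment implements Serializable {
    private static final long serialVersionUID = 1L
    String personId
    int riskScore
}

// Hypothetical helper: writes a whole collection to a file as one object.
class SnapshotUtil {
    static void save(Collection items, String path) {
        // withObjectOutputStream closes the stream when the closure returns
        new File(path).withObjectOutputStream { out ->
            out.writeObject(new ArrayList(items))
        }
    }
}

def assessments = [
    new Assessment(personId: 'p1', riskScore: 40),
    new Assessment(personId: 'p2', riskScore: 60),
    new Assessment(personId: 'p3', riskScore: 80)
]

// Call this right before or right after the component you want to inspect.
SnapshotUtil.save(assessments, 'assessments.ser')
```

Copying the collection into an `ArrayList` before writing it is a small design choice: it means the snapshot can be written even when the caller passes in a collection type that is not itself serializable.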
From the Groovy Console, it is easy to deserialize the file and start interacting with a collection of production objects just like you would with tables in SQL. The Groovy Console can be launched from the command line or directly from Eclipse, using the Groovy plugin. If you choose to run it from Eclipse, all the classes in your project classpath are automatically available to you.
Below is a screenshot of the console, with some simple code for deserializing some test data and then computing a simple average. This is something that you wouldn’t easily be able to do by parsing log files or stepping through a debugger.
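The kind of console snippet described above could be sketched roughly as follows. The `Assessment` class and its `riskScore` field are assumptions for illustration, and the script writes a little test data to a temporary file first so that it runs on its own; with a real snapshot you would point `file` at the production file and rely on the classes already on your classpath.

```groovy
// Hypothetical domain class, standing in for a real production class.
class Assessment implements Serializable {
    private static final long serialVersionUID = 1L
    String personId
    int riskScore
}

// Write a little test data so this sketch is self-contained.
def file = File.createTempFile('assessments', '.ser')
file.deleteOnExit()
file.withObjectOutputStream { out ->
    out.writeObject([
        new Assessment(personId: 'p1', riskScore: 40),
        new Assessment(personId: 'p2', riskScore: 60),
        new Assessment(personId: 'p3', riskScore: 80)
    ])
}

// Read the collection back. Passing the script's class loader lets the
// stream resolve classes that were compiled by Groovy.
List<Assessment> assessments = file.withObjectInputStream(getClass().classLoader) { input ->
    input.readObject()
}

// Query the objects much like you would query a table.
def average = assessments*.riskScore.sum() / assessments.size()
println "Average risk score: ${average}"

def highRisk = assessments.findAll { it.riskScore > 50 }*.personId
println "People scoring above 50: ${highRisk}"
```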
In the systems where I’ve really needed this kind of tool, the amount of data has been quite large. The serialized file has been several hundred megabytes. Deserializing the collection every time I want to execute a new snippet of code takes time and breaks the interactive flow. A way around this is to write a simple script that will deserialize the data once, place the data into a Groovy Console binding, and then programmatically launch the Groovy Console, as shown below.
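A launcher script along these lines might look like the sketch below. It assumes the Groovy Console module (`groovy.ui.Console`) is on the classpath; the `Assessment` class and the sample data are hypothetical and are only there so the script runs on its own. In practice `file` would point at a real snapshot.

```groovy
import groovy.ui.Console
import java.awt.GraphicsEnvironment

// Hypothetical domain class, standing in for a real production class.
class Assessment implements Serializable {
    private static final long serialVersionUID = 1L
    String personId
    int riskScore
}

// Stand-in for a large production snapshot, so this sketch is self-contained.
def file = File.createTempFile('assessments', '.ser')
file.deleteOnExit()
file.withObjectOutputStream { out ->
    out.writeObject([
        new Assessment(personId: 'p1', riskScore: 40),
        new Assessment(personId: 'p2', riskScore: 60),
        new Assessment(personId: 'p3', riskScore: 80)
    ])
}

// Deserialize once, up front. This is the slow step for a large file.
def loader = getClass().classLoader
def assessments = file.withObjectInputStream(loader) { input -> input.readObject() }

// Expose the collection to console scripts under the name 'assessments'.
def binding = new Binding()
binding.setVariable('assessments', assessments)

// Launch the console only when a display is available.
if (!GraphicsEnvironment.isHeadless()) {
    // Passing the loader keeps the domain classes visible to console scripts.
    new Console(loader, binding).run()
}
```

Every script run in that console window shares the same binding, so `assessments` stays in memory between executions.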
This allows you to quickly change the Groovy code without the overhead of deserializing every time. As you can see in the screenshot below, I didn’t have to do anything to explicitly deserialize the data in the Groovy console – the assessments collection is already there to use.
Nothing I’ve shown above is especially complicated to do. However, I’ve found that this approach of creating snapshots of the Java objects while a system is running and then loading them later to interactively work with them can be very powerful. It’s a great way to quickly get to the bottom of production support problems in large enterprise systems where it’s hard to see what’s going on because of the sheer size of the business rules and the data.