Dec 29, 2011

On The Trail of a JVM Bug

I recently had the opportunity to track a bug past my application code into the JVM and beyond. I learned a few things along the way.

I found a bug in an application I was working on that I narrowed down to a concurrency issue. Threads were deadlocked somewhere. I couldn’t figure it out in the application so I wrote the most minimal app I could that reproduced the problem. Here’s that code (I removed various System.out… calls for brevity):

public class Foo {

	private int counter;

	public static void main(String[] args) throws Exception {
		 new Foo().run();
	}

	public void run() throws Exception {

		Thread[] threads = new Thread[2];

		for (int i = 0; i < threads.length; i++) {
			threads[i] = new Thread() {
				public void run() {
					for (int i = 0; i < 1000000000; i++) {
						increment();
					}
				}
			};
		}

		for (Thread t : threads) {
			t.start();
		}

		for (Thread t : threads) {
			t.join();
		}
	}

	private synchronized void increment() {
		counter++;
	}
}

The code above deadlocks both created threads in the counting loop; zero CPU usage. I didn’t see any reason that should happen so I sent the code sample to some coworkers. They didn’t see a problem either. At this point I started trying to isolate environmental issues. I was running this hardware, OS, and JVM:

        MacBook Pro
        2.3GHz Intel Core i7
        8GB 1333MHz DDR3

        OS X 10.6.8 with 64 bit enabled

        Java version 1.6.0_29
        Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-10M3527)
        Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)

The deadlock didn’t happen if I used a Lock instead of a synchronized method. It also didn’t happen if I ran the JVM in 32 bit mode, only in 64 bit mode.

I started considering the possibility of a JVM issue. That’s not a conclusion to jump to lightly so I sent the problem to a prominent Java expert who in turn involved two others. I haven’t their permission to give their names. One of them was able to reproduce the issue running similar hardware.

Next we tried running on 2 other JVMs: openjdk version 1.7.0-ea and soylatte java version 1.6.0_03-p3. There was no deadlock on either of these JVMs. It was starting to look a lot like a JVM problem.

Once we determined it wasn’t a problem with the application code we ran the app until it deadlocked then forced a core dump. To force a core dump from the command line:

  1. ulimit -c unlimited – set this to allow core dumps
  2. kill -SIGSEGV pid – send a segmentation fault signal to force the app to crash

The core file is saved in /cores/core.pid. There is also a *.crash file saved in [user home]/Library/Logs/DiagnosticReports.

The crash file analysis showed this info for the 2 threads:

Thread 24:  Java: Thread-1
0   libSystem.B.dylib             0x00007fff89f88d7a mach_msg_trap + 10
1   libSystem.B.dylib             0x00007fff89f893ed mach_msg + 59
2   libclient64.dylib             0x000000010100d903 jio_snprintf + 37641
...

Thread 25:  Java: Thread-2
0   libSystem.B.dylib             0x00007fff89f88d7a mach_msg_trap + 10
1   libSystem.B.dylib             0x00007fff89f893ed mach_msg + 59
2   libclient64.dylib             0x000000010100db03 jio_snprintf + 38153
...

They are both waiting on a message from the kernel. It appears to be an issue with the kernel not delivering the message instead of a JVM problem but we aren’t ready to declare any definite answer yet. The next step is the core dump analysis. Wherever this leads or whether we definitively identify the problem, its been an interesting experience.

About the Author

Object Partners profile.

One thought on “On The Trail of a JVM Bug

  1. KIrk says:

    It’s been my experience that sometimes, with non-Java applications, I had things lock up in what appears to be a missed message. You might want to send this to Apple I’m sure the the Mac-Port guys would be very interested in it. (macosx-port-dev-request@openjdk.java.net)

Leave a Reply

Your email address will not be published. Required fields are marked *

Related Blog Posts
Simple improvements to making decisions in teams
Software development teams need to make a lot of decisions. Functional requirements, non-functional requirements, user experience, API contracts, tech stack, architecture, database schemas, cloud providers, deployment strategy, test strategy, security, and the list goes on. […]
JavaScript Bundle Optimization – Polyfills
If you are lucky enough to only support a small subset of browsers (for example, you are targeting a controlled set of users), feel free to move along. However, if your website is open to […]
Creating Mocks For Unit Testing in Go
Unit testing is an important part of any project, and Go built its framework with a testing package; making unit testing part of the language. This testing framework is good for most scenarios, but you […]
Resetting Database Between Spring Integration Tests
When tasked with having to write an integration test or a Spring Webflux test that uses a database, it can be cumbersome to have to reset the database between each test by using @DirtiesContext. Using […]