Thursday, May 10, 2012

Font Crashing / Terminating JVM in Linux with X11 forwarding

I recently ran into an issue where running a certain report on a specific server made the JVM disappear without a trace! There were no core dumps and no errors in the logs, just a 500 response from Apache, and the process was gone.

I struggled with this for a while, as I could reproduce the issue reliably but only on one server. By pure coincidence I could see the terminal when the application was started, and there was a cryptic X11 error message:

X connection to localhost:10.0 broken (explicit kill or server shutdown)

This server also turned out to be one where we had to sudo su to a different user to start the application, and had hacked around with the Magic Cookie for X11 forwarding. This meant that although I had X11 enabled on my session, it was broken.

Removing the X11 forwarding from my ssh session fixed the issue. I never managed to work out whether it was a bug in the JVM or my broken X11 session sending a kill signal to the process. Full details of my digging are over the jump.

I had some memory of font issues and X in the past, so I hooked a debugger into the application and managed to narrow this down to one line in the Apache POI library which was trying to retrieve a font:

font.canDisplayUpTo(text, start, limit);

and when this code is executed the JVM disappears.

With this lead I started digging into the code in POI and wrote my own class called FontCrasher, which I could compile and run from the command line. With it I could reproduce the issue repeatedly, but I was not able to make the JVM produce any core dumps.
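For illustration, here is a minimal sketch of what a FontCrasher-style class could look like. It is not the attached code: the class name FontCrasherSketch, the "SansSerif" font, the sample text and the describeDisplay helper are all assumptions made for this example. It prints the display settings first, so that if the JVM vanishes you can at least see how far it got, and then makes the same kind of canDisplayUpTo call that POI was making.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;

// Hypothetical stand-in for the FontCrasher class described above.
public class FontCrasherSketch {

    // Summarises the display settings the JVM sees. With ssh -X the
    // DISPLAY variable is typically something like localhost:10.0.
    public static String describeDisplay() {
        return "DISPLAY=" + System.getenv("DISPLAY")
                + ", headless=" + GraphicsEnvironment.isHeadless();
    }

    // Makes the same kind of call that POI was making. The font name and
    // size are assumptions, any font lookup may behave the same way.
    public static int probe(String text) {
        Font font = new Font("SansSerif", Font.PLAIN, 10);
        char[] chars = text.toCharArray();
        return font.canDisplayUpTo(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        System.out.println(describeDisplay());
        System.out.println("About to call canDisplayUpTo...");
        System.out.flush(); // make sure the line is out before a possible crash

        int result = probe(args.length > 0 ? args[0] : "Hello, POI");

        // On a healthy system this line is reached; -1 means every
        // character in the text can be displayed by the font.
        System.out.println("canDisplayUpTo returned " + result);
    }
}
```

If the last line never appears and the process is simply gone, the font lookup is the likely culprit.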

I was developing a suspicion at this point that X was sending a kill signal when the font lookup failed, so I started poking around signal handling in Java. This is unsupported, implementation dependent and probably not recommended for you to try at home! Based on a bit of googling and the examples from http://twit88.com/blog/2008/02/06/java-signal-handling/ and http://www.dclausen.net/javahacks/signal.html I put together a buggy version of the FontCrasher that also intercepted the signals from the OS. This worked, but did not get me any further as I never got any useful signals out of it.
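As a rough sketch of this kind of signal interception (again, not the attached FontCrasherWithSigs), the example below uses the unsupported sun.misc.Signal class to log a signal when it arrives. The class name SignalProbe and the choice of TERM and HUP are assumptions for this example. Note that SIGKILL and SIGSTOP can never be caught by a process, so a handler like this cannot see them.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import sun.misc.Signal;

// Hypothetical example of intercepting OS signals with the unsupported
// sun.misc.Signal API. javac will warn that this is an internal API.
public class SignalProbe {

    // Installs a logging handler for the named signal (e.g. "TERM") and
    // returns a latch that opens the first time the signal is received.
    public static CountDownLatch watch(String name) {
        CountDownLatch seen = new CountDownLatch(1);
        Signal.handle(new Signal(name), sig -> {
            System.err.println("Received SIG" + sig.getName()
                    + " (number " + sig.getNumber() + ")");
            seen.countDown();
        });
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch term = watch("TERM");
        watch("HUP");

        // Send SIGTERM to this very process to check that the handler is
        // wired up. The handler runs asynchronously on another thread,
        // so wait for it rather than assuming it has already run.
        Signal.raise(new Signal("TERM"));
        boolean delivered = term.await(2, TimeUnit.SECONDS);
        System.out.println("TERM delivered to handler: " + delivered);
    }
}
```

In a real test you would install the handlers first and then make the font call. Some signals are reserved by the JVM itself, and trying to register a handler for one of those throws an IllegalArgumentException.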

At this point I had to stop and have not been able to decide whether it is X or a JVM bug that is killing my JVM. Given that the root cause of the issue is my hacking around with the X Magic Cookie I figured it would be a bit rich to try to log this as a bug.

The code for FontCrasher and the version with signal handling (FontCrasherWithSigs) is attached. Note that the version that handles signals is actually very buggy and you will need to kill -9 it as it has an endless loop in the kill signal handling!

To reproduce the issue you will need to do the following:
ssh -X user1@server
cp ~/.Xauthority /home/user3/ (this breaks the X forwarding for user3)
sudo su user3
javac FontCrasher.java
java FontCrasher

2 comments:

  1. I was using POI to export an Excel spreadsheet. Everything was working until I switched Java implementation from Sun Java to OpenJDK. Now, whenever I click on the "Export" button in my web app, it invokes Apache POI, which kills the JVM and the Tomcat server. Is there a workaround for this issue?

    1. Sorry, I am not familiar with this issue. The problem I ran into was due to fonts and the X server.

      What is the call that your code fails on? Is it something to do with styling a cell? Are you running this on Linux or something else?

      I found that when I styled a cell it called the font libraries which then caused the crash.
