Monday, October 28, 2013

Java class loading anomaly

Today I learned about a behavior of the Java language that I initially found rather unintuitive. Of course, this is not technically an anomaly but something well-defined by the JLS and the JVMS. However, I was not aware of the class loading behavior described in this blog entry despite having read the specification, so I decided it was worth sharing.

I stumbled onto this when I wondered why an annotation value cannot be taken from a static field that references an enum constant, while this is allowed for String and primitive values. It turns out that the Java compiler is not allowed to substitute enum fields at compile time, whereas it does substitute constant String and primitive fields. But what does this mean in practice?
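To make the restriction itself concrete first, here is a minimal sketch. All names in it (Labeled, Level, Defaults, Annotated, AnnotationValueDemo) are made up for illustration and are not part of the example that follows. The String member accepts a static final field, while the enum member has to name the enum constant directly; the commented-out variant is the one javac rejects.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

class AnnotationValueDemo {
  // Reads the annotation back reflectively so the accepted values are visible.
  static String describe() {
    Labeled labeled = Annotated.class.getAnnotation(Labeled.class);
    return labeled.text() + " / " + labeled.level();
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}

enum Level { LOW, HIGH }

@Retention(RetentionPolicy.RUNTIME)
@interface Labeled {
  String text();
  Level level();
}

class Defaults {
  // A constant variable: javac may copy its value into the annotation.
  static final String TEXT = "from a constant";
  // Not a compile-time constant, even though it is static and final.
  static final Level LEVEL = Level.HIGH;
}

// Accepted: the String comes from a field, the enum constant is named directly.
@Labeled(text = Defaults.TEXT, level = Level.HIGH)
// Rejected by javac if used instead:
// @Labeled(text = Defaults.TEXT, level = Defaults.LEVEL)
class Annotated {}
```

Running main should print the String taken from Defaults.TEXT followed by the enum constant's name.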

Let's look at this example class:

@MyAnnotation(HelloWorldHelper.VAL1)
class MyClass {
  public static void main(String[] args) {
    System.out.println(MyClass.class.getAnnotation(MyAnnotation.class).value());
    System.out.println(HelloWorldHelper.VAL2);
    System.out.println(HelloWorldHelper.class.getName());
    System.out.println(HelloWorldHelper.VAL3);
  }
}

with the following helper classes:

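// Helper types; the imports are needed for @Retention below.
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

enum MyEnum {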
  HELLO_WORLD_ENUM
}

@Retention(RetentionPolicy.RUNTIME)
@interface MyAnnotation {
  String value();
}

class HelloWorldHelper {
  public static final String VAL1 = "Hello world!";
  public static final String VAL2 = "Hello world again!";
  public static final MyEnum VAL3 = MyEnum.HELLO_WORLD_ENUM;
  static { System.out.println("Initialized class: HelloWorldHelper"); }
}

The output, which surprised me at first, is:

Hello world!
Hello world again!
HelloWorldHelper
Initialized class: HelloWorldHelper
HELLO_WORLD_ENUM

But why is this so? The Java compiler replaces references to constant String fields (the same is true for primitives) with the value itself, which it writes directly into the constant pool of the referencing class. MyClass therefore never touches HelloWorldHelper when it reads VAL1 or VAL2. This also means that you cannot swap in a different version of HelloWorldHelper at runtime and expect MyClass to pick up its values; MyClass keeps the values it was compiled against until it is recompiled. Only the MyEnum value is resolved at runtime, which causes the HelloWorldHelper class to be loaded and initialized, as the output of the static block shows.

Why is this substitution allowed for Strings but not for enums? I can only guess, but the reason might well be that the Java language specification treats strings differently from other object types, including the primitive wrapper types. Usually, copying an object into another class would break Java's contract of object identity. Strings, on the other hand, remain identical even after being technically duplicated, because the JVM interns constant strings so that equal constants refer to the same instance. As mentioned before, primitives can also be copied into the referencing class since primitive values have no notion of identity. A field of a boxed type such as Integer, however, is not a compile-time constant, so referencing it would load and initialize the HelloWorldHelper class.
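The difference between the field types can be observed side by side. The following is a small sketch with invented names (ConstantInliningDemo, Color and the four holder classes are my own, not part of the example above). Each holder logs its own initialization, so the log shows which field accesses actually reach the declaring class at runtime.

```java
import java.util.ArrayList;
import java.util.List;

class ConstantInliningDemo {
  // Collects messages in the order in which they occur.
  static final List<String> LOG = new ArrayList<>();

  // Fills LOG on the first call only, because a class is initialized at most once.
  static List<String> trace() {
    if (LOG.isEmpty()) {
      LOG.add("String: " + StringHolder.VALUE);   // value inlined by javac
      LOG.add("int: " + IntHolder.VALUE);         // value inlined by javac
      LOG.add("Integer: " + IntegerHolder.VALUE); // field read at runtime
      LOG.add("enum: " + EnumHolder.VALUE);       // field read at runtime
    }
    return LOG;
  }

  public static void main(String[] args) {
    trace().forEach(System.out::println);
  }
}

enum Color { RED }

class StringHolder {
  static final String VALUE = "Hello";
  static { ConstantInliningDemo.LOG.add("init StringHolder"); }
}

class IntHolder {
  static final int VALUE = 42;
  static { ConstantInliningDemo.LOG.add("init IntHolder"); }
}

class IntegerHolder {
  // Boxed type: not a constant variable, so no inlining takes place.
  static final Integer VALUE = 42;
  static { ConstantInliningDemo.LOG.add("init IntegerHolder"); }
}

class EnumHolder {
  static final Color VALUE = Color.RED;
  static { ConstantInliningDemo.LOG.add("init EnumHolder"); }
}
```

With this sketch, only IntegerHolder and EnumHolder should report their initialization, each right before its value is logged. StringHolder and IntHolder never appear in the log because the compiled demo class no longer refers to them.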

Interestingly enough, HelloWorldHelper.class.getName() does not require the HelloWorldHelper class to be initialized. The generated byte code shows that the HelloWorldHelper class is actually referenced this time and will as a matter of fact be loaded into the JVM. However, JVMS §5.5 does not list the use of a class literal among the events that trigger class initialization, which is why the output appears in the order observed.
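The distinction between loading and initializing a class can also be made visible in isolation. This is a rough sketch with made-up names (ClassLiteralDemo and LazyHelper). It uses a class literal first, then Class.forName with its initialize flag set to false, and finally the one-argument Class.forName, which does initialize the class.

```java
import java.util.ArrayList;
import java.util.List;

class ClassLiteralDemo {
  // Collects messages in the order in which they occur.
  static final List<String> LOG = new ArrayList<>();

  // Fills LOG on the first call only, because a class is initialized at most once.
  static List<String> trace() {
    if (LOG.isEmpty()) {
      try {
        // A class literal makes the class available but does not initialize it.
        String name = LazyHelper.class.getName();
        LOG.add("after class literal");

        // Explicitly request the class without initialization.
        Class.forName(name, false, ClassLiteralDemo.class.getClassLoader());
        LOG.add("after forName(initialize=false)");

        // The one-argument variant initializes the class.
        Class.forName(name);
        LOG.add("after forName");
      } catch (ClassNotFoundException e) {
        throw new IllegalStateException(e);
      }
    }
    return LOG;
  }

  public static void main(String[] args) {
    trace().forEach(System.out::println);
  }
}

class LazyHelper {
  static { ClassLiteralDemo.LOG.add("init LazyHelper"); }
}
```

The initialization message should show up only between the last two entries, that is, once the one-argument Class.forName is called.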