Wednesday, July 9, 2008

Loading, Linking, & Initialization of Types in Java

The Java Virtual Machine makes Types available to a Java program under execution. Here we'll discuss user-defined Types (classes and interfaces); built-in types go through a similar set of phases and are normally loaded as part of JVM start-up. Every Type goes through the following phases during its life cycle:

  • Loading - this is the process of locating the bytecodes (.class file) of the corresponding Type and bringing them into JVM memory.
  • Linking - this is the process of incorporating the loaded bytecodes into the Java Runtime System so that the loaded Type can be used by the JVM.
  • Initialization - this is the process of executing the static initializers of the loaded and linked Type.
  • Unloading - this is the process of reclaiming the memory occupied by the loaded, linked, and initialized Type. This happens when there are no references to that Type, which makes the Type eligible for garbage collection. It's important to note that garbage collection of objects is different from that of Types. An object becomes eligible for collection as soon as it is unreachable, but a Type can be unloaded only when all the objects of that Type have already been garbage collected and nothing else refers to the Type (in practice, this also means its defining class loader must be unreachable).
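To see that Loading and Initialization really are separate phases, here is a minimal sketch. It uses the standard three-argument Class.forName, whose second parameter controls whether the Type is initialized. The class names LoadVsInitDemo and Helper are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LoadVsInitDemo {
    // Records the order in which things happen.
    static final List<String> events = new ArrayList<>();

    // Hypothetical helper Type whose static initializer leaves a trace.
    static class Helper {
        static {
            events.add("Helper initialized");
        }
    }

    static List<String> run() throws ClassNotFoundException {
        events.clear();
        ClassLoader loader = LoadVsInitDemo.class.getClassLoader();
        String name = LoadVsInitDemo.class.getName() + "$Helper";

        // initialize = false: the Type is loaded, but its static initializer is not run.
        Class<?> helper = Class.forName(name, false, loader);
        events.add("loaded " + helper.getSimpleName());

        // initialize = true: initialization is requested explicitly.
        Class.forName(name, true, loader);
        events.add("done");
        return events;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

In a fresh JVM the "loaded Helper" entry is recorded before "Helper initialized", which shows that obtaining the java.lang.Class instance did not by itself run the static initializer.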

Creation of java.lang.Class instance of a Type

An instance of the class java.lang.Class is created for every Type loaded into the underlying JVM. The creation of this instance marks the end of the loading phase of the Type. If an application is distributed across multiple JVMs, then one (or more) Types of that application may need to be loaded multiple times (once per JVM), as a JVM can't use a Type without the corresponding java.lang.Class instance.

While loading a Type, the bytecodes (of the .class file) are first converted into a binary stream so that they can be brought into JVM memory. Once the stream has been successfully brought into JVM memory, its format is checked against the class file specification. This step helps make Java code secure: if the loaded binary stream does not have a valid structure (because it was tampered with, accidentally or intentionally), the JVM can reject it (typically by throwing a ClassFormatError) and thereby greatly reduce the possibility of running malicious or dangerous code on the machine. This is not possible in languages like C, where the compiled and linked executables (.exe files on Windows, a.out files on Linux) are composed of native instructions and are executed directly by the underlying operating system.

But we know that Verification is a part of the Linking phase, so what does the verification done in the Loading phase mean? How are the two verifications different? Well... verification is certainly a part of the Linking phase (it's the first step of the Linking phase), but implementations also perform checks of different kinds during various other stages. This is done to ensure the integrity of the running JVM. The specification clearly says that a loaded Type should not cause the JVM process to hang, and if an implementation removed the checks completely from the Loading phase, then malicious bytecode might cause the JVM to hang (or to do something unpredictable) while creating the java.lang.Class instance for the loaded Type.
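The load-time format check can be observed with a small sketch like the following. RawLoader is a hypothetical class loader written only for this illustration; it hands arbitrary bytes to the standard protected method ClassLoader.defineClass. The byte values are deliberately not a valid class file.

```java
class FormatCheckDemo {
    // Hypothetical loader that exposes defineClass for raw bytes.
    static class RawLoader extends ClassLoader {
        Class<?> define(byte[] bytes) {
            // Passing null as the name lets the JVM take the name from the bytes.
            return defineClass(null, bytes, 0, bytes.length);
        }
    }

    static String tryDefine(byte[] bytes) {
        try {
            Class<?> type = new RawLoader().define(bytes);
            return "defined " + type.getName();
        } catch (ClassFormatError e) {
            // The structure check failed before any java.lang.Class instance was created.
            return "rejected: ClassFormatError";
        }
    }

    public static void main(String[] args) {
        // Eight bytes that do not begin with the class file magic number.
        byte[] garbage = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
        System.out.println(tryDefine(garbage));
    }
}
```

The malformed stream never becomes a usable Type: the JVM refuses it at load time, long before any of its instructions could be executed.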

This java.lang.Class instance is the complete representation of the loaded Type, built using implementation-dependent internal data structures. It is subsequently used by the various steps of the Linking phase (and also during the Initialization phase). These two phases - Linking and Initialization - don't deal with the raw byte stream read during the Loading phase.
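From the program's side, this representation is visible as a single java.lang.Class object shared by every instance of the Type. A minimal sketch, with Point as a made-up example class:

```java
class ClassObjectDemo {
    // Hypothetical Type used only for this illustration.
    static class Point {
        int x;
        int y;
    }

    static boolean sameClassObject() {
        Point a = new Point();
        Point b = new Point();
        // Both objects refer to the one Class instance created when Point was loaded.
        return a.getClass() == b.getClass() && a.getClass() == Point.class;
    }

    public static void main(String[] args) {
        System.out.println(sameClassObject());
        // The Class instance also carries the Type's metadata, such as its name.
        System.out.println(Point.class.getName());
    }
}
```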

java.lang.LinkageError - what does it signify & when is it thrown?

The Java specifications don't place any restriction on the actual timing of the Loading and Linking phases of a Type (nor on the exact timing of the Unloading phase, as that is decided by the garbage collection algorithm), but the Initialization phase is required to happen only when the Type gets its first active use. One thing is obvious: these phases are always performed in this order. How could a Type be linked before being loaded? And how could the initializers be executed without linking? Most implementations do the Loading (and many do the Linking as well) in anticipation of the use of Types, and in such cases only the Initialization has to wait for the first use of those Types.
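The "first active use" rule can be sketched as follows. Reading a compile-time constant is not an active use (the compiler inlines the value), whereas invoking a static method is. FirstUseDemo and Config are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class FirstUseDemo {
    static final List<String> events = new ArrayList<>();

    // Hypothetical Type with a compile-time constant and a static initializer.
    static class Config {
        static final int LIMIT = 10; // constant expression, inlined by the compiler

        static {
            events.add("Config initialized");
        }

        static int limit() {
            return LIMIT;
        }
    }

    static List<String> run() {
        events.clear();
        events.add("start");

        int fromConstant = Config.LIMIT;   // not an active use of Config
        events.add("constant=" + fromConstant);

        int fromMethod = Config.limit();   // active use: triggers initialization if not done yet
        events.add("method=" + fromMethod);
        return events;
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

In a fresh JVM the "Config initialized" entry appears only after "constant=10", that is, only when the static method is invoked.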

Any problem encountered during any of the three phases is captured by an instance of the LinkageError class or of an appropriate subclass of it. Most JVM implementations load Types much earlier and delay the Linking and Initialization phases until the first use of the Type. Many others delay only the Initialization until the first use and perform both Loading and Linking well before it. Irrespective of whether any or all of the phases are completed just before first use or earlier, any error encountered during any of the phases (captured by the appropriate subclass of the LinkageError class) must be thrown only when the Type's first use is encountered, not before. If a Type is never used actively, then the captured error is never thrown and the program proceeds normally.

Thus we see that the loading, linking, and initialization of a Type must give an impression that they are done only when the Type is actually used even if Loading and/or Linking phases have already been completed in the particular JVM implementation.
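Here is a small sketch of an error surfacing only at the point of use. Broken is a hypothetical Type whose static initialization always fails; nothing goes wrong until the program actually touches it.

```java
class InitErrorDemo {
    // Hypothetical Type whose static initialization always fails.
    static class Broken {
        static int value = compute();

        static int compute() {
            throw new IllegalStateException("boom");
        }
    }

    // Touches Broken and reports which LinkageError subclass (if any) was thrown.
    static String use() {
        try {
            return "value=" + Broken.value;
        } catch (LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    static String[] run() {
        // Up to this point Broken has not been used, so no error has been thrown.
        String first = use();   // first active use: initialization is attempted and fails
        String second = use();  // later uses find the Type in an erroneous state
        return new String[] {first, second};
    }

    public static void main(String[] args) {
        for (String outcome : run()) {
            System.out.println(outcome);
        }
    }
}
```

In a fresh JVM the first use reports ExceptionInInitializerError and every later use reports NoClassDefFoundError; both are subclasses of LinkageError.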

java.lang.LinkageError simply indicates that the class has encountered some problem during the Loading, Linking, or Initialization phase. The name is something of a misnomer, as it doesn't indicate only problems encountered during the Linking phase (this will become clear in the next paragraph).

One possible scenario in which java.lang.LinkageError (or a suitable subclass of it) may occur: suppose we have two classes and one is dependent on the other. If, after the dependent class has been compiled, the other class is changed in an incompatible way and recompiled on its own, then using the dependent class may result in a LinkageError while linking. Such a situation will actually throw an instance of the subclass named IncompatibleClassChangeError.

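java.lang.LinkageError has seven subclasses, each indicating a specific kind of error encountered during the Loading, Linking, or Initialization of a Type. These subclasses (the direct subclasses as of Java SE 6) are:

  • ClassCircularityError
  • ClassFormatError
  • ExceptionInInitializerError
  • IncompatibleClassChangeError
  • NoClassDefFoundError
  • UnsatisfiedLinkError
  • VerifyError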

As you can easily notice, LinkageError doesn't indicate only problems encountered in the Linking phase; instead, it has specific subclasses to capture the potential errors which may occur during any of the three phases - Loading, Linking, and Initialization.
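The relationship between LinkageError and its standard JDK subclasses can be checked with a short reflective sketch. The seven class names used here are the ones that ship with the JDK; LinkageHierarchyDemo itself is just an illustrative name.

```java
class LinkageHierarchyDemo {
    // Standard JDK error types that directly extend java.lang.LinkageError.
    static final Class<?>[] ERRORS = {
        ClassCircularityError.class,
        ClassFormatError.class,
        ExceptionInInitializerError.class,
        IncompatibleClassChangeError.class,
        NoClassDefFoundError.class,
        UnsatisfiedLinkError.class,
        VerifyError.class
    };

    static boolean allDirectSubclasses() {
        for (Class<?> error : ERRORS) {
            if (error.getSuperclass() != LinkageError.class) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (Class<?> error : ERRORS) {
            System.out.println(error.getName() + " extends " + error.getSuperclass().getName());
        }
    }
}
```

Because they all share this parent, a single catch (LinkageError e) block is enough to handle a failure from any of the three phases.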


1 comment:

prashant said...

just one word ..awesome.. without your questions i feel my knowledge of java was incompleate..