Sunday, 1 July 2012

Externalization in Java

Externalization is a form of serialization. Serialization is the process of saving an object's state to a storage medium (such as a file or a memory buffer) or transmitting it over a network connection in binary form. The object can be restored (deserialization) at a later time, and even at a different location. With persistence, we can move an object from one computer to another and have it maintain its state. A class becomes serializable by implementing either the Serializable interface or the Externalizable interface.

The default serialization mechanism works by implementing the Serializable interface. You don't have to do much here: just implement the Serializable marker interface in your class and call the writeObject and readObject methods of ObjectOutputStream and ObjectInputStream to serialize and deserialize. With such an easy way to serialize, why should you go for externalization?
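For reference, the default mechanism described above can be sketched roughly as follows. The class names `DefaultSerializationDemo` and `Person` and the helper methods are made up for illustration; only `Serializable`, `ObjectOutputStream.writeObject` and `ObjectInputStream.readObject` come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DefaultSerializationDemo {
    // Hypothetical example class. Serializable is a marker interface: no methods to write.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Writes the object into an in-memory buffer using the default mechanism.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the object from the bytes produced by serialize().
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person copy = (Person) deserialize(serialize(new Person("Asha", 30)));
        System.out.println(copy.name + " " + copy.age); // prints "Asha 30"
    }
}
```

A byte array is used here only to keep the sketch self-contained; a FileOutputStream or a socket stream could be wrapped in the same way.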

Why should you go for externalization?


Serializing by implementing the Serializable interface has some issues.

Apart from the fields that are required, serialization transforms a whole graph of Java objects into an array of bytes for storage or transmission. The graph starts from a single object and includes every object that can be reached from it by following instance variables. It also covers the state that the object, and each object reached through its instance variables, inherits from its serializable superclasses. Basically, every object the runtime can reach gets written, which leads to a lot of overhead. You can partially solve this by declaring the fields you don't want to serialize as 'transient', but this is not always feasible. Suppose you want to decide at runtime, based on some condition, which fields to leave out: the default serialization mechanism (using the Serializable interface) can't do that, and implementing the Externalizable interface will probably be a better solution.
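As a rough illustration of the object graph and of 'transient', here is a minimal sketch. The classes `Customer` and `Address` are hypothetical; the point is only that the reachable `Address` travels along with the `Customer`, while the transient field does not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectGraphDemo {
    // Hypothetical class reachable from Customer through an instance variable.
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        String city;

        Address(String city) {
            this.city = city;
        }
    }

    static class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Address address;           // reachable object: written together with the Customer
        transient String password; // transient: skipped by the default mechanism

        Customer(String name, Address address, String password) {
            this.name = name;
            this.address = address;
            this.password = password;
        }
    }

    // Serializes and immediately deserializes the customer in memory.
    static Customer roundTrip(Customer customer) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(customer);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Customer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Customer copy = roundTrip(new Customer("Ravi", new Address("Pune"), "secret"));
        System.out.println(copy.address.city + " " + copy.password); // prints "Pune null"
    }
}
```

Note that 'transient' is fixed at compile time, which is exactly the limitation mentioned above: the decision cannot depend on a runtime condition.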


If you don't want to store the state of superclasses (which the default mechanism saves automatically for every serializable superclass), it is better to use the Externalizable interface.

With the default serialization mechanism, the deserialization process requires lots of metadata to discover information about the serialized object, including the description of all the serializable superclasses, the description of the class itself, and the instance data associated with the specific instance. All of this information has to be added to the stream during serialization: lots of data and metadata, and again a performance issue. With the Externalizable interface, the stream carries only the identity of the class and the values that writeExternal chooses to write.
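The difference in stream size can be checked with a small experiment. This is only a sketch: `SerBox` and `ExtBox` are hypothetical classes with identical fields, and the exact byte counts depend on class and field names, so the code compares the two sizes instead of asserting specific numbers.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StreamSizeDemo {
    // Default mechanism: the stream also describes each field (name and type).
    static class SerBox implements Serializable {
        private static final long serialVersionUID = 1L;
        int width;
        int height;
        String label;

        SerBox(int width, int height, String label) {
            this.width = width;
            this.height = height;
            this.label = label;
        }
    }

    // Externalizable: the stream holds the class identity plus whatever we write.
    public static class ExtBox implements Externalizable {
        private static final long serialVersionUID = 1L;
        int width;
        int height;
        String label;

        public ExtBox() {
        }

        ExtBox(int width, int height, String label) {
            this.width = width;
            this.height = height;
            this.label = label;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(width);
            out.writeInt(height);
            out.writeUTF(label);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            width = in.readInt();
            height = in.readInt();
            label = in.readUTF();
        }
    }

    // Returns the number of bytes ObjectOutputStream produces for the object.
    static int serializedSize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Serializable:   " + serializedSize(new SerBox(2, 3, "crate")));
        System.out.println("Externalizable: " + serializedSize(new ExtBox(2, 3, "crate")));
    }
}
```

For a single small object the saving is modest; it tends to matter more when many objects of many different classes are written.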

You know that serialization needs a serialVersionUID for versioning serialized objects. If you don't explicitly declare a serialVersionUID, the serialization runtime computes one by hashing details of the class such as its name, interfaces, fields and methods. This is a time-consuming process.
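Declaring the field explicitly avoids that computation. A minimal sketch, where `Invoice` and the value 42 are arbitrary choices for illustration; `ObjectStreamClass.lookup` is the JDK call that reports the version the runtime will use.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {
    // Hypothetical class with an explicitly declared version.
    static class Invoice implements Serializable {
        private static final long serialVersionUID = 42L;
        String number;
    }

    // Asks the serialization runtime which serialVersionUID it associates with the class.
    static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(versionOf(Invoice.class)); // prints 42
    }
}
```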

With the default serialization mechanism, the deserialization process doesn't invoke any constructor of the serializable class, so during deserialization you can't rely on initialization that is done in the constructor. There is an alternative: put all the initialization logic in a separate method and call it both from the constructor and from a private readObject(ObjectInputStream) method that you add to the class (it is a special hook, not an override), so that the initialization happens whenever an object is created or deserialized. But it is definitely a messy approach. The serialization mechanism using the Externalizable interface invokes the public no-arg constructor during deserialization, so you don't need any separate code for it.
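The workaround described above might look like the following sketch. `Catalog`, its `index` map and the `init` method are hypothetical; the only JDK-defined pieces are the private `readObject(ObjectInputStream)` hook and `defaultReadObject()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReadObjectInitDemo {
    static class Catalog implements Serializable {
        private static final long serialVersionUID = 1L;
        List<String> items = new ArrayList<>();
        transient Map<String, Integer> index; // derived data, rebuilt instead of stored

        Catalog(List<String> items) {
            this.items.addAll(items);
            init(); // normal construction path
        }

        // Shared initialization: maps each item to its position in the list.
        private void init() {
            index = new HashMap<>();
            for (int i = 0; i < items.size(); i++) {
                index.put(items.get(i), i);
            }
        }

        // Hook found by the serialization runtime via reflection; the constructor is NOT run.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores the non-transient fields
            init();                 // deserialization path
        }
    }

    static Catalog roundTrip(Catalog catalog) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(catalog);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Catalog) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> items = new ArrayList<>();
        items.add("a");
        items.add("b");
        Catalog copy = roundTrip(new Catalog(items));
        System.out.println(copy.index.get("b")); // prints 1
    }
}
```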

Externalization


This is an alternative way of implementing the serialization mechanism. Instead of implementing the Serializable interface, you can implement Externalizable, which declares two methods:


public interface Externalizable
extends Serializable


writeExternal

public void writeExternal(ObjectOutput out)
                           throws IOException

The object implements the writeExternal method to save its contents by calling the methods of DataOutput for its primitive values or calling the writeObject method of ObjectOutput for objects, strings, and arrays.

readExternal

public void readExternal(ObjectInput in)
                  throws IOException,
                         ClassNotFoundException


The object implements the readExternal method to restore its contents by calling the methods of DataInput for primitive types and readObject for objects, strings and arrays. The readExternal method must read the values in the same sequence and with the same types as were written by writeExternal.

Just implement those methods to provide your own protocol. Unlike the default serialization mechanism, nothing is provided for free here; the protocol is entirely in your hands. Although it's the more difficult scenario, it's also the most controllable, which means you have complete control over what to serialize and what not to serialize. The default serialization mechanism using the Serializable interface, by contrast, transforms the entire graph of Java objects into an array of bytes for storage or transmission, including the state inherited from the serializable superclasses of the object and of the objects reached through its instance variables.
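A minimal sketch of such a hand-written protocol is shown below. `UserAccount` and its `exportPassword` switch are assumptions made for illustration; they show one possible way to decide at runtime whether a field is written, with a boolean marker in the stream so that readExternal knows what to expect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    public static class UserAccount implements Externalizable {
        private static final long serialVersionUID = 1L;
        String name;
        int loginCount;
        String password;
        boolean exportPassword; // hypothetical runtime switch

        // Required: the runtime calls this constructor before readExternal.
        public UserAccount() {
        }

        UserAccount(String name, int loginCount, String password, boolean exportPassword) {
            this.name = name;
            this.loginCount = loginCount;
            this.password = password;
            this.exportPassword = exportPassword;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeInt(loginCount);
            out.writeBoolean(exportPassword); // marker: does a password follow?
            if (exportPassword) {
                out.writeUTF(password);
            }
        }

        // Must read in exactly the order and with the types used by writeExternal.
        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();
            loginCount = in.readInt();
            exportPassword = in.readBoolean();
            password = exportPassword ? in.readUTF() : null;
        }
    }

    static UserAccount roundTrip(UserAccount account) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(account);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (UserAccount) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        UserAccount copy = roundTrip(new UserAccount("asha", 5, "secret", false));
        System.out.println(copy.name + " " + copy.password); // prints "asha null"
    }
}
```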

How does serialization work?


The serialization runtime (ObjectOutputStream) first checks for the Externalizable interface, and if the object supports it, serializes the object by calling its writeExternal method. If the object does not support Externalizable but implements Serializable, it is saved using the default mechanism of ObjectOutputStream. When an Externalizable object is reconstructed, an instance is first created using the public no-arg constructor, and then the readExternal method is called. If the object does not support Externalizable, Serializable objects are restored by the default mechanism of ObjectInputStream.
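The sequence of calls can be observed with a small tracing sketch. The classes `Plain` and `Custom` and the shared `events` list are hypothetical and exist only to record which methods the runtime invokes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class CallOrderDemo {
    static final List<String> events = new ArrayList<>();

    static class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 7;

        Plain() {
            events.add("Plain constructor");
        }
    }

    public static class Custom implements Externalizable {
        private static final long serialVersionUID = 1L;
        int value = 7;

        public Custom() {
            events.add("Custom constructor");
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            events.add("writeExternal");
            out.writeInt(value);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            events.add("readExternal");
            value = in.readInt();
        }
    }

    // Clears the log, then records what happens during one write and one read.
    // The caller's own "new ..." has already run, so its constructor call is not included.
    static List<String> trace(Object obj) throws IOException, ClassNotFoundException {
        events.clear();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(trace(new Custom())); // [writeExternal, Custom constructor, readExternal]
        System.out.println(trace(new Plain()));  // [] : no constructor of Plain runs on read
    }
}
```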

Please visit Externalization example in java for an example.

Drawbacks of Externalization


The default serialization mechanism using the Serializable interface implicitly takes care of the state of serializable superclasses. With externalization, the readExternal and writeExternal methods of the class must explicitly coordinate with the supertype to save its state. Please visit Limitations of externalization for an example.
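One common way to do that coordination, when the superclass is itself Externalizable, is to delegate to it explicitly. The sketch below uses the hypothetical classes `Vehicle` and `Truck`; if the calls to the super methods were left out, the `plate` field would silently come back empty.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class SuperclassStateDemo {
    public static class Vehicle implements Externalizable {
        private static final long serialVersionUID = 1L;
        String plate;

        public Vehicle() {
        }

        Vehicle(String plate) {
            this.plate = plate;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(plate);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            plate = in.readUTF();
        }
    }

    public static class Truck extends Vehicle {
        private static final long serialVersionUID = 1L;
        int payloadKg;

        public Truck() {
        }

        Truck(String plate, int payloadKg) {
            super(plate);
            this.payloadKg = payloadKg;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            super.writeExternal(out); // superclass state first
            out.writeInt(payloadKg);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            super.readExternal(in);   // same order as in writeExternal
            payloadKg = in.readInt();
        }
    }

    static Truck roundTrip(Truck truck) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(truck);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Truck) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Truck copy = roundTrip(new Truck("KA-01", 1200));
        System.out.println(copy.plate + " " + copy.payloadKg); // prints "KA-01 1200"
    }
}
```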

Whenever the class definition is modified, you have to explicitly update your readExternal and writeExternal methods to reflect those changes in the serialization process. With the default serialization mechanism using the Serializable interface, the serialization runtime takes care of it implicitly.

To externalize an object, its class needs a public no-arg constructor. The Externalizable interface can't be usefully implemented by (non-static) inner classes in Java, because an inner class constructor always takes the enclosing instance as a hidden parameter, so the class has no true no-arg public constructor. Hence inner classes can achieve object serialization only by implementing the Serializable interface.
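The inner class restriction can be demonstrated with a short sketch. `Inner`, `Nested` and `canRoundTrip` are hypothetical names; on the JDKs I am aware of, reading the inner class instance back fails with an InvalidClassException, while the static nested class works.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class InnerClassDemo {
    // Non-static inner class: the compiled constructor takes the enclosing
    // InnerClassDemo instance, so there is no real no-arg constructor.
    public class Inner implements Externalizable {
        private static final long serialVersionUID = 1L;

        public Inner() {
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(1);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            in.readInt();
        }
    }

    // Static nested class: has a genuine public no-arg constructor.
    public static class Nested implements Externalizable {
        private static final long serialVersionUID = 1L;

        public Nested() {
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(1);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            in.readInt();
        }
    }

    // Returns false when the runtime rejects the class during deserialization.
    static boolean canRoundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        InnerClassDemo outer = new InnerClassDemo();
        System.out.println("static nested: " + canRoundTrip(new Nested()));
        System.out.println("inner:         " + canRoundTrip(outer.new Inner()));
    }
}
```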
