Serialization - 2020
Imagine a program that takes more than one session to complete. Next time, we don't want to start from the beginning; we want to resume from where it stopped. To do that, we need to save the state of our objects.
If Java didn't have serialization, we'd have to use one of the I/O classes to write out the state of the instance variables of every object we want to save. Worse, we could end up setting variables to the wrong values when reconstructing the objects.
Serialization lets us save objects together with their instance variables. We can also choose which variables to save by marking the ones to skip with the transient keyword.
Basic serialization is done with just two methods:
- ObjectOutputStream.writeObject()
Serializes objects and writes them to a stream.
- ObjectInputStream.readObject()
Reads the stream and deserializes objects.
The java.io.ObjectOutputStream and java.io.ObjectInputStream classes are considered to be higher-level classes in the package. So, we can wrap them around lower-level classes such as java.io.FileOutputStream and java.io.FileInputStream.
The following code is a bare-bones example which creates an object, serializes it, and deserializes it:
import java.io.*;

class WarCraft implements Serializable {}   // (1)

public class SerializeWarCraft {
    public static void main(String[] args) {
        WarCraft wc = new WarCraft();       // (2)
        try {
            FileOutputStream fos = new FileOutputStream("Testing Serial");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(wc);            // (3)
            oos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        try {
            FileInputStream fis = new FileInputStream("Testing Serial");
            ObjectInputStream ois = new ObjectInputStream(fis);
            wc = (WarCraft) ois.readObject();   // (4)
            ois.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Let's look at the code in detail.
- (1) We declare that the WarCraft class implements the Serializable interface. Serializable is a marker interface; it has no methods to implement.
- (2) We make a new WarCraft object, which is serializable.
- (3) We serialize the WarCraft object wc by invoking the writeObject() method. Before doing so, we had to do the following:
  - We had to put all of our I/O code in a try/catch block.
  - We had to create a FileOutputStream to write the object to.
  - We wrapped the FileOutputStream in an ObjectOutputStream, which is the class that has the serialization method we need.
  The writeObject() call then does two things:
  - It serializes the object.
  - It writes the serialized object to a file.
- (4) We deserialize the WarCraft object by invoking the readObject() method. The readObject() method returns an Object, so we have to cast the deserialized object back to a WarCraft.
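The example above closes its streams by hand, so an exception thrown by writeObject() or readObject() would skip the close() call. As a minimal sketch of the same round trip with try-with-resources, the following uses hypothetical names (Unit, RoundTripDemo, toBytes, fromBytes) that are not part of the original example, and it writes to an in-memory byte array instead of a file so that nothing is left on disk.

```java
import java.io.*;

// Hypothetical stand-in for WarCraft, with one field so there is state to check.
class Unit implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    Unit(String name) { this.name = name; }
}

class RoundTripDemo {
    // Serialize an object into a byte array.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } // oos is flushed and closed here, even if writeObject() throws
        return bytes.toByteArray();
    }

    // Deserialize a single object from a byte array.
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Unit original = new Unit("footman");
        Unit copy = (Unit) fromBytes(toBytes(original));
        System.out.println(copy.name);         // footman
        System.out.println(copy == original);  // false: the copy is a new object
    }
}
```

The second line printed is false because deserialization builds a new object with the same state; it does not hand back the original instance.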
An object graph is a view of an object system at a particular point in time. Objects are linked to each other when one object owns or contains another, or holds a reference to it. This web of objects is called an object graph, and it is a more abstract structure that can be used when discussing an application's state.
So, what does saving an object really mean?
If the instance variables are all primitive types, saving an object is pretty straightforward. But the instance variables may themselves be references to other objects. What should be saved then?
In Java, it wouldn't make any sense to save the actual value of a reference variable. That's because the value of a Java reference is meaningful only within the context of a single instance of a JVM. So, if we tried to restore the object in another instance of the JVM, even one running on the same computer on which the object was originally serialized, the reference would be useless.
But what about the object that the reference refers to?
Let's look at this class:
import java.io.*;

class Sword {
    private Grip theGrip;
    private int swordSize;

    public Sword(Grip grip, int size) {
        theGrip = grip;
        swordSize = size;
    }

    public Grip getGrip() { return theGrip; }
}

class Grip {
    private int gripSize;

    public Grip(int size) { gripSize = size; }

    public int getGripSize() { return gripSize; }
}

public class NormalSword {
    public static void main(String[] args) {
        Grip g = new Grip(10);
        Sword sw = new Sword(g, 100);
    }
}
Now what happens if we save the Sword? If the goal is to save and then restore a Sword, and the restored Sword is an exact duplicate of the Sword that was saved, then the Sword needs a Grip that is an exact duplicate of the Sword's Grip at the time the Sword was saved. That means both the Sword and the Grip should be saved.
And what if the Grip itself had references to other objects? This gets complicated quickly. Restoring object graphs by hand would be a formidable task for a programmer.
Fortunately, Java serialization takes care of all of this for us. When we serialize an object, Java serialization saves that object's entire object graph, so every object reachable from the one we save is copied and later restored along with it. If we serialize a Sword object, the Grip will be serialized automatically. And if the Grip class contained a reference to another object, that object would also be serialized, and so on. The only object we have to worry about saving and restoring is the Sword. The other objects required to fully reconstruct that Sword are saved and restored automatically through serialization.
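One consequence of saving the whole graph is that sharing inside the graph is preserved: if two objects written to the same stream refer to one common object, they still refer to one common object after deserialization. A small sketch of this, using hypothetical classes Blade and Hilt as stand-ins for Sword and Grip and assuming both implement Serializable:

```java
import java.io.*;

// Hypothetical stand-ins for Grip and Sword; both are Serializable here.
class Hilt implements Serializable {
    int size;
    Hilt(int size) { this.size = size; }
}

class Blade implements Serializable {
    Hilt hilt;
    Blade(Hilt hilt) { this.hilt = hilt; }
}

class GraphDemo {
    // Write both blades in one writeObject() call, then read them back.
    static Blade[] copyPair(Blade a, Blade b) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(new Blade[] { a, b });
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Blade[]) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Hilt shared = new Hilt(10);
        Blade[] restored = copyPair(new Blade(shared), new Blade(shared));
        System.out.println(restored[0].hilt.size);                 // 10
        System.out.println(restored[0].hilt == restored[1].hilt);  // true: still one shared Hilt
        System.out.println(restored[0].hilt == shared);            // false: a copy, not the original
    }
}
```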
Keep in mind, we should make a conscious choice to create objects that are serializable by implementing the Serializable interface. So, if we want to save Sword objects, for example, we'll have to modify the Sword class as follows:
import java.io.*;

class Sword implements Serializable {
    private Grip theGrip;
    private int swordSize;

    public Sword(Grip grip, int size) {
        theGrip = grip;
        swordSize = size;
    }

    public Grip getGrip() { return theGrip; }
}

class Grip {
    private int gripSize;

    public Grip(int size) { gripSize = size; }

    public int getGripSize() { return gripSize; }
}

public class SerializeSword {
    public static void main(String[] args) {
        Grip g = new Grip(10);
        Sword sw = new Sword(g, 100);
        try {
            FileOutputStream fos = new FileOutputStream("test.ser");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(sw);
            oos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
We're trying to save the Sword, but when we run the code we get a runtime exception.
java.io.NotSerializableException: Grip
We forgot that the Grip class must also implement Serializable. After making Grip serializable, the code looks like this:
import java.io.*;

class Sword implements Serializable {
    private Grip theGrip;
    private int swordSize;

    public Sword(Grip grip, int size) {
        theGrip = grip;
        swordSize = size;
    }

    public Grip getGrip() { return theGrip; }
}

class Grip implements Serializable {
    private int gripSize;

    public Grip(int size) { gripSize = size; }

    public int getGripSize() { return gripSize; }
}

public class SerializeSword {
    public static void main(String[] args) {
        Grip g = new Grip(10);
        Sword sw = new Sword(g, 100);
        System.out.println("before: grip size is " + sw.getGrip().getGripSize());
        try {
            FileOutputStream fos = new FileOutputStream("test.ser");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(sw);
            oos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("after: grip size is " + sw.getGrip().getGripSize());
    }
}
Output from the run is:
before: grip size is 10
after: grip size is 10
But what would happen if we didn't have access to the Grip class source code?
In other words, what if we couldn't make the Grip class serializable?
That's where the transient modifier comes in. If we mark (or tag) the Sword's Grip instance variable with transient, then serialization will skip the Grip.
In other words, transient says, "don't save this variable during serialization, just skip it."
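To see what "skip it" means in practice, here is a minimal sketch with a hypothetical Session class (not from the example above). After a round trip, the transient fields come back with their default values: null for references and 0 for an int.

```java
import java.io.*;

// Hypothetical class used only to illustrate transient.
class Session implements Serializable {
    String user;                 // saved
    transient String password;   // skipped during serialization
    transient int attempts;      // skipped during serialization

    Session(String user, String password, int attempts) {
        this.user = user;
        this.password = password;
        this.attempts = attempts;
    }
}

class TransientDemo {
    // Serialize to memory and read the object straight back.
    static Session roundTrip(Session s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(s);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Session) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session restored = roundTrip(new Session("ann", "secret", 3));
        System.out.println(restored.user);      // ann
        System.out.println(restored.password);  // null
        System.out.println(restored.attempts);  // 0
    }
}
```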
So, why would a variable not be serializable?
It could be that the class designer simply forgot to make the class implement Serializable. Or it might be because the object relies on runtime-specific information that simply can't be saved.
Although most things in the Java class libraries are serializable, we can't save things like network connections, threads, or file objects. They're all dependent on and specific to a particular run of the program. In other words, they're instantiated in a way that's unique to a particular run, on a particular platform, in a particular JVM. Once the program shuts down, there's no way to bring those things back to life in any meaningful way; they have to be created from scratch each time.
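A quick way to check whether an object can be serialized is simply to try it and catch NotSerializableException. The sketch below uses a hypothetical helper, trySerialize(), and compares a String with a Thread, which does not implement Serializable. The exact exception message may vary with JVM settings.

```java
import java.io.*;

class NotSerializableDemo {
    // Hypothetical helper: attempts to serialize obj into memory and reports the outcome.
    static String trySerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return "ok";
        } catch (NotSerializableException e) {
            // By default the message is the name of the offending class.
            return "not serializable: " + e.getMessage();
        } catch (IOException e) {
            return "I/O error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize("hello"));       // ok
        System.out.println(trySerialize(new Thread()));  // reports java.lang.Thread as not serializable
    }
}
```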
In the previous example, we have a Sword object we want to save. The Sword has a Grip, and the Grip has state that should also be saved as part of the Sword's state. But the Grip is not Serializable, so we must mark it transient. That means when the Sword is deserialized, it comes back with a null Grip. What can we do to make sure when the Sword is deserialized, it gets a new Grip that matches the one the Sword had when the Sword was saved?
For whatever reason, we can't serialize a Grip object, but we want to serialize a Sword. To do this, we're going to implement writeObject() and readObject() in the Sword class. By implementing these two private methods, we're telling the serialization mechanism: "Whenever a Sword object is written or read, use this code as part of the write and the read."
import java.io.*;

class Sword implements Serializable {
    transient private Grip theGrip;   // skipped by default serialization
    private int swordSize;

    public Sword(Grip grip, int size) {
        theGrip = grip;
        swordSize = size;
    }

    public Grip getGrip() { return theGrip; }

    private void writeObject(ObjectOutputStream os) {
        try {
            os.defaultWriteObject();             // normal serialization (swordSize)
            os.writeInt(theGrip.getGripSize());  // manually save the Grip's state
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private void readObject(ObjectInputStream is) {
        try {
            is.defaultReadObject();              // normal deserialization (swordSize)
            theGrip = new Grip(is.readInt());    // rebuild the Grip from the saved int
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class Grip {
    private int gripSize;

    public Grip(int size) { gripSize = size; }

    public int getGripSize() { return gripSize; }
}

public class SerializeSword {
    public static void main(String[] args) {
        Grip g = new Grip(10);
        Sword sw = new Sword(g, 100);
        System.out.println("saved: grip size is " + sw.getGrip().getGripSize());
        try {
            FileOutputStream fos = new FileOutputStream("test.ser");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(sw);
            oos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Read the Sword back so that the next line reports the restored object.
        try {
            FileInputStream fis = new FileInputStream("test.ser");
            ObjectInputStream ois = new ObjectInputStream(fis);
            sw = (Sword) ois.readObject();
            ois.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("restored: grip size is " + sw.getGrip().getGripSize());
    }
}
Output is:
saved: grip size is 10
restored: grip size is 10
Let's look at the code in detail:
- writeObject() can throw exceptions.
- When we invoke defaultWriteObject() from within writeObject(), we're telling the JVM to do the normal serialization process for this object. When implementing writeObject(), we'll typically request the normal serialization process and then do some custom writing as well, with matching custom reading in readObject().
- In this case, we write an extra int to the stream that's creating the serialized Sword.
- When it's time to deserialize, defaultReadObject() handles the normal deserialization we'd get if we didn't implement a readObject() method.
- We build a new Grip object for the Sword using the grip size that we manually serialized. We had to invoke readInt() after we invoked defaultReadObject(), because the extra int was written after the default data.
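The version above catches and prints exceptions inside writeObject() and readObject(), which means a failed read could silently leave the Sword with a null Grip. A common alternative, sketched here with hypothetical classes Saber and Handle standing in for Sword and Grip, is to declare the exceptions and let them propagate to the caller of ObjectOutputStream.writeObject() or ObjectInputStream.readObject().

```java
import java.io.*;

// Not Serializable, and assumed to be outside our control (stand-in for Grip).
class Handle {
    private final int size;
    Handle(int size) { this.size = size; }
    int getSize() { return size; }
}

// Stand-in for Sword.
class Saber implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient Handle handle;  // rebuilt by hand in readObject()
    private int length;

    Saber(Handle handle, int length) {
        this.handle = handle;
        this.length = length;
    }

    Handle getHandle() { return handle; }
    int getLength() { return length; }

    // Called by the serialization mechanism; failures propagate to the caller.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();        // writes length
        out.writeInt(handle.getSize());  // extra data: the Handle's state
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();             // restores length
        handle = new Handle(in.readInt());  // same order as in writeObject()
    }
}

class CustomSerializationDemo {
    static Saber roundTrip(Saber s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(s);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Saber) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Saber restored = roundTrip(new Saber(new Handle(10), 100));
        System.out.println(restored.getHandle().getSize());  // 10
        System.out.println(restored.getLength());            // 100
    }
}
```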
To summarize, writing serialized objects to a file takes four steps:
- Make a FileOutputStream
FileOutputStream fs = new FileOutputStream("mine.ser");
We make a FileOutputStream object.
FileOutputStream knows how to connect to (and create) a file.
- Make an ObjectOutputStream
ObjectOutputStream os = new ObjectOutputStream(fs);
ObjectOutputStream lets us write objects, but it can't directly connect to a file. It needs to be fed a "helper". This is called "chaining" one stream to another.
- Write the objects
os.writeObject(ref1);
os.writeObject(ref2);
os.writeObject(ref3);
Serializes the objects referenced by ref1, ref2, and ref3, and writes them to the file "mine.ser".
- Close the ObjectOutputStream
os.close();
Closing the stream at the top closes the ones underneath, so the FileOutputStream (and the file) will close automatically.
Reading the objects back (deserialization) takes five steps:
- Make a FileInputStream
FileInputStream fs = new FileInputStream("mine.ser");
We make a FileInputStream object.
FileInputStream knows how to connect to an existing file.
If the file "mine.ser" doesn't exist, we'll get an exception.
- Make an ObjectInputStream
ObjectInputStream os = new ObjectInputStream(fs);
ObjectInputStream lets us read objects, but it can't directly connect to a file. It needs to be chained to a connection stream, in this case a FileInputStream.
- Read the objects
Object ref1 = os.readObject();
Object ref2 = os.readObject();
Object ref3 = os.readObject();
Each time we call readObject(), we get the next object in the stream. So, we read them back in the same order in which they were written.
- Cast the objects
MyClass v1 = (MyClass) ref1;
MyClass v2 = (MyClass) ref2;
MyClass v3 = (MyClass) ref3;
The return value of readObject() is type Object, so we have to cast it back to the type we know it really is.
- Close the ObjectInputStream
os.close();
Closing the stream at the top closes the ones underneath, so the FileInputStream (and the file) will close automatically.
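The write and read steps can be put together in one runnable sketch. The class OrderDemo and the helper writeThenRead() are hypothetical names, and the sketch uses a temporary file in place of "mine.ser" so that it cleans up after itself. It writes three objects and reads them back in the same order.

```java
import java.io.*;

class OrderDemo {
    // Writes the items to the file one by one, then reads them back in the same order.
    static Object[] writeThenRead(File file, Object... items)
            throws IOException, ClassNotFoundException {
        try (ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream(file))) {
            for (Object item : items) {
                os.writeObject(item);
            }
        } // closing os also closes the underlying FileOutputStream

        Object[] result = new Object[items.length];
        try (ObjectInputStream is = new ObjectInputStream(new FileInputStream(file))) {
            for (int i = 0; i < result.length; i++) {
                result[i] = is.readObject();  // next object in the stream
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("mine", ".ser");
        try {
            Object[] back = writeThenRead(file, "first", 2, 3.0);
            // readObject() returns Object, so cast to the types we know we wrote.
            String s = (String) back[0];
            Integer n = (Integer) back[1];
            Double d = (Double) back[2];
            System.out.println(s + " " + n + " " + d);  // first 2 3.0
        } finally {
            file.delete();
        }
    }
}
```

Reading more objects than were written makes readObject() throw an EOFException, so the reader needs to know how many objects to expect, or the writer can store a count first.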