Comparing Apache Avro vs Java Serialization

We start with the large difference in serialized size between Apache Avro and Java serialization, then compare how the two serialization processes work.


1. Introduction

In this test, Apache Avro needed roughly 15 to 20 times fewer bytes than Java to store the serialized data. I created a class with three fields (two String fields and one enum) and serialized an instance of it with both Avro and Java.

Avro produced 14 bytes, whereas Java produced 231 bytes (the length of the resulting byte[]), a ratio of about 16.5.
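The Java side of such a measurement can be sketched as follows. This is a minimal illustration, not the code behind the numbers above: the `Employee` class, its field names, and the `Level` enum are hypothetical stand-ins for "two String fields and one enum", and the byte count you get depends on class and field names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class JavaSizeDemo {
    // Hypothetical enum and class: two String fields and one enum field.
    enum Level { JUNIOR, SENIOR }

    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        final String firstName;
        final String lastName;
        final Level level;

        Employee(String firstName, String lastName, Level level) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.level = level;
        }
    }

    // Serializes the object with Java's default mechanism and returns the raw bytes.
    static byte[] javaSerialize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = javaSerialize(new Employee("John", "Doe", Level.SENIOR));
        // The length of the byte[] is the size being compared in this post.
        System.out.println("Java serialized size: " + data.length + " bytes");
    }
}
```

Most of the output is class metadata (class names, field names, type signatures) rather than the three field values themselves.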

2. Why Apache Avro generates fewer bytes for serialization

Let's look at both the Java and the Avro serialization processes to understand why Avro uses fewer bytes.

2.1. Java Serialization

The default serialization mechanism for an object writes the class of the object, the class signature, and the values of all non-transient and non-static fields. References to other objects (except in transient or static fields) cause those objects to be written also. Multiple references to a single object are encoded using a reference sharing mechanism so that graphs of objects can be restored to the same shape as when the original was written.
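The reference sharing mechanism can be observed with a small experiment. The sketch below is an illustration only (the `Tag` class and the helper names are made up for this example): it writes an array that holds the same object twice, reads it back, and checks that both slots still point at one instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SharedReferenceDemo {
    // Hypothetical class used only for this illustration.
    static class Tag implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;

        Tag(String label) {
            this.label = label;
        }
    }

    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Tag shared = new Tag("shared");

        // The same object appears twice in the graph.
        Tag[] copy = (Tag[]) deserialize(serialize(new Tag[] { shared, shared }));
        System.out.println("same instance after round trip: " + (copy[0] == copy[1]));

        // The second occurrence is written as a back-reference, not as a full object.
        int one = serialize(new Tag[] { shared }).length;
        int two = serialize(new Tag[] { shared, shared }).length;
        System.out.println("extra bytes for the second reference: " + (two - one));
    }
}
```

The second reference costs only a few bytes because the stream records a handle to the object already written instead of writing it again.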

2.2. Apache Avro

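Avro writes only the schema, as a JSON string, and the data of the object being serialized; the schema is stored once (for example, in the header of an Avro data file), not repeated for each field or object. Unlike Java, there is no per-field overhead of writing the class of the object or the class signature. The fields are also serialized in a predetermined order, the one declared in the schema, so the data needs no field names or markers.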
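To see why the data portion is so small, here is a hand-rolled sketch of the layout that the Avro specification describes for binary encoding: a string is a zigzag variable-length length prefix followed by UTF-8 bytes, an enum is the zero-based index of its symbol, and a record is simply its fields in schema order. This does not use the Avro library, and the class and method names are invented for the illustration, so treat it as an approximation of the format rather than as Avro's implementation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class AvroStyleEncodingSketch {
    // Zigzag encoding followed by a variable-length (7 bits per byte) layout,
    // as the Avro specification describes for int and long values.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // A string is its UTF-8 byte length, then the bytes themselves.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // A record is its fields back to back, in schema order, with no names or type tags.
    // The enum is written as the index of the symbol in the schema.
    static byte[] encodeRecord(String firstName, String lastName, int enumIndex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, firstName);
        writeString(out, lastName);
        writeLong(out, enumIndex);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = encodeRecord("John", "Doe", 0);
        // 1 + 4 bytes for "John", 1 + 3 bytes for "Doe", 1 byte for the enum index.
        System.out.println("record data size: " + data.length + " bytes"); // prints 10
    }
}
```

Because the reader is expected to have the schema, nothing in these bytes describes the class; the whole payload is field values plus one length byte per short string.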

You can find the full Java example code used to compare the two serialization processes on GitHub.

Some observations

Apache Avro can't handle circular references and throws java.lang.StackOverflowError, whereas Java's default serialization handles them. (example code for Avro and example code for Java serialization) Another observation is that Avro has no direct way of defining inheritance in the schema (classes), whereas Java's default serialization supports inheritance with its own constraints: every superclass up the hierarchy must either implement the Serializable interface or have an accessible no-args constructor. Otherwise the operation fails at runtime: java.io.NotSerializableException when a class in the object graph does not implement Serializable, and java.io.InvalidClassException when a non-serializable superclass lacks an accessible no-args constructor.
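Both Java-side observations can be checked with a short round trip. The sketch below is not the example code linked above; `Node`, `Base`, and `Child` are hypothetical classes made up for this illustration. It shows a two-node cycle surviving serialization, and a non-serializable superclass whose field is re-initialized by its no-args constructor instead of being restored from the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class ObservationsDemo {
    // Two nodes that point at each other form a circular reference.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    // Non-serializable superclass: its fields are not written to the stream,
    // and its no-args constructor runs again during deserialization.
    static class Base {
        String origin = "set by Base()";

        protected Base() {
        }
    }

    static class Child extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        String value;
    }

    // Serializes the object and immediately reads it back.
    static Object roundTrip(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // cycle: a -> b -> a

        Node copy = (Node) roundTrip(a);
        System.out.println("cycle preserved: " + (copy.next.next == copy));

        Child child = new Child();
        child.origin = "changed before serialization";
        child.value = "kept";

        Child restored = (Child) roundTrip(child);
        System.out.println("Child.value: " + restored.value);
        System.out.println("Base.origin: " + restored.origin);
    }
}
```

The superclass field comes back with the value assigned by `Base()`, which is why state held in a non-serializable parent is silently lost unless the subclass saves it itself.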



Tags: Apache AVRO, Apache AVRO serialization, Apache AVRO deserialization, Java Serialization, Java Deserialization, Apache Avro vs Java Serialization, Java
