Apache AVRO - Serialization and Deserialization

We will start with the formats Apache Avro supports and then walk through examples of serializing and deserializing data in each of them.

1. Apache Avro formats

"Apache Avro™ is a data serialization system." It supports two formats, JSON and binary. We use DatumWriter<T> to serialize data and DatumReader<T> to deserialize it.

2. Example

In the previous post, we generated Avro classes. Here we extend the same example code to show how Avro supports serialization and deserialization of data.

2.1. JSON Format

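// Imports needed by this snippet. Employee and SEX are the classes generated in the previous post.
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;

// The statements below belong in a method that handles or declares IOException.

// Point 1
Employee employee = Employee.newBuilder().setFirstName("Gaurav").setLastName("Mazra").setSex(SEX.MALE).build();

// Point 2
DatumWriter<Employee> employeeWriter = new SpecificDatumWriter<>(Employee.class);
byte[] data;
try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
    // Point 3
    Encoder jsonEncoder = EncoderFactory.get().jsonEncoder(Employee.getClassSchema(), baos);
    // Point 4
    employeeWriter.write(employee, jsonEncoder);
    // Point 5
    jsonEncoder.flush();
    data = baos.toByteArray();
}

// serialized data (decode explicitly as UTF-8 rather than with the platform default charset)
String json = new String(data, StandardCharsets.UTF_8);
System.out.println(json);

// Point 6
DatumReader<Employee> employeeReader = new SpecificDatumReader<>(Employee.class);

// Point 7
Decoder decoder = DecoderFactory.get().jsonDecoder(Employee.getClassSchema(), json);

// Point 8
employee = employeeReader.read(null, decoder);
// data after deserialization
System.out.println(employee);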

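Explanation of the points marked in the code:

Point 1: Build an Employee object with the builder that Avro generated for the class.
Point 2: Create a SpecificDatumWriter for Employee. A DatumWriter knows how to walk the object and hand its fields to an encoder.
Point 3: Create a JSON encoder from the Employee schema and the output stream it should write to.
Point 4: Write the employee through the encoder.
Point 5: Flush the encoder so that any buffered output reaches the stream before we read the bytes.
Point 6: Create a SpecificDatumReader for Employee, the counterpart of the writer.
Point 7: Create a JSON decoder from the same schema and the serialized JSON text.
Point 8: Read the data back into an Employee. Passing null as the first argument asks the reader to create a new object instead of reusing an existing one.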

2.2. Binary Format

// Point 1
Employee employee = Employee.newBuilder().setFirstName("Gaurav").setLastName("Mazra").setSex(SEX.MALE).build();

// Point 2
DatumWriter<Employee> employeeWriter = new SpecificDatumWriter<>(Employee.class);
byte[] data;
try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
    // Point 3 (null means "no existing encoder to reuse")
    Encoder binaryEncoder = EncoderFactory.get().binaryEncoder(baos, null);
    // Point 4
    employeeWriter.write(employee, binaryEncoder);
    // Point 5
    binaryEncoder.flush();
    data = baos.toByteArray();
}

// serialized data (printing the array itself would only show its identity, so print its contents)
System.out.println(java.util.Arrays.toString(data));

// Point 6
DatumReader<Employee> employeeReader = new SpecificDatumReader<>(Employee.class);
// Point 7 (null means "no existing decoder to reuse")
Decoder binaryDecoder = DecoderFactory.get().binaryDecoder(data, null);
// Point 8
employee = employeeReader.read(null, binaryDecoder);
// data after deserialization
System.out.println(employee);

This example is the same as the previous one except for Point 3 and Point 7, where we create a BinaryEncoder and a BinaryDecoder instead of the JSON encoder and decoder.
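To get a feel for what the binary encoder writes, here is a minimal sketch that reproduces the byte layout by hand with plain Java, without the Avro library. It follows the rules of the Avro specification as I understand them: int and long values use variable-length zig-zag coding, a string is its UTF-8 length followed by the bytes, an enum is the index of its symbol, and a record is simply its fields written one after another. The schema is an assumption: the sketch pretends Employee has exactly three fields (firstName, lastName, sex, in that order) and that MALE is the first enum symbol. The real schema from the previous post may differ, and the class name AvroBinarySketch and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hand-rolled illustration of the Avro binary layout; not a replacement for BinaryEncoder.
class AvroBinarySketch {

    // Writes a long with zig-zag coding followed by a variable-length encoding.
    static void writeLong(long n, ByteArrayOutputStream out) {
        // Zig-zag maps small magnitudes (positive or negative) to small unsigned values.
        long v = (n << 1) ^ (n >> 63);
        // Emit 7 bits per byte, lowest group first; the high bit means "more bytes follow".
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);
    }

    // Reads back a value written by writeLong.
    static long readLong(ByteBuffer in) {
        long v = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            v |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (v >>> 1) ^ -(v & 1); // undo the zig-zag mapping
    }

    // A string is its UTF-8 byte length (as a long) followed by the bytes.
    static void writeString(String s, ByteArrayOutputStream out) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(utf8.length, out);
        out.write(utf8, 0, utf8.length);
    }

    static String readString(ByteBuffer in) {
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Assumed field order: firstName, lastName, sex (enum index). No field names are written.
    static byte[] encodeEmployee(String firstName, String lastName, int sexIndex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(firstName, out);
        writeString(lastName, out);
        writeLong(sexIndex, out);
        return out.toByteArray();
    }

    // Decoding only works because the reader knows the same field order and types.
    static String decodeEmployee(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        String firstName = readString(in);
        String lastName = readString(in);
        long sexIndex = readLong(in);
        return firstName + " " + lastName + " " + sexIndex;
    }

    public static void main(String[] args) {
        byte[] data = encodeEmployee("Gaurav", "Mazra", 0);
        System.out.println(data.length);          // prints 14
        System.out.println(decodeEmployee(data)); // prints Gaurav Mazra 0
    }
}
```

The output contains no field names and no schema, which is why the binary form is compact and why the reader has to be given the schema (here, via Employee.class at Point 6) before it can make sense of the bytes.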

This is how we can serialize and deserialize data with Apache Avro. I hope you found this article informative and useful. You can find the full example on GitHub.



