Tuesday, January 28, 2014

Introduction to PduSerializer

PduSerializer is a fast, easy-to-use byte serializer/deserializer library that I worked on with a few friends a few months ago. Its purpose is to give you a simple way to define a specific byte protocol to be used between systems, without diving into implementation specifics.

Let's have a look at some sample code.
First, we need to define a message type:

    // Defines a new serializable message type
    [PduMessage]
    public struct Location
    {
        // Defines a message field in a custom order
        [Field(Position = 3)]
        public short Height { get; set; }
        [Field(Position = 1)]
        public double Lat { get; set; }
        [Field(Position = 2)]
        public double Lon { get; set; }
    }


The PduMessage attribute marks Location as a new message that can be (de)serialized by the PduSerializer engine. PduSerializer only handles fields and properties that are marked with the Field attribute, and it orders them according to the Position property. Here that means Lat is written first, then Lon, then Height, regardless of the order in which they are declared.
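To picture what that ordering means on the wire, here is a hand-written sketch of the same layout using the standard BinaryWriter. It is only an illustration: the class name LocationLayoutSketch is made up, and it assumes the fields are written back to back with no header or padding, in little-endian order (which is what BinaryWriter does). The actual bytes PduSerializer produces may differ.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of PduSerializer.
public static class LocationLayoutSketch
{
    // Writes the three fields in Position order: Lat (1), Lon (2), Height (3).
    public static byte[] Write(double lat, double lon, short height)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(lat);    // 8 bytes
            writer.Write(lon);    // 8 bytes
            writer.Write(height); // 2 bytes
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

Under these assumptions a Location takes 18 bytes, with Height in the last two. Writing this by hand for every message type is exactly the chore the attributes are meant to spare you.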

The next step is to initialize the SerializationEngine, which is in charge of executing the (de)serialization.

ISerializationEngine engine = PduSerializer.Configure()
                // Register the message type to the engine
                .AddType<Location>()
                // Seal the configuration store and return the serialization engine
                .CreateSerializationEngine(); 

PduSerializer has a sweet fluent configuration API that initializes the different components required for the serialization process and analyzes the registered message types. The configuration also comes with a convention-over-configuration registration method, so you can easily add new message types without writing additional code.

ISerializationEngine engine = PduSerializer.Configure()
                .AddTypesFromAssemblyOf<Location>()
                .CreateSerializationEngine();

The configuration phase does all the heavy lifting, so to keep the serialization itself fast, you should perform it only once, at system startup.
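One common way to guarantee the configuration runs only once is to keep the engine behind a Lazy<T>. The sketch below is not part of PduSerializer: SerializationHost and BuildEngine are hypothetical names, and BuildEngine returns a plain object as a stand-in for the PduSerializer.Configure()...CreateSerializationEngine() chain so the snippet compiles without the library.

```csharp
using System;

// Hypothetical holder for a process-wide engine instance.
public static class SerializationHost
{
    // Counts how often the expensive setup ran (for illustration only).
    public static int ConfigureCalls;

    // Stand-in for the real configuration chain; in a real application this
    // would return the ISerializationEngine built by PduSerializer.Configure().
    private static object BuildEngine()
    {
        ConfigureCalls++;
        return new object();
    }

    // Lazy<T> runs BuildEngine at most once, even with concurrent callers.
    private static readonly Lazy<object> engine = new Lazy<object>(BuildEngine);

    public static object Engine
    {
        get { return engine.Value; }
    }
}
```

Every caller that reads SerializationHost.Engine gets the same instance, and the setup cost is paid on first use only. If you would rather pay it at startup, touch the property once in your startup code.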

Now, let's have a look at how we actually perform the serialization.

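// An instance to serialize (sample values)
var location = new Location { Lat = 32.08, Lon = 34.78, Height = 120 };

// Define the stream the serialized bytes will be written to
var stream = new MemoryStream();

// Serialize the Location instance to the stream
engine.Serialize(location, stream);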

Easy, isn't it?

But what's wrong with the built-in .NET serializer?

The .NET Framework supplies two ways to accomplish serialization.
The first and easiest way is to mark a type with the SerializableAttribute and serialize it with a BinaryFormatter.

[Serializable]
public struct Location
{
    public short Height { get; set; }
    public double Lat { get; set; }
    public double Lon { get; set; }
}

var stream = new MemoryStream();
var formatter = new BinaryFormatter();

formatter.Serialize(stream, location);

The problem with this approach is that it uses reflection during serialization to figure out the structure of the serialized type. Moreover, the programmer has no control over the format of the serialized output. This approach is not suitable for defining a protocol, but it is good for persisting system state.

Another way to perform the task is by implementing the ISerializable interface.

[Serializable]
public struct Location : ISerializable
{
    public short Height { get; set; }
    public double Lat { get; set; }
    public double Lon { get; set; }

    // Special constructor (required by ISerializable) to control deserialization
    private Location(SerializationInfo info, StreamingContext context):this()
    {
        // Lat and Lon are doubles, so read them back with GetDouble
        Lat = info.GetDouble("Lat");
        Lon = info.GetDouble("Lon");
        Height = info.GetInt16("Height");
    }
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("Lat", Lat);
        info.AddValue("Lon", Lon);
        info.AddValue("Height", Height);
    }

}

This approach gives you better control over the formatting process; however, it makes you work damn hard for it. Defining a protocol that contains lots of types requires a lot of manual implementation work.

A Brief Summary of PduSerializer's Abilities

*Message hierarchy (a message containing a field of another message type)
*Bit (de)serialization
*Supports big/little endian
*Supports enums
*Supports padded strings
*Supports lists with a constant or a variable size
*Extendable by defining a custom serializer for a message type
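
For readers new to the big/little endian item, here is a small sketch of what the two byte orders look like for a 16-bit value such as Height. It shows the general concept only; EndianSketch is a made-up helper and says nothing about how PduSerializer implements or configures byte order.

```csharp
using System;

// Hypothetical helper illustrating byte order, not part of PduSerializer.
public static class EndianSketch
{
    // Big endian: most significant byte first.
    public static byte[] ToBigEndian(short value)
    {
        return new[] { (byte)((value >> 8) & 0xFF), (byte)(value & 0xFF) };
    }

    // Little endian: least significant byte first.
    public static byte[] ToLittleEndian(short value)
    {
        return new[] { (byte)(value & 0xFF), (byte)((value >> 8) & 0xFF) };
    }
}
```

For the value 0x0102, ToBigEndian gives the bytes 01 02 and ToLittleEndian gives 02 01. Both sides of a protocol have to agree on one of them, which is why it matters that the serializer lets you choose.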
