Introduction to XML Serialization in .NET


XML. We've all seen it in one form or another: XHTML displayed in a browser, simple database files, even the configuration files used to set up our .NET applications. XML is an interesting, and sometimes painful, markup technology. This article demonstrates how to use XML serialization to store our in-memory objects in XML files, and how to read such files back in to recreate those objects in memory. This is meant to be a beginner's tutorial.

Before we get into the actual code, let's first examine what serialization is. How about an example?

Let's say you are going to the grocery store. You have brought your grocery list with you. This list represents your Groceries object. Now, your Groceries object is scattered throughout the grocery store. We need a way for you to leave the store with your Groceries object. Obviously, you don't want to carry (or purchase, for that matter) all the groceries in the store. You are only interested in the groceries on your list. So you grab a cart and begin to walk the aisles, putting items from your list into the cart as you come across them. Once you have crossed off all the items from your list (i.e. all items are in your cart), you pay for them and take your groceries home.

For this example, Groceries represents your in-memory object. The list, for our purposes here, is equivalent to the object's members. The grocery store, if you haven't guessed, is our application's memory. The act of you walking the aisles, grabbing all the items from your list and placing them in your cart is serialization (and of course, the act of you forgetting your wallet and the bag boy having to restock those items is deserialization!).

Hopefully, this example gives you an idea of what we intend to do--combine all the parts of our object into one container. If it doesn't, perhaps this example will enlighten you.

Let's assume your application's memory is laid out like a freight train pulling box cars, in a straight line. Let's also say you and your family (Pa, Ma and Tiny Tim) are a group of nomads. Now you can run very fast (because you were on the high school track team before you became a nomad). You jump into the first box car and wait for your family. But Tiny Tim can't run that fast because he has stubby legs (but you love him, because he's kin). Ma is not going to leave Tim to fend for himself, so she stays with him and they catch one of the cars near the caboose. Now you and your family are all on the same train, but if we started at your car and walked the roofs back toward Ma and Tim, there would be other cars in between containing other nomads. This is how our objects are laid out in memory with .NET. To serialize an object, we must lay it out sequentially (remove the distance between you and Ma & Tim).

This is not to imply that we are actually shifting the members of our object around in memory. The idea of "laying out objects sequentially" might better describe Binary serialization, but as the term "serialization" implies putting things in some kind of sequential order, that implication is maintained here.

Serialization

Serialization turns out to be extremely easy in .NET. We do not have to actually write code to lay our object out sequentially; .NET handles this for us. The serialization/deserialization classes we need (not including our own classes) can be found under the System.Xml.Serialization namespace. As it turns out, for simple serialization, we only need the XmlSerializer class from this namespace. The XmlTextWriter and XmlTextReader classes used in the examples live in System.Xml, and Encoding lives in System.Text.

Here's a simple example passing "This is a test" for data and "test.xml" for filename:

C# ]
using System;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

static void SerializeIt(string data, string filename)
{
    // The serializer needs to know the type it will be writing.
    XmlSerializer serializer = new XmlSerializer(data.GetType());

    using (XmlTextWriter writer = new XmlTextWriter(filename, Encoding.UTF8))
    {
        try
        {
            serializer.Serialize(writer, data);
            Console.WriteLine("Serialization successful! Wrote {0}!", filename);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Failed to serialize object!!");
        }
    }
}


VB ]
Imports System.Text
Imports System.Xml
Imports System.Xml.Serialization

Sub SerializeIt(ByVal data As String, ByVal filename As String)
    ' The serializer needs to know the type it will be writing.
    Dim serializer As New XmlSerializer(GetType(String))

    Using writer As New XmlTextWriter(filename, Encoding.UTF8)
        Try
            serializer.Serialize(writer, data)
            Console.WriteLine("Serialization successful! Wrote {0}!", filename)
        Catch ex As Exception
            Console.WriteLine("Failed to serialize object!!")
        End Try
    End Using
End Sub


test.xml ]
<?xml version="1.0" encoding="utf-8"?><string>This is a test</string>

As you can see in our result, the XmlSerializer and XmlTextWriter objects take care of formatting our output as valid XML. Our XML declaration is already included for us and the data of the object we serialized is clearly visible. Now as I said, this example is simple and it doesn't accurately portray the idea of gathering an object into one container. It does, however, give you an idea of the classes involved in XML serialization via .NET.

With this example, the XmlSerializer object does the work of laying our object out sequentially. The XmlTextWriter does the job of outputting the object to the file test.xml. This is not to imply that XmlTextWriter is the only writer/stream object you can use: the writer is passed to the Serialize method, which is overloaded to accept a Stream, a TextWriter, or an XmlWriter. I leave it to you to experiment with different output streams.

When creating our XmlSerializer object, the XmlSerializer's constructor needs to be informed of the type of object we are going to be serializing--we'll have a bit of fun with that later. This type will basically correlate to the root node of our new XML document. As you can see in test.xml, we passed the XmlSerializer a type of string and it created a root node called string.
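To see both points at once (a different kind of writer, and the root node following the type), here is a minimal sketch that serializes to an in-memory StringWriter instead of a file. The class name StringWriterDemo and the helper ToXml are hypothetical names of my own, not part of the examples above.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical demo class; the names here are illustrative only.
public static class StringWriterDemo
{
    // Serializes an object to an XML string instead of a file.
    public static string ToXml(object data)
    {
        XmlSerializer serializer = new XmlSerializer(data.GetType());

        using (StringWriter writer = new StringWriter())
        {
            // Serialize also accepts a TextWriter, not just an XmlTextWriter.
            serializer.Serialize(writer, data);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // The root element is named after the type handed to the serializer.
        Console.WriteLine(ToXml("This is a test"));
        Console.WriteLine(ToXml(42));
    }
}
```

The first call should produce a string root node and the second an int root node, which is the type-to-root-node relationship described above.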

Deserialization

So we now have an XML document representing our object from memory. Great! What do we do with that? Well, there are plenty of things you can do with these new files--archive data from an execution of an application, store "session data", special-purpose configuration settings (maybe), display data on a web page, import data to a database, etc., etc. However, if you are going to use these objects again in your application, how in the heck do you get that data back into the app? Answer:  deserialize the file.

Deserialization is the opposite of serialization (no kidding?). With deserialization, you take the file which you wrote out at some previous time and restore it to memory as the type of object from which the data in the file came. For our example above, this means we would take test.xml and turn it back into a string object which has the value "This is a test". Please don't scroll past the next example--I know it's long, but it's so very important.

C# ]
static string DeserializeIt(string filename)
{
    XmlSerializer serializer = new XmlSerializer(typeof(string));
    string result = string.Empty;

    using (XmlTextReader reader = new XmlTextReader(filename))
    {
        try
        {
            // Deserialize returns object; ToString is enough for a string.
            result = serializer.Deserialize(reader).ToString();
            Console.WriteLine("Deserialization successful! Got string:  {0}", result);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Failed to deserialize object!!");
        }
    }

    return result;
}


VB ]
Function DeserializeIt(ByVal filename As String) As String
    Dim serializer As New XmlSerializer(GetType(String))
    Dim result As String = String.Empty

    Using reader As New XmlTextReader(filename)
        Try
            result = DirectCast(serializer.Deserialize(reader), String)
            Console.WriteLine("Deserialization successful! Got string:  {0}", result)
        Catch ex As Exception
            Console.WriteLine("Failed to deserialize object!!")
        End Try
    End Using

    Return result
End Function


Returns ]
This is a test

I apologize. I know that was a lot to read! You see that we really only changed the function called and captured its return value. You will notice that the object returned by Deserialize is of type object. Because of this, a cast will be required. For our C# example, since we merely serialized/deserialized a string, we can simply call the ToString method on the object returned from Deserialize; the VB example uses DirectCast. Had we completed this process using some other object, then a cast would be required to store the result into the appropriate type of variable.
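As a rough illustration of that last point, the sketch below round-trips a List<string> (our grocery list, if you like) and casts the result of Deserialize back to the proper type. It works in memory with StringWriter/StringReader to stay short; the CastDemo class and RoundTrip method are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical demo class; the names here are illustrative only.
public static class CastDemo
{
    public static List<string> RoundTrip(List<string> groceries)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(List<string>));
        string xml;

        // Serialize the list to an in-memory XML string.
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, groceries);
            xml = writer.ToString();
        }

        using (StringReader reader = new StringReader(xml))
        {
            // Deserialize returns object, so ToString will not do here:
            // we have to cast back to the type we serialized.
            return (List<string>)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        List<string> restored = RoundTrip(new List<string> { "milk", "eggs", "bread" });
        Console.WriteLine(string.Join(", ", restored.ToArray()));
    }
}
```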


More Fun
I mentioned above that we would have a bit of fun in playing with the types passed to the XmlSerializer's constructor. First, a bit of thought.

Our example above was pretty neat in that it would write out an object to an XML file for us in a couple of lines of code. However, what if we have several different types of objects which we want to serialize/deserialize? Do we really want to rewrite this code for each object type? I certainly don't. This is where Generics come in. Please note: this is not intended to be an article on Generics. If you do not understand or are not comfortable with generic classes/functions, please skip this section of the article.

Let's see if we can modify the original example using Generics to allow for some flexibility:

C# ]
static void SerializeIt<T>(T data, string filename)
{
    XmlSerializer serializer = new XmlSerializer(data.GetType());

    using (XmlTextWriter writer = new XmlTextWriter(filename, Encoding.UTF8))
    {
        try
        {
            serializer.Serialize(writer, data);
            Console.WriteLine("Serialization successful! Wrote {0}!", filename);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Failed to serialize object!!");
        }
    }
}

static T DeserializeIt<T>(string filename)
    where T: class
{
    XmlSerializer serializer = new XmlSerializer(typeof(T));
    T result = null;

    using (XmlTextReader reader = new XmlTextReader(filename))
    {
        try
        {
            // Deserialize returns object, so cast to the requested type.
            result = (T)serializer.Deserialize(reader);
            Console.WriteLine("Deserialization successful! Got object:  {0}", result);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Failed to deserialize object!!");
        }
    }

    return result;
}


VB ]
Sub SerializeIt(Of T)(ByVal data As T, ByVal filename As String)
    Dim serializer As New XmlSerializer(GetType(T))

    Using writer As New XmlTextWriter(filename, Encoding.UTF8)
        Try
            serializer.Serialize(writer, data)
            Console.WriteLine("Serialization successful! Wrote {0}!", filename)
        Catch ex As Exception
            Console.WriteLine("Failed to serialize object!!")
        End Try
    End Using
End Sub

Function DeserializeIt(Of T)(ByVal filename As String) As T
    Dim serializer As New XmlSerializer(GetType(T))
    Dim result As T = Nothing

    Using reader As New XmlTextReader(filename)
        Try
            ' Deserialize returns Object, so cast to the requested type.
            result = DirectCast(serializer.Deserialize(reader), T)
            Console.WriteLine("Deserialization successful! Got object:  {0}", result)
        Catch ex As Exception
            Console.WriteLine("Failed to deserialize object!!")
        End Try
    End Using

    Return result
End Function

As demonstrated above, we have opened up our serialization/deserialization functions to allow for multiple types to be passed into them. Now we have two functions to maintain as opposed to writing new functions for each object type. Again, because Deserialize returns its data as type object, we must perform a cast to assign to a variable of the proper type.
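Here is a short usage sketch showing the same pair of generic functions handling two different types. To keep it compact I have trimmed the try/catch blocks and console messages, and I write to a temporary file rather than test.xml; the GenericDemo class name is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical demo class; trimmed-down versions of the generic helpers.
public static class GenericDemo
{
    public static void SerializeIt<T>(T data, string filename)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (XmlTextWriter writer = new XmlTextWriter(filename, Encoding.UTF8))
        {
            serializer.Serialize(writer, data);
        }
    }

    public static T DeserializeIt<T>(string filename) where T : class
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (XmlTextReader reader = new XmlTextReader(filename))
        {
            return (T)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // T is inferred as string on the way out, and named explicitly on the way in.
            SerializeIt("This is a test", path);
            string text = DeserializeIt<string>(path);

            // The very same functions handle an int array.
            SerializeIt(new[] { 1, 2, 3 }, path);
            int[] numbers = DeserializeIt<int[]>(path);

            Console.WriteLine(text);
            Console.WriteLine(numbers.Length);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Note that the caller still has to name the type when deserializing, since nothing in the file name tells the compiler what T should be.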


Issues to be Aware of
  • You must keep track of the type you are deserializing a file to. There is no inherent mechanism to decipher what type of object an XML file contains. I plan to continue this article on a separate thread to go into details regarding implementing serialization in custom classes.
  • There are many different attributes (which will not be discussed here) one can apply to classes when dealing with XML serialization. Not every class can be serialized, either: XmlSerializer needs a public type with a public parameterless constructor, and it only looks at public fields and properties.
  • Again, this article is intended as an introduction to serialization/deserialization. It does not specifically address issues related to versioning, code security or custom classes.
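To illustrate the first issue in the list above, here is a small sketch of what happens when the type you ask for does not match what the XML contains. It assumes the mismatch surfaces as an InvalidOperationException (the same exception the earlier examples catch); WrongTypeDemo and CanReadAs are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical demo class; the names here are illustrative only.
public static class WrongTypeDemo
{
    // Returns true if the XML can be deserialized as T, false otherwise.
    public static bool CanReadAs<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));

        using (StringReader reader = new StringReader(xml))
        {
            try
            {
                serializer.Deserialize(reader);
                return true;
            }
            catch (InvalidOperationException)
            {
                // The root node did not match what the serializer expected.
                return false;
            }
        }
    }

    public static void Main()
    {
        string xml = "<string>This is a test</string>";

        Console.WriteLine(CanReadAs<string>(xml)); // the type we serialized
        Console.WriteLine(CanReadAs<int>(xml));    // a type we did not
    }
}
```

In other words, the serializer will tell you when you guessed wrong, but it will not tell you what the right type would have been.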

Notes
  • I used two different methods to get type information in the examples above. For C#, I used the GetType instance method available to all objects. For VB, I used the GetType operator. This was not to imply that retrieval of type information must be done in these specific manners. You could just as easily use the typeof operator in C# and the GetType instance method in VB.
  • For the last examples, the where clause appears only in the C# version and was included to allow us to give the return variable an initial value of null. As currently written, passing a value type to the DeserializeIt function in the C# version is not permitted. Declaring the value type as nullable (Nullable<ValueTypeHere> or ValueTypeHere?) does not get around this, because a nullable type is itself a value type and fails the class constraint. To support value types, drop the constraint and initialize the result with default(T) instead of null.
  • My methods are marked as static. This is only because I was testing code inside of a console application and it was easier for me to do it that way. Serialization does not rely on methods being static; instance methods, with whatever access level you like, work just as well.
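For the note on value types, one possible variation (a sketch, not the version used in the article) is to drop the class constraint and start from default(T). To keep it short, this version reads XML from a string rather than a file; ValueTypeDemo and DeserializeText are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical demo class; the names here are illustrative only.
public static class ValueTypeDemo
{
    // No "where T : class" constraint, so value types are allowed.
    public static T DeserializeText<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));

        // default(T) is null for reference types and zero/false for value types.
        T result = default(T);

        using (StringReader reader = new StringReader(xml))
        {
            try
            {
                result = (T)serializer.Deserialize(reader);
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("Failed to deserialize object!!");
            }
        }

        return result;
    }

    public static void Main()
    {
        Console.WriteLine(DeserializeText<int>("<int>42</int>"));
        Console.WriteLine(DeserializeText<string>("<string>This is a test</string>"));
    }
}
```

The trade-off is that a failed deserialization of a value type hands back its default (0 for int) rather than null, so the caller can no longer tell failure from a legitimately stored default by looking at the result alone.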
