August 16, 2006
03:48 PM

Recently I needed to write a struct to disk (not an unusual requirement). For simplicity I chose to use BinaryFormatter. This structure's in-memory size was quite large, on the order of 10 MB. I was very surprised to find that the serialised result was more than 3 times larger, and that serialisation took a very long time. I traced the problem to an array of structs and wrote a simple program to calculate the serialisation overhead.

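using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
struct MyInt
{
    int value;
}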

I used MyInt to represent a user defined struct that should be equivalent in size to System.Int32.

static void Main()
{
    int[] tests = { 0, 1, 1000 };

    foreach (int i in tests)
    {
        Test<int>(i);
        Test<MyInt>(i);
    }
}

static void Test<T>(int length)
{
    T[] test = new T[length];

    BinaryFormatter bf = new BinaryFormatter();
    using (MemoryStream ms = new MemoryStream())
    {
        bf.Serialize(ms, test);

        int expected = Marshal.SizeOf(typeof(T)) * length;

        Console.WriteLine("{0}[{1}] = {2} + {3}",
            typeof(T).Name,
            length,
            expected,
            ms.Position - expected);
    }
}

This program is straightforward. We invoke the Test method for the types int and MyInt with lengths of 0, 1 and 1000. Each test creates an array of the given type and length, serialises it to a MemoryStream with a BinaryFormatter, computes the expected data size using Marshal.SizeOf (strictly the marshalled, unmanaged size of one element, which is 4 bytes for both types here), and prints that size followed by the serialisation overhead.

I found the results surprising.

Int32[0] = 0 + 28
MyInt[0] = 0 + 113
Int32[1] = 4 + 28
MyInt[1] = 4 + 140
Int32[1000] = 4000 + 28
MyInt[1000] = 4000 + 9131

As you can see, the overhead for the Int32 array is constant at 28 bytes, while the overhead for the MyInt array grows with the length of the array. For such a small structure, the overhead is massive: at 1000 elements it is more than twice the size of the data itself.
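To put a number on that growth, here is a small sketch that works only from the three MyInt measurements above. The class and method names (OverheadFit and friends) are made up for this example, and with only three data points it shows that the figures are consistent with a fixed cost per element, not that the cost is fixed for every length.

```csharp
using System;

static class OverheadFit
{
    // Overheads in bytes for MyInt[0], MyInt[1] and MyInt[1000],
    // copied from the output above.
    const int Overhead0 = 113;
    const int Overhead1 = 140;
    const int Overhead1000 = 9131;

    // Extra overhead caused by adding the first element.
    public static int FirstElementCost()
    {
        return Overhead1 - Overhead0;
    }

    // Average extra overhead for each of the elements 2..1000.
    public static int PerExtraElementCost()
    {
        return (Overhead1000 - Overhead1) / (1000 - 1);
    }

    // Non-zero would mean the growth is not a whole number of bytes per element.
    public static int Remainder()
    {
        return (Overhead1000 - Overhead1) % (1000 - 1);
    }

    static void Main()
    {
        Console.WriteLine("first element: {0} bytes", FirstElementCost());
        Console.WriteLine("each further element: {0} bytes (remainder {1})",
            PerExtraElementCost(), Remainder());
    }
}
```

This prints 27 bytes for the first element and exactly 9 bytes, with no remainder, for each further one. So every 4-byte MyInt after the first appears to take 13 bytes in the stream. A constant 9 bytes would fit a small fixed-size record per element (say a one-byte tag plus two 32-bit ids), but that breakdown is an assumption on my part and not something these measurements prove.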

What's really happening here? I don't know, but my guess is that the array of structs is being treated as an array of Object, so that type information (maybe even object identity?) is recorded for every element.
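For contrast, here is a rough sketch of what a layout with no per-element metadata costs. It is only a baseline for comparison, not a drop-in replacement for BinaryFormatter: RawBaseline and RawSize are names invented for this example, and it writes plain int values rather than MyInt.

```csharp
using System;
using System.IO;

static class RawBaseline
{
    // Writes a 4-byte element count followed by 4 bytes per value,
    // and returns the total number of bytes written.
    public static long RawSize(int[] values)
    {
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(ms))
        {
            writer.Write(values.Length);
            foreach (int v in values)
            {
                writer.Write(v);
            }
            writer.Flush();
            return ms.Length;
        }
    }

    static void Main()
    {
        int[] tests = { 0, 1, 1000 };
        foreach (int length in tests)
        {
            long size = RawSize(new int[length]);
            // Same "expected + overhead" format as the test program above.
            Console.WriteLine("raw[{0}] = {1} + {2}",
                length, length * 4, size - length * 4);
        }
    }
}
```

Here the overhead is a constant 4 bytes (the length prefix) at every size, which is the same shape as the Int32 result. If the guess above is right, the MyInt array is paying for something extra on every element that neither this layout nor the Int32 array needs.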

© Douglas Stockwell 2007
Unless otherwise specified, all "source code" examples are available for use under the Creative Commons Attribution-Noncommercial 3.0 License. Please contact me if you would like more flexible licensing terms.