Recently I needed to write a struct to disk (not an unusual requirement). For simplicity I chose to use BinaryFormatter. This structure's in-memory size was quite large, on the order of 10 MB. I was very surprised to find that the serialised result was more than 3 times larger, and that serialisation took a very long time. I traced the problem to an array of structs and wrote a simple program to calculate the serialisation overhead.
[Serializable] struct MyInt { int value; }
I used MyInt to represent a user-defined struct that should be equivalent in size to System.Int32.
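using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Runtime.Serialization.Formatters.Binary;

class Program
{
    static void Main()
    {
        // Array lengths to test, each with int and with the MyInt struct declared above.
        int[] tests = { 0, 1, 1000 };
        foreach (int i in tests)
        {
            Test<int>(i);
            Test<MyInt>(i);
        }
    }

    // Serialises a T[] of the given length and prints "raw data size + overhead".
    static void Test<T>(int length)
    {
        T[] test = new T[length];
        BinaryFormatter bf = new BinaryFormatter();
        using (MemoryStream ms = new MemoryStream())
        {
            bf.Serialize(ms, test);
            // Raw data size: size of one element times the number of elements.
            int expected = Marshal.SizeOf(typeof(T)) * length;
            // ms.Position is the number of bytes written to the stream.
            Console.WriteLine("{0}[{1}] = {2} + {3}", typeof(T).Name, length, expected, ms.Position - expected);
        }
    }
}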
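This program is straightforward. We invoke the Test method for the types int and MyInt with lengths of 0, 1 and 1000. The test creates an array of the given type and length, uses a BinaryFormatter to serialise it to a MemoryStream, and calculates the raw data size as Marshal.SizeOf for the element type multiplied by the length. Marshal.SizeOf reports the unmanaged size of a type, which is 4 bytes for both int and MyInt. Finally the test outputs the raw data size and the serialisation overhead, the overhead being the number of bytes written to the stream minus the raw data size.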
I found the results surprising.
Int32[0] = 0 + 28
MyInt[0] = 0 + 113
Int32[1] = 4 + 28
MyInt[1] = 4 + 140
Int32[1000] = 4000 + 28
MyInt[1000] = 4000 + 9131
As you can see, the overhead for the Int32 array is constant at 28 bytes, while the overhead for the MyInt array grows with the length of the array. For such a small structure, the overhead is massive: at 1000 elements it is more than twice the size of the data itself.
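To put a number on that growth, the MyInt overhead figures can be differenced. This is only arithmetic on the output above; O(n) is shorthand I am introducing here for the MyInt overhead at array length n and does not appear in the program.

```latex
\begin{align*}
O(1) - O(0) &= 140 - 113 = 27 \text{ bytes for the first element} \\
\frac{O(1000) - O(1)}{1000 - 1} &= \frac{9131 - 140}{999} = \frac{8991}{999} = 9 \text{ bytes per additional element} \\
\frac{4000 + 9131}{4000} &= \frac{13131}{4000} \approx 3.28
\end{align*}
```

So on these three data points, each MyInt after the first costs about 13 bytes in the stream (4 of data plus 9 of overhead), and the serialised array is roughly 3.3 times its raw size. That is in line with the more-than-threefold growth that started this investigation. With only three array lengths measured, the constant per-element cost is an inference rather than something the test proves.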
What's really happening here? I don't know, but at a guess, the array of struct is being treated as an array of Object. Thus for each element, type information (maybe even object identity?) is being recorded.
