
Why does an empty struct in C# consume memory


I always understood that structs (value types) occupy exactly as many bytes as their fields define. However, I ran some tests, and empty structs appear to be an exception:

using System;
using System.Threading;

public class EmptyStructTest
{
    static void Main(string[] args)
    {
        FindMemoryLoad<FooStruct>((id) => new FooStruct());
        FindMemoryLoad<Bar<FooStruct>>((id) => new Bar<FooStruct>(id));
        FindMemoryLoad<Bar<int>>((id) => new Bar<int>(id));
        FindMemoryLoad<int>((id) => id);
        Console.ReadLine();
    }

    private static void FindMemoryLoad<T>(Func<int, T> creator) where T : new()
    {
        GC.Collect(GC.MaxGeneration);
        GC.WaitForPendingFinalizers(); // WaitForFullGCComplete only applies to registered GC notifications
        Thread.MemoryBarrier();
        long start = GC.GetTotalMemory(true);

        T[] ids = new T[10000];
        for (int i = 0; i < ids.Length; ++i)
        {
            ids[i] = creator(i);
        }

        long end = GC.GetTotalMemory(true);
        GC.Collect(GC.MaxGeneration);
        GC.WaitForPendingFinalizers(); // WaitForFullGCComplete only applies to registered GC notifications
        Thread.MemoryBarrier();

        Console.WriteLine("{0} {1}", ((double)end-start) / 10000.0, ids.Length);
    }

    public struct FooStruct { }

    public struct Bar<T> where T : struct
    {
        public Bar(int id) { value = id; thing = default(T); }

        public int value;
        public T thing;
    }
}

If you run the program, you'll find that a FooStruct, which obviously has 0 bytes of data, consumes 1 byte of memory. This is a problem for me because I want Bar<FooStruct> to consume exactly 4 bytes (I'm going to allocate a lot of them).

Why does it behave this way, and is there a way to fix it (e.g. is there a special type that consumes 0 bytes)? I'm not looking for a redesign.


Solution

  • Summary: An empty struct in .NET consumes 1 byte. You can think of this as packing, since the unnamed byte is only accessible via unsafe code.

    More information: if you do all your pointer arithmetic according to values reported by .NET, things work out consistently.
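
A more direct way to see the values .NET reports, without measuring GC deltas, is to ask the runtime for the managed size of each type. The sketch below assumes a runtime where `System.Runtime.CompilerServices.Unsafe` is available (.NET Core 3.0 or later, or the NuGet package of the same name on older frameworks). `SizeReport` is a hypothetical helper name, and `Bar<T>` is copied from the question without its constructor.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct FooStruct { }

public struct Bar<T> where T : struct
{
    public int value;
    public T thing;
}

// Hypothetical helper, not part of the original question or answer.
public static class SizeReport
{
    // Managed size of T in bytes as laid out by the runtime, padding included.
    public static int SizeOf<T>() where T : struct => Unsafe.SizeOf<T>();

    public static void Main()
    {
        Console.WriteLine(SizeOf<FooStruct>());      // 1: the empty struct still occupies a byte
        Console.WriteLine(SizeOf<int>());            // 4
        Console.WriteLine(SizeOf<Bar<int>>());       // 8
        Console.WriteLine(SizeOf<Bar<FooStruct>>()); // 8: 4 (int) + 1 (FooStruct) + 3 (padding)
    }
}
```

The last line is the case the question cares about: the single byte of the empty field is rounded up to the struct's 4-byte alignment, so `Bar<FooStruct>` ends up the same size as `Bar<int>`.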

    The following example illustrates using adjacent 0-byte structures on the stack, but these observations obviously apply to arrays of 0-byte structures as well.

    struct z { };
    
    unsafe static void foo()
    {
        var z3 = default(z);
        bool _;
        long cb_pack, Δz, cb_raw;
        var z2 = default(z);    // (reversed since stack offsets are negative)
        var z1 = default(z);
        var z0 = default(z);
    
        // stack packing differs between x64 and x86
        cb_pack = (long)&z1 - (long)&z0; // --> 1 on x64, 4 on x86
    
        // pointer arithmetic should give packing in units of z-size
        Δz = &z1 - &z0; // --> 1 on x64, 4 on x86
    
        // if one asks for the value of such a 'z-size'...
        cb_raw = Marshal.SizeOf(typeof(z));     // --> 1
    
        // ...then the claim holds up:
        _ = cb_pack == Δz * cb_raw;     // --> true
    
        // so you cannot rely on special knowledge that cb_pack==0 or cb_raw==0
        _ = &z0 /* + 0 */ == &z1;   // --> false
        _ = &z0 /* + 0 + 0 */ == &z2;   // --> false
    
        // instead, the pointer arithmetic you meant was:
        _ = &z0 + cb_pack == &z1;   // --> true
        _ = &z0 + cb_pack + cb_pack == &z2; // --> true
    
        // array indexing also works using reported values
        _ = &(&z0)[Δz] == &z1;  // --> true
    
        // the default structure 'by-value' comparison asserts that
        // all z instances are (globally) equivalent...
        _ = EqualityComparer<z>.Default.Equals(z0, z1); // --> true
    
        // ...even when there are intervening non-z objects which
        // would prevent putative 'overlaying' of 0-sized structs:
        _ = EqualityComparer<z>.Default.Equals(z0, z3); // --> true
    
        // same result with boxing/unboxing
        _ = Object.Equals(z0, z3);  // -> true
    
        // this one is never true for boxed value types
        _ = Object.ReferenceEquals(z0, z0); // -> false
    }
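
The same observation can be checked for arrays without an unsafe context. This is a minimal sketch, assuming `System.Runtime.CompilerServices.Unsafe` is available; `Z` and `EmptyStructArrayStride` are hypothetical names. It measures the byte distance between two adjacent array elements.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Z { }

// Hypothetical helper, not part of the original answer.
public static class EmptyStructArrayStride
{
    // Byte distance between element 0 and element 1 of a T[].
    public static long Stride<T>() where T : struct
    {
        var items = new T[2];
        return (long)Unsafe.ByteOffset(ref items[0], ref items[1]);
    }

    public static void Main()
    {
        Console.WriteLine(Stride<Z>());   // 1: each empty element still takes one byte
        Console.WriteLine(Stride<int>()); // 4
    }
}
```

A stride of 1 for `Z[]` matches the roughly 1 byte per element that the question's GC-based measurement reports.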
    

    As I mentioned in a comment, @supercat got it right when he noted, "There probably wouldn't have been any problem with designing .NET to allow for zero-length structures from the beginning, but there could be some things that would break if it were to start doing so now."

    EDIT: If you need to programmatically distinguish value types that are logically 0 bytes (reported as 1 byte) from genuine 1-byte value types, you can use the following:

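    // requires: using System; using System.Linq; using System.Reflection;
    public static bool IsZeroSizeStruct(Type t)
    {
        // Same flags as the literal 0x34: Instance | Public | NonPublic
        const BindingFlags instanceFields =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        return t.IsValueType && !t.IsPrimitive &&
               t.GetFields(instanceFields).All(fi => IsZeroSizeStruct(fi.FieldType));
    }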
    

    Note that this correctly identifies arbitrarily nested structs where the total size would be zero.

    [StructLayout(LayoutKind.Sequential)]
    struct z { };
    [StructLayout(LayoutKind.Sequential)]
    struct zz { public z _z, __z, ___z; };
    [StructLayout(LayoutKind.Sequential)]
    struct zzz { private zz _zz; };
    [StructLayout(LayoutKind.Sequential)]
    struct zzzi { public zzz _zzz; int _i; };
    
    /// ...
    
    c = Marshal.SizeOf(typeof(z));      // 1
    c = Marshal.SizeOf(typeof(zz));     // 3
    c = Marshal.SizeOf(typeof(zzz));    // 3
    c = Marshal.SizeOf(typeof(zzzi));   // 8
    
    _ = IsZeroSizeStruct(typeof(z));    // true
    _ = IsZeroSizeStruct(typeof(zz));   // true 
    _ = IsZeroSizeStruct(typeof(zzz));  // true
    _ = IsZeroSizeStruct(typeof(zzzi)); // false
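
Applied to the types from the question, the same check can be sketched as a self-contained program. `ZeroSizeCheck` and `Wrapper` are hypothetical names introduced for illustration; the recursion is the one from the answer, with the binding flags spelled out.

```csharp
using System;
using System.Linq;
using System.Reflection;

public struct FooStruct { }

public struct Bar<T> where T : struct
{
    public int value;
    public T thing;
}

// Hypothetical type: contains only empty structs, so it is logically empty too.
public struct Wrapper
{
    public FooStruct a;
    public FooStruct b;
}

// Hypothetical container class for the answer's method.
public static class ZeroSizeCheck
{
    private const BindingFlags InstanceFields =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // True when t is a non-primitive value type whose instance fields
    // (recursively) carry no data at all.
    public static bool IsZeroSizeStruct(Type t)
    {
        return t.IsValueType && !t.IsPrimitive &&
               t.GetFields(InstanceFields).All(fi => IsZeroSizeStruct(fi.FieldType));
    }

    public static void Main()
    {
        Console.WriteLine(IsZeroSizeStruct(typeof(FooStruct)));      // True
        Console.WriteLine(IsZeroSizeStruct(typeof(Wrapper)));        // True
        Console.WriteLine(IsZeroSizeStruct(typeof(Bar<FooStruct>))); // False: has an int field
        Console.WriteLine(IsZeroSizeStruct(typeof(int)));            // False: primitive
    }
}
```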
    

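    [edit: see comment] What looks strange here is that, when nesting 0-byte structs, the single-byte minimum accumulates (i.e. into 3 bytes for 'zz' and 'zzz'), yet 'zzzi' reports only 8 bytes. That figure does not mean the chaff disappears once a "substantial" field is included: sequential layout gives 3 bytes for '_zzz', 1 byte of padding so that the int starts on a 4-byte boundary, and 4 bytes for '_i', which totals 8.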
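
One way to see where the bytes of 'zzzi' go is to ask the marshaler for the offset of the int field. This is a minimal sketch that mirrors the struct definitions above; `MarshalLayout` is a hypothetical class name, and '_i' is made public here only to keep the example simple.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct z { }
[StructLayout(LayoutKind.Sequential)]
public struct zz { public z _z, __z, ___z; }
[StructLayout(LayoutKind.Sequential)]
public struct zzz { public zz _zz; }
[StructLayout(LayoutKind.Sequential)]
public struct zzzi { public zzz _zzz; public int _i; }

// Hypothetical helper, not part of the original answer.
public static class MarshalLayout
{
    // Marshaled byte offset of a named field within a struct type.
    public static int OffsetOf(Type t, string field) => (int)Marshal.OffsetOf(t, field);

    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(zzz)));   // 3
        Console.WriteLine(OffsetOf(typeof(zzzi), "_i"));  // 4: 3 bytes of _zzz plus 1 byte of padding
        Console.WriteLine(Marshal.SizeOf(typeof(zzzi)));  // 8
    }
}
```

An offset of 4 for '_i' accounts for the full 8 bytes: the three single-byte minimums are still there, followed by one alignment byte and the int itself.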