Consider the following struct:
internal struct WriteBuffer64
{
private const int INLINE_BUFFER_SIZE = 64;
private unsafe fixed byte buffer[INLINE_BUFFER_SIZE];
private int used;
public unsafe Span<byte> GetWriteSpan(int toWrite)
{
if (used + toWrite > INLINE_BUFFER_SIZE)
throw new Exception();
int offset = used;
used += toWrite;
return MemoryMarshal.CreateSpan(ref buffer[offset], toWrite);
}
public unsafe ReadOnlySpan<byte> AsSpan()
{
return MemoryMarshal.CreateSpan(ref buffer[0], used);
}
}
I'm wondering whether this is safe to do. Instances of WriteBuffer64 may be located on the heap when boxed or when used as fields in class types. So I assume a native byte* pointer (pointing to the fixed buffer) may not be safe to use.
But what about Span<byte>? Can it treat this as a reference into a managed object? More generally, how does the magic with references to managed objects in Spans work?
To be more specific, here are two usages: first as a local variable, then as a field of a class:
byte[] utf8 = Encoding.UTF8.GetBytes("Hello World!");
WriteBuffer64 buffer = new WriteBuffer64();
BinaryPrimitives.WriteInt32BigEndian(buffer.GetWriteSpan(sizeof(int)), utf8.Length);
utf8.AsSpan().CopyTo(buffer.GetWriteSpan(utf8.Length));
using Stream file = File.OpenWrite("output.bin");
file.Write(buffer.AsSpan());
public class Encoder
{
private WriteBuffer64 buffer = new WriteBuffer64();
public void Encode()
{
byte[] utf8 = Encoding.UTF8.GetBytes("Hello World!");
BinaryPrimitives.WriteInt32BigEndian(buffer.GetWriteSpan(sizeof(int)), utf8.Length);
utf8.AsSpan().CopyTo(buffer.GetWriteSpan(utf8.Length));
}
public void WriteToFile()
{
using Stream file = File.OpenWrite("output.bin");
file.Write(buffer.AsSpan());
}
}
var encoder = new Encoder();
encoder.Encode();
encoder.WriteToFile();
If it is a public struct that can be freely instantiated and allocated on the stack, it won't be safe: a span returned from GetWriteSpan can outlive the stack frame that holds the struct. This example illustrates why.
void Main() {
var span = GetMeSpan();
var beforeMethodCall = string.Join(" ", span.ToArray());
Console.WriteLine("Just to use the stack");
var afterMethodCall = string.Join(" ", span.ToArray());
Console.WriteLine(beforeMethodCall);
Console.WriteLine(afterMethodCall);
}
Span<byte> GetMeSpan() {
var buffer = new WriteBuffer64();
return buffer.GetWriteSpan(42);
}
Output:
Just to use the stack
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13 70
64 203 27 0 0 0 0 0 10 159 84 231 254 7 0 0 160 170 74 66 1 0 0 0 0 0 0 0 0 0 0 0 192 0 206 65 1 0 0 0 13 70
However, if you make the struct a private nested type inside a class and use it only as a field of that class (similar to the Encoder class in your updated question), I think it's safe, because:
- The managed pointer that the Span holds will be updated if the class instance holding the struct is relocated on the heap during GC.
- The managed pointer will keep the object holding the struct from being GCed.
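The first point can be checked without any unsafe code. The following is only a minimal sketch: Holder and InteriorPointerSketch are hypothetical names that do not appear in the question, and a plain long field stands in for the fixed buffer. Note that GC.Collect() is not guaranteed to relocate the object; the check holds whether or not it does.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical holder type (not from the question): a heap object whose
// ordinary 8-byte field is exposed as a Span<byte>.
public sealed class Holder
{
    private long _storage;

    public long Storage => _storage;

    // The returned span stores a managed (interior) pointer to _storage
    // together with a length of 8 bytes.
    public Span<byte> AsBytes() =>
        MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref _storage, 1));

    // True if the span's first element is currently the first byte of _storage.
    public bool PointsAtStorage(Span<byte> span) =>
        Unsafe.AreSame(ref span[0], ref Unsafe.As<long, byte>(ref _storage));
}

public static class InteriorPointerSketch
{
    public static string Run()
    {
        var holder = new Holder();
        Span<byte> span = holder.AsBytes();

        // The GC may or may not move the object here. If it does, it also
        // updates the managed pointer stored inside the span.
        GC.Collect();

        // Writing through the span writes into the object's field.
        span.Fill(0x01);

        return $"{holder.PointsAtStorage(span)} {holder.Storage:X16}";
    }
}
```

Calling InteriorPointerSketch.Run() should return "True 0101010101010101": the span still refers to the field after the collection, and the bytes written through the span are visible through the object.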
EDIT:
This is corroborated by the docs for MemoryMarshal.CreateSpan<T>:
This method can be useful if part of a managed object represents a fixed array.
You might want to check the warning there too: the method validates neither the length argument nor the lifetime of the returned span.
Demo for the last two points (compile and run in Release mode):
public static unsafe void Main() {
// some allocations so GC moves our object in memory
for (int i = 0; i < 10_000; i++) {
new Tuple<int>(i);
}
const int INSTANCES_COUNT = 5;
const int INDEX_LIVE = 1;
var arr = new StructHolder[INSTANCES_COUNT];
var arrWeakReferences = new WeakReference[INSTANCES_COUNT];
for (int i = 0; i < INSTANCES_COUNT; i++) {
arr[i] = new StructHolder();
arrWeakReferences[i] = new WeakReference(arr[i]);
}
Console.WriteLine("Heap object holding the struct: " + *(IntPtr*)Unsafe.AsPointer(ref arr[INDEX_LIVE]));
// e.g. 5449393336
var span = arr[INDEX_LIVE].GetWriteSpan(8);
Console.WriteLine("Address of struct inside object: " + (IntPtr)Unsafe.AsPointer(ref span[0]));
// e.g. 5449393344
// 8 bytes larger on 64-bit: one pointer (the object's MethodTable pointer)
Console.WriteLine("----");
span.Fill(42);
Console.WriteLine(string.Join("-", span.ToArray()));
for (int i = 0; i < INSTANCES_COUNT; i++) {
arr[i] = null;
}
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
Console.WriteLine("----");
Console.WriteLine("After GC");
for (int i = 0; i < INSTANCES_COUNT; i++) {
Console.WriteLine($"{i} - Is GC Alive? - {arrWeakReferences[i].IsAlive}");
if (arrWeakReferences[i].IsAlive) {
// arr[i] is now a null reference...so we get the object this way
var structHolder = arrWeakReferences[i].Target;
Console.WriteLine("Heap object holding the struct: " + *(IntPtr*)Unsafe.AsPointer(ref structHolder));
// e.g. 5398115416 ->
// if different - tweak GC dummy code above or rerun
Console.WriteLine("Address of struct inside object: " + (IntPtr)Unsafe.AsPointer(ref span[0]));
// e.g. 5398115424 - again 8 bytes past the object address, but the object has moved
}
// last one could be kept alive if not in Release mode
}
Console.WriteLine("-----------");
Console.WriteLine(string.Join("-", span.ToArray()));
}
public class StructHolder {
private WriteBuffer64 _buffer = new();
public Span<byte> GetWriteSpan(int toWrite) => _buffer.GetWriteSpan(toWrite);
private struct WriteBuffer64 {
private const int INLINE_BUFFER_SIZE = 64;
private unsafe fixed byte buffer[INLINE_BUFFER_SIZE];
private int used;
public unsafe Span<byte> GetWriteSpan(int toWrite) {
if (used + toWrite > INLINE_BUFFER_SIZE)
throw new Exception("");
int offset = used;
used += toWrite;
return MemoryMarshal.CreateSpan(ref buffer[offset], toWrite);
}
}
}
Example output:
Heap object holding the struct: 5461591448
Address of struct inside object: 5461591456
----
42-42-42-42-42-42-42-42
----
After GC
0 - Is GC Alive? - False
1 - Is GC Alive? - True
Heap object holding the struct: 5398109720
Address of struct inside object: 5398109728
2 - Is GC Alive? - False
3 - Is GC Alive? - False
4 - Is GC Alive? - False
-----------
42-42-42-42-42-42-42-42