
Alternative to ReadOnlyMemory to access underlying index


Is there an alternative to ReadOnlyMemory<T> that allows access to its index into the underlying storage? (ReadOnlyMemory<T>._index is private).

I have a tokenizer that has sliced up a ReadOnlyMemory<char> source into tokens. Each token also has a ReadOnlyMemory<char> to represent a slice of the source. This works well, but I have a parser using the tokenizer that needs to create new slices of the original source that cross multiple tokens (e.g. from the start of one token to the end of a later token).

My workaround is for the tokens to reference a Range instead of a ReadOnlyMemory<char>, but this makes the tokenizer more complex for other clients and makes debugging the tokenizer harder. I'm considering creating my own alternative to ReadOnlyMemory<T> for this scenario, but there are several aspects that make it non-trivial.


Solution

  • ReadOnlyMemory<T> already exposes the internals via the various MemoryMarshal.TryGet* APIs. You need to handle 3 different scenarios separately:

    • arrays (TryGetArray)
    • strings (only applies to <char>, TryGetString)
    • custom memory implementations (TryGetMemoryManager)

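    One of those should return true for any valid memory chunk. Each also reports where the slice starts in the underlying storage (ArraySegment<T>.Offset for arrays, the start out parameter for strings and memory managers), which is the index you are asking about.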
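
As a minimal sketch of the three cases above, the following recovers a slice's offset and uses it to build a slice spanning two tokens. The names MemoryOffsets, TryGetOffset and Between are hypothetical, not part of any library, and the sketch assumes the source and both tokens are slices of the same backing string or array.

```csharp
using System;
using System.Buffers;
using System.Runtime.InteropServices;

static class MemoryOffsets
{
    // Hypothetical helper: reports the start offset of `memory` within its backing storage.
    public static bool TryGetOffset(ReadOnlyMemory<char> memory, out int offset)
    {
        // Check the string case first: this is what string.AsMemory() produces.
        if (MemoryMarshal.TryGetString(memory, out _, out int stringStart, out _))
        {
            offset = stringStart;
            return true;
        }
        if (MemoryMarshal.TryGetArray(memory, out ArraySegment<char> segment))
        {
            offset = segment.Offset;
            return true;
        }
        if (MemoryMarshal.TryGetMemoryManager<char, MemoryManager<char>>(
                memory, out _, out int managerStart, out _))
        {
            offset = managerStart;
            return true;
        }
        offset = 0;
        return false;
    }

    // Hypothetical helper: slice of `source` from the start of `first` to the end of `last`.
    // Assumption: all three arguments share the same backing storage (not verified here).
    public static ReadOnlyMemory<char> Between(
        ReadOnlyMemory<char> source, ReadOnlyMemory<char> first, ReadOnlyMemory<char> last)
    {
        if (!TryGetOffset(source, out int sourceOffset) ||
            !TryGetOffset(first, out int firstOffset) ||
            !TryGetOffset(last, out int lastOffset))
        {
            throw new ArgumentException("Unsupported memory backing.");
        }
        int start = firstOffset - sourceOffset;
        int end = lastOffset + last.Length - sourceOffset;
        return source.Slice(start, end - start);
    }
}
```

For example, with source = "let x = 42;".AsMemory(), calling Between(source, source.Slice(4, 1), source.Slice(8, 2)) yields the slice "x = 42". This keeps ReadOnlyMemory<char> on the tokens, so other tokenizer clients and the debugger view are unaffected.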

    However, if these tokens cross different memory chunks, what you might want is ReadOnlySequence<T>, which generalizes discontiguous buffers. It is basically either a single ReadOnlyMemory<T> (or similar), or a linked-list chain of ReadOnlySequenceSegment<T> nodes where each node holds a ReadOnlyMemory<T>. You need to build the chain yourself in the multiple-segment case, but that is straightforward.
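
Building the chain could look roughly like this. ChunkSegment<T> and SequenceBuilder are hypothetical names; the only library types used are ReadOnlySequenceSegment<T> and ReadOnlySequence<T> from System.Buffers.

```csharp
using System;
using System.Buffers;

// Hypothetical node type; ReadOnlySequenceSegment<T> is abstract and its setters are protected.
sealed class ChunkSegment<T> : ReadOnlySequenceSegment<T>
{
    public ChunkSegment(ReadOnlyMemory<T> memory) => Memory = memory;

    // Links a new node after this one and returns it.
    public ChunkSegment<T> Append(ReadOnlyMemory<T> memory)
    {
        var next = new ChunkSegment<T>(memory)
        {
            // RunningIndex is the total length of all preceding segments.
            RunningIndex = RunningIndex + Memory.Length
        };
        Next = next;
        return next;
    }
}

static class SequenceBuilder
{
    // Hypothetical helper: wraps one or more chunks in a ReadOnlySequence<char>.
    public static ReadOnlySequence<char> Build(params ReadOnlyMemory<char>[] chunks)
    {
        if (chunks.Length == 0) return ReadOnlySequence<char>.Empty;
        if (chunks.Length == 1) return new ReadOnlySequence<char>(chunks[0]);

        var first = new ChunkSegment<char>(chunks[0]);
        var last = first;
        for (int i = 1; i < chunks.Length; i++)
        {
            last = last.Append(chunks[i]);
        }
        // The end index is relative to the last segment.
        return new ReadOnlySequence<char>(first, 0, last, last.Memory.Length);
    }
}
```

The resulting sequence can then be sliced with ReadOnlySequence<T>.Slice, including across segment boundaries.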

    Also note that most code that works with ReadOnlySequence<T> should check IsSingleSegment, and optimize for the single-span case - since this is so frequent, and is usually much more efficient to work with.
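
A sketch of that pattern, using a hypothetical CountOccurrences helper as the example workload:

```csharp
using System;
using System.Buffers;

static class SequenceReading
{
    // Hypothetical helper: counts how often `value` occurs in the sequence.
    public static int CountOccurrences(in ReadOnlySequence<char> sequence, char value)
    {
        // Fast path: the whole sequence is one contiguous span.
        if (sequence.IsSingleSegment)
        {
            return Count(sequence.First.Span, value);
        }

        // General path: enumerate each ReadOnlyMemory<char> segment in turn.
        int total = 0;
        foreach (ReadOnlyMemory<char> segment in sequence)
        {
            total += Count(segment.Span, value);
        }
        return total;
    }

    private static int Count(ReadOnlySpan<char> span, char value)
    {
        int count = 0;
        foreach (char c in span)
        {
            if (c == value) count++;
        }
        return count;
    }
}
```

Both paths share the span-based Count, so the multi-segment case only adds the enumeration loop.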