I've got some very large memory dumps of a managed process. I'm trying to gather a lot of statistics from them and also present an interactive view of fairly deep object graphs on the heap. Think of something comparable to !do <address> with prefer_dml 1 set in WinDbg with SOS, where you can keep clicking on properties to see their values, only in a much friendlier UI for comparing many objects.
I've found Microsoft.Diagnostics.Runtime (ClrMD) to be particularly well suited for this task, but I'm having a hard time working with array fields and I'm a little confused about object fields, which I have working a little better.
Array:
If I target an array with an address directly off the heap and use ClrType.GetArrayLength and ClrType.GetArrayElementValue, things work fine. But once I'm digging through the fields on another object, I'm not sure what value I'm getting from ClrInstanceField.GetValue when the ClrInstanceField.ElementType is ClrElementType.SZArray (I haven't encountered Array while digging around in my object graph yet, but I would like to handle it as well).
Edit: I just decided to use the ClrType for System.UInt64 to dereference the array field (using the parent address + the offset of the array field to calculate the address where the array pointer is stored); then I can work with it the same as if I got it from EnumerateObjects. I am now having some difficulty with some arrays not supporting the ArrayComponentType property. I have yet to test with arrays of structs, so I am also wondering whether that will be a C-style allocation of inline structs, as it is with int[], or an array of pointers to structs on the heap. Guid[] is one of the types for which I can't get the ArrayComponentType.
Object: Fixed (logic error)
With a ClrInstanceField that has an ElementType of ClrElementType.Object I get much better results, but still need a little more. Firstly, after calling GetFieldValue I get back a ulong address(?) which I can use ClrInstanceField.Type.Fields against just fine, so I can see the field names and values of the nested object. That said, I have to account for polymorphism, so I tried using ClrHeap.GetObjectType on the same address, and it either returns null or something completely incorrect. It seems odd that the address would work in my first use case, but not the second.
String: Fixed (found workaround)
Because my real project already uses DbgEng with SOS, I have a different way to easily get the value of strings by address, but it seemed very odd that trying to use ClrInstanceField.GetFieldValue succeeded in returning a string, but with completely inaccurate results (a bunch of strange characters). Maybe I'm doing this wrong?
Edit: I have extracted an abstraction from my original code that now runs in LINQPad. It's a bit long to post here, but it's all here in a gist. It's still a little messy from all the copy/paste/refactor, and I'll be cleaning it up further and likely posting the final source on either CodePlex or GitHub after I've got these issues fixed.
The code base is fairly large and specific to a project, but if it's absolutely necessary I may be able to extract a sample set. That said, all access to the ClrMD objects is fairly simple. I get the initial addresses from SOS commands like !dumpheap -stat (which works fine for the root objects) and then I use ClrHeap.GetTypeByName or ClrHeap.GetObjectType. After that it relies exclusively on ClrType.Fields and the ClrInstanceField members Type, ElementType, and GetFieldValue.
As an added bonus, I did find a browser friendly version of the XML Docs provided with the NuGet package, though it's the same documentation IntelliSense provides.
It's going to be hard to answer very precisely without seeing what your code looks like, but basically, it goes like this:
The first thing you need to know in order to call GetFieldAddress/GetFieldValue is whether the object address you have is a regular pointer or an interior pointer. That is, whether it points directly to an object on the heap, or to an interior structure within an actual object (think of a String versus a struct field within an actual object).
If you're getting the wrong values out of GetFieldAddress/GetFieldValue, it usually means you're not specifying that you have an interior pointer (or you thought you had one when you didn't).
The second part is understanding what the values mean.
If field.IsPrimitive() is true, GetFieldValue() will get you the actual primitive value (e.g. an Int32, a Byte, or whatever).
If field.IsValueClass() is true, then GetFieldAddress() will get you an interior pointer to the structure. Any GetFieldAddress()/GetFieldValue() call you then make with that address must be told that it is an interior pointer!
If field.ElementType is ClrElementType.String, then I seem to remember that GetFieldValue() will get you the actual string contents (I'd need to check, but this should be it).
Otherwise, you have an object reference, in which case GetFieldValue() will get you a regular pointer to the new reference object.
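Put together, the rules above could be sketched roughly like this. This is only an illustration, not tested against your dumps: FieldWalker, DumpFields and the depth cap are names I made up, and I'm assuming the two-argument (address, interior) overloads of GetFieldValue/GetFieldAddress and the IsPrimitive()/IsValueClass() methods as they appear in the ClrMD build discussed here. Other releases may name these differently, so check against the version you have installed. Arrays are not handled specially.

```csharp
using System;
using Microsoft.Diagnostics.Runtime;

static class FieldWalker
{
    // Prints the instance fields of the object (or embedded struct) at 'address'.
    // Pass interior = false for a real heap object, and interior = true when
    // 'address' was produced by GetFieldAddress on a value-type field.
    public static void DumpFields(ClrHeap heap, ulong address, ClrType type, bool interior, int depth)
    {
        // Hypothetical depth cap so that cyclic graphs do not recurse forever.
        if (type == null || depth > 3)
            return;

        string indent = new string(' ', depth * 2);
        foreach (ClrInstanceField field in type.Fields)
        {
            if (field.IsPrimitive() || field.ElementType == ClrElementType.String)
            {
                // Primitive or string: the returned object is the value itself.
                object value = field.GetFieldValue(address, interior);
                Console.WriteLine("{0}{1} = {2}", indent, field.Name, value);
            }
            else if (field.IsValueClass())
            {
                // Embedded struct: its address points inside the parent object,
                // so everything read through it must use interior = true.
                ulong structAddr = field.GetFieldAddress(address, interior);
                Console.WriteLine("{0}{1} (struct)", indent, field.Name);
                DumpFields(heap, structAddr, field.Type, true, depth + 1);
            }
            else
            {
                // Object reference: the value is a regular pointer, 0 meaning null.
                object value = field.GetFieldValue(address, interior);
                ulong objRef = value is ulong ? (ulong)value : 0UL;
                if (objRef == 0)
                {
                    Console.WriteLine("{0}{1} = null", indent, field.Name);
                    continue;
                }

                // Ask the heap for the runtime type to account for polymorphism,
                // falling back to the declared field type if that lookup fails.
                ClrType actual = heap.GetObjectType(objRef) ?? field.Type;
                Console.WriteLine("{0}{1} -> 0x{2:X}", indent, field.Name, objRef);
                DumpFields(heap, objRef, actual, false, depth + 1);
            }
        }
    }
}
```

For a root object taken from the heap you would start with something like FieldWalker.DumpFields(heap, address, heap.GetObjectType(address), false, 0). If GetObjectType gives you nonsense for a nested address, the first thing I'd check is whether that address was read with the wrong interior flag.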
Does this make sense?