ILWeaving help - [ValidSystemPath] attribute


Problem

I'm using Mono.Cecil to IL-weave string property getters that carry my custom [ValidSystemPath] attribute. The attribute is meant to ensure the property only ever returns characters that are valid in file names and paths. The problem is that weaving completes without raising any exceptions, yet the woven code does not work. I'm new to weaving and IL, so I'd benefit from a guiding hand.

Code before weaving (C#)

private string path = "test|.txt";
[ValidSystemPath]
public string Path => path;

Expected code after weaving (C#)

This is effectively my source of inspiration for the code I'm trying to weave in... https://stackoverflow.com/a/23182807/1995360

public string Path {
    get {
        string ReplaceIllegal(string p)
        {
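            // Fully qualified, because inside this class the simple name Path binds to the string property.
            char[] invalid = System.IO.Path.GetInvalidFileNameChars().Concat(System.IO.Path.GetInvalidPathChars()).ToArray();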
            return string.Join("_", p.Split(invalid));
        }
        
        return ReplaceIllegal(path);
    }
}

I'd like to use a nested method because a getter that contains conditionals can have multiple return statements, so I need a single, simple call that can be inserted before each one.
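For reference, the intended behaviour can be checked outside the weaver with a small standalone sketch. The class name PathSanitizer is hypothetical and not part of the original code. Note that the set of invalid characters is platform dependent: '|' is reported as invalid on Windows but not on Unix-like systems, so the "test|.txt" example is only rewritten on Windows, whereas '/' is reported on both.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper class, used only to exercise the logic the weaver should inject.
public static class PathSanitizer
{
    public static string ReplaceIllegal(string p)
    {
        // Union of the characters the current platform rejects in file names and in paths.
        char[] invalid = Path.GetInvalidFileNameChars()
            .Concat(Path.GetInvalidPathChars())
            .ToArray();

        // Split on every invalid character and join the pieces back together with '_'.
        return string.Join("_", p.Split(invalid));
    }
}
```

Calling PathSanitizer.ReplaceIllegal("a/b") gives "a_b", which makes a convenient platform-independent check before comparing against the woven getter.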

Weaver code (C#)

private static void ValidSystemPath(ModuleDefinition module, TypeDefinition type, PropertyDefinition property)
{
    // Getter method - site of injection
    MethodDefinition getter = property.GetMethod;
    ILProcessor getterProcessor = getter.Body.GetILProcessor();

    // Import the methods
    MethodReference joinMethod = module.ImportReference(typeof(string).GetMethod("Join", new Type[] { typeof(string), typeof(string[]) }));
    MethodReference splitMethod = module.ImportReference(typeof(string).GetMethod("Split", new Type[] { typeof(char[]) }));
    MethodReference getInvalidPathCharsMethod = module.ImportReference(typeof(Path).GetMethod("GetInvalidPathChars", new Type[] { }));
    MethodReference getInvalidFileNameCharsMethod = module.ImportReference(typeof(Path).GetMethod("GetInvalidFileNameChars", new Type[] { }));
    MethodReference concatMethod = module.ImportReference(typeof(Enumerable).GetMethod("Concat"));
    MethodReference toArrayMethod = module.ImportReference(typeof(Enumerable).GetMethod("ToArray"));
    //MethodReference toArrayMethod = module.ImportReference(typeof(Enumerable).GetMethodExt("ToArray", new Type[] { typeof(IEnumerable<char>) }));

    // Create new nested method in getter
    MethodDefinition nested = new(
        $"<{getter.Name}>g__ReplaceIllegalChars|2_0",
        Mono.Cecil.MethodAttributes.Assembly | Mono.Cecil.MethodAttributes.HideBySig | Mono.Cecil.MethodAttributes.Static,
        module.TypeSystem.String
    );
    type.Methods.Add(nested);

    // Write instructions for method
    ILProcessor nestedProcessor = nested.Body.GetILProcessor();
    nestedProcessor.Emit(OpCodes.Nop);
    nestedProcessor.Emit(OpCodes.Call, getInvalidFileNameCharsMethod);
    nestedProcessor.Emit(OpCodes.Call, getInvalidPathCharsMethod);
    nestedProcessor.Emit(OpCodes.Call, concatMethod);
    nestedProcessor.Emit(OpCodes.Call, toArrayMethod);
    nestedProcessor.Emit(OpCodes.Stloc_0); // Return value is top stack
    nestedProcessor.Emit(OpCodes.Ldstr, "_");
    nestedProcessor.Emit(OpCodes.Ldarg_0);
    nestedProcessor.Emit(OpCodes.Ldloc_0);
    nestedProcessor.Emit(OpCodes.Callvirt, splitMethod); // Non static
    nestedProcessor.Emit(OpCodes.Call, joinMethod);
    nestedProcessor.Emit(OpCodes.Stloc_1);
    nestedProcessor.Emit(OpCodes.Ldloc_1);

    //getterProcessor.Body.SimplifyMacros();
    // Add nested call before each return
    IEnumerable<Instruction> returnInstructions = getterProcessor.Body.Instructions.Where(instruction => instruction.OpCode == OpCodes.Ret);
    returnInstructions.ToList().ForEach(ret => getterProcessor.InsertBefore(ret, Instruction.Create(OpCodes.Call, nested)));
    /*foreach (Instruction ret in returnInstructions)
    {
        // Call nested method and return that value
        getterProcessor.InsertBefore(ret, Instruction.Create(OpCodes.Call, nested));
    }*/
    //getterProcessor.Body.OptimizeMacros();
}

Breakdown of weaver

  1. Find the site of injection and create an ILProcessor.
  2. Import method references for all the method calls we'll be making (string Split/Join, Path GetInvalidPathChars/GetInvalidFileNameChars, and Enumerable Concat/ToArray).
  3. Create the nested method and add it to the declaring type.
  4. Emit the method body.
  5. Insert a call to the nested method before each return instruction. (A lot of code is commented out here because I was testing different ways of inserting each call. I also tried SimplifyMacros() and OptimizeMacros(), but was unsure what they did, so I commented them out.)

Expected / Actual runtime output

"test_.txt" / "test|.txt"

Thank you for any help you can provide me in getting this code working.


Solution

  • As I mentioned in the comment, the IL does not have to be valid for the file to be saved. Some obfuscators rely on this and only fix the IL up before the method gets executed. If that is not what you want, you need to make sure that what you emit is correct IL that the runtime can execute.

    The best approach (IMO) is to write the code you want to generate in C# and inspect the generated IL in ILSpy/dnSpy.

    The problems with your code are:

    Missing ret instruction

    Just add nestedProcessor.Emit(OpCodes.Ret); after the last emitted instruction, so the method returns the value that Ldloc_1 leaves on the stack.

    Using arguments

    In the line nestedProcessor.Emit(OpCodes.Ldarg_0); you load argument 0, but the method has no parameters defined. Add nested.Parameters.Add(new ParameterDefinition(module.TypeSystem.String)); to declare that the method takes one argument of type string.

    Using locals

    In the lines nestedProcessor.Emit(OpCodes.Stloc_0); and nestedProcessor.Emit(OpCodes.Stloc_1); you use local variables, but those are not defined either.

    Add lines

    nested.Body.Variables.Add(new VariableDefinition(module.ImportReference(typeof(char[]))));
    nested.Body.Variables.Add(new VariableDefinition(module.TypeSystem.String));
    

    to indicate that this method has 2 local variables, first of type char[] and second of type string.

    Generics

    Your code calls generic methods (Concat and ToArray), and those need to be specialized before they can be used in IL. Call MakeGenericMethod with the generic type argument to specialize them before importing.

    var concat = typeof(Enumerable).GetMethod("Concat");
    var concat_spec = concat.MakeGenericMethod(typeof(char));
    MethodReference concatMethod = module.ImportReference(concat_spec);
    
    var toArray = typeof(Enumerable).GetMethod("ToArray");
    var toArray_spec = toArray.MakeGenericMethod(typeof(char));
    MethodReference toArrayMethod = module.ImportReference(toArray_spec);
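
    Putting the four fixes together, the nested method could be built roughly as in the sketch below. This is only a sketch under some assumptions: the class name ValidSystemPathWeaver and the helper BuildReplaceIllegal are hypothetical names, the Mono.Cecil package is assumed to be referenced, and InitLocals is set explicitly as a precaution rather than because the fixes above require it.

```csharp
using System;
using System.IO;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

public static class ValidSystemPathWeaver
{
    // Hypothetical helper: builds the static string -> string method and adds it to 'type'.
    public static MethodDefinition BuildReplaceIllegal(ModuleDefinition module, TypeDefinition type, string getterName)
    {
        MethodReference join = module.ImportReference(
            typeof(string).GetMethod("Join", new[] { typeof(string), typeof(string[]) }));
        MethodReference split = module.ImportReference(
            typeof(string).GetMethod("Split", new[] { typeof(char[]) }));
        MethodReference invalidFileNameChars = module.ImportReference(
            typeof(Path).GetMethod("GetInvalidFileNameChars", Type.EmptyTypes));
        MethodReference invalidPathChars = module.ImportReference(
            typeof(Path).GetMethod("GetInvalidPathChars", Type.EmptyTypes));

        // Generics: close Concat<T> and ToArray<T> over char before importing them.
        MethodReference concat = module.ImportReference(
            typeof(Enumerable).GetMethod("Concat").MakeGenericMethod(typeof(char)));
        MethodReference toArray = module.ImportReference(
            typeof(Enumerable).GetMethod("ToArray").MakeGenericMethod(typeof(char)));

        var nested = new MethodDefinition(
            $"<{getterName}>g__ReplaceIllegalChars|2_0",
            Mono.Cecil.MethodAttributes.Assembly | Mono.Cecil.MethodAttributes.HideBySig | Mono.Cecil.MethodAttributes.Static,
            module.TypeSystem.String);

        // Arguments: one string parameter, so that Ldarg_0 has something to load.
        nested.Parameters.Add(new ParameterDefinition(module.TypeSystem.String));

        // Locals: slot 0 is char[], slot 1 is string, matching Stloc_0 and Stloc_1 below.
        nested.Body.Variables.Add(new VariableDefinition(module.ImportReference(typeof(char[]))));
        nested.Body.Variables.Add(new VariableDefinition(module.TypeSystem.String));
        nested.Body.InitLocals = true; // assumption: set explicitly to be safe

        ILProcessor il = nested.Body.GetILProcessor();
        il.Emit(OpCodes.Call, invalidFileNameChars);
        il.Emit(OpCodes.Call, invalidPathChars);
        il.Emit(OpCodes.Call, concat);
        il.Emit(OpCodes.Call, toArray);
        il.Emit(OpCodes.Stloc_0);        // invalid = ...ToArray()
        il.Emit(OpCodes.Ldstr, "_");
        il.Emit(OpCodes.Ldarg_0);        // p
        il.Emit(OpCodes.Ldloc_0);        // invalid
        il.Emit(OpCodes.Callvirt, split);
        il.Emit(OpCodes.Call, join);
        il.Emit(OpCodes.Stloc_1);
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ret);            // the missing ret

        type.Methods.Add(nested);
        return nested;
    }
}
```

    The method returned here takes the place of the nested variable in your weaver, so the existing loop that inserts Instruction.Create(OpCodes.Call, nested) before each ret can stay as it is.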