Search code examples
c#treedesign-patternsrulestraversal

How would you implement?: Lots of rules over a tree in C#


I have a data structure that represents C# code like this:

class Namespace:
    string Name;
    List<Class> Classes;

class Class:
    string Name;
    List<Property> Properties;
    List<Method> Methods;
    List<Method> Constructors;
    List<Field> Fields;
    List<Class> InnerClasses;
    Class Parent;
    List<Interface> Implements;

... which I'm building using a simple lexer/parser combination. I need to traverse the tree and apply a large set of rules (more than 3000). Rules run when encountering different (and quite complex) patterns in the tree. For example, there's a rule that runs when a class only implements interfaces in the same assembly.

My original naïve implementation iterates over each rule and then each rule traverses the tree looking for its specific pattern. Of course, this takes quite a lot of time, even with a small amount of source code.

I suppose this could be likened to how antivirus software works, recognizing complex patterns on a large body of binary code.

How would you suggest one implement this kind of software?

EDT: Just like to add: No, I'm not re-implementing FxCop.

Thanks


Solution

  • You could try to aggregate your 3000 rules. Some of the 3000, I would guess assume another member of the 3000. Say rule 12 checks 'a class implements an interface'. Rule 85 might be 'a class only implements interfaces in the same assembly'. If rule 12 fails, no need to run rule 85 at all.

    This approach (alpha-beta pruning) would either need you to restructure your algorithm to search the class tree, looking for all rules patterns at the same time. Or to stash a record that a previous rule pass has identified that the current rule pass is irrelevant.

    COMMENT: I have a nub level account so i can't comment directly. Can you give an example of maybe 2 more rules? I am currently thinking your algorithm is 0(n*n) (following copied from a big 0 notation post)

    O(n*log(n)): an algorithm that does some sort of divide and conquer strategy. Hurts for large n. Typical example: merge sort

    O(n*n): a nested loop of some sort. Hurts even with small n. Common with naive matrix calculations. You want to avoid this sort of algorithm if you can.