I have an application that allows users to write C# code, which gets saved as a class library to be called later.
A new requirement says that some namespaces are no longer allowed, including the methods they contain and any variables or return values of their types. So I need to analyze the code and alert the user to any forbidden namespaces so they can remove them.
Using Roslyn, I can access the InvocationExpressionSyntax nodes for the method calls. From each node I get the symbol info with var mySymbol = mySemanticModel.GetSymbolInfo(myInvocationExpressionSyntaxNode).Symbol; and calling mySymbol.ContainingType.ToDisplayString() then returns the fully qualified containing type of the call, namespace included.
However, it seems not all called methods have symbol information in Roslyn. For example, System.Math.Sqrt() has symbol information, so from that I can get the containing type System.Math. On the other hand System.Net.WebRequest.Create() or System.Diagnostics.Process.Start() do not. How do I get System.Net.WebRequest or System.Diagnostics.Process from those nodes? I can clearly see them using QuickWatch.
For example, the System.Diagnostics.Process.Start() node itself shows the following value in QuickWatch:
InvocationExpressionSyntax InvocationExpression System.Diagnostics.Process.Start("CMD.exe","")
And the node's expression has this value:
MemberAccessExpressionSyntax SimpleMemberAccessExpression System.Diagnostics.Process.Start
So obviously the namespace is there in the value itself. But the Symbol from the SymbolInfo and the Type from TypeInfo are both null.
Edit
Regarding my compilation, the C# Roslyn tools are set up as follows (we are supposed to support VB as well, hence the properties are exposed through an interface):
    private class CSharpRoslynTools : IRoslynTools
    {
        public CompilationUnitSyntax SyntaxTreeRoot { get; }
        public SemanticModel SemanticModel { get; }

        public CSharpRoslynTools(string code)
        {
            var mscorlib = MetadataReference.CreateFromFile(typeof(object).Assembly.Location);
            var syntaxTree = CSharpSyntaxTree.ParseText(code);
            var compilation = CSharpCompilation.Create(
                "MyCompilation",
                syntaxTrees: new[] { syntaxTree },
                references: new[]
                {
                    mscorlib
                });
            this.SemanticModel = compilation.GetSemanticModel(syntaxTree);
            this.SyntaxTreeRoot = (CompilationUnitSyntax)syntaxTree.GetRoot();
        }
    }
One thing I did come to realize is that System.Diagnostics.Process isn't part of mscorlib. Could that be why the symbol information is missing?
Honestly, I kind of view this as a bit of a waste of my time, because the scripts were designed to be run in a WinForms desktop application that's probably 15 years old at this point. But then they decided this desktop application needed to move to Citrix Cloud for certain customers. As a result, we have to lock out anything that can access the filesystem unless an admin is logged into the application. So we have this giant potential security hole with these scripts. The chance of someone getting access to the application and exploiting any of this is slim, though.
I pushed for a blacklist, which would be easy enough to do with a simple string search of the code. They want a whitelist which requires full out parsing of the symbols.
However, it seems not all called methods have symbol information in Roslyn.
This probably indicates that something went wrong with how you got your Compilation, and you should attempt to investigate that directly. Don't attempt to deal with it downstream. (Software: garbage in, garbage out!)
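One way to start that investigation is to ask the Compilation for its diagnostics before trusting anything the semantic model returns. The following is only a minimal sketch: CompilationCheck and ReportErrors are hypothetical names, not part of Roslyn, and it assumes you still have the Compilation object at hand.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;

static class CompilationCheck
{
    // Hypothetical helper: prints every compiler error and returns true when there are none.
    public static bool ReportErrors(Compilation compilation)
    {
        var errors = compilation.GetDiagnostics()
            .Where(d => d.Severity == DiagnosticSeverity.Error)
            .ToList();

        foreach (var error in errors)
        {
            // Names that fail to bind because a reference is missing
            // typically show up as CS0234 or CS0246.
            Console.WriteLine($"{error.Id}: {error.GetMessage()}");
        }

        return errors.Count == 0;
    }
}
```

If errors like these show up for the calls whose symbols are null, the problem is in how the Compilation was built rather than in the syntax nodes. Depending on the compilation options, unrelated errors (such as a missing entry point) may appear in the same list.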
On the other hand System.Net.WebRequest.Create() or System.Diagnostics.Process.Start() do not. How do I get System.Net.WebRequest or System.Diagnostics.Process from those nodes? I can clearly see them using QuickWatch.
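Keep in mind that from the perspective of syntax only, System.Net.WebRequest.Create() could be several different things: a static method Create on a type WebRequest in the namespace System.Net, a method on a type WebRequest nested inside a type Net in the namespace System, or an instance method called on a static field or property named WebRequest of that type Net. The syntax tree cannot tell these apart; only the semantic model can, and only if the Compilation has the references it needs.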
One thing I did come to realize is that System.Diagnostics.Process isn't part of mscorlib. Could that be why the symbol information is missing?
Yep; we're only going to reference the assemblies you give us. It's up to you to know your context: if the user's code is allowed to reference other assemblies, you need to include those references when you create the Compilation.
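As a rough sketch of what that could look like, the example below adds the assembly that contains System.Diagnostics.Process next to mscorlib and then prints the containing namespace of each invocation. It assumes .NET Framework, where a type's runtime assembly can be referenced directly; on .NET Core and later the set of reference assemblies is different. NamespaceScan and the sample script are made up for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class NamespaceScan
{
    public static void Main()
    {
        // Hypothetical user script, used only for illustration.
        const string code = @"
class Script
{
    void Run() { System.Diagnostics.Process.Start(""CMD.exe"", """"); }
}";

        var syntaxTree = CSharpSyntaxTree.ParseText(code);

        // Assumption: .NET Framework, so these locations are usable as references.
        var references = new[]
        {
            typeof(object).Assembly.Location,                     // mscorlib
            typeof(System.Diagnostics.Process).Assembly.Location, // assembly defining Process
        }
        .Distinct()
        .Select(path => MetadataReference.CreateFromFile(path))
        .ToArray();

        var compilation = CSharpCompilation.Create(
            "MyCompilation",
            syntaxTrees: new[] { syntaxTree },
            references: references);

        var model = compilation.GetSemanticModel(syntaxTree);

        var invocations = syntaxTree.GetRoot()
            .DescendantNodes()
            .OfType<InvocationExpressionSyntax>();

        foreach (var invocation in invocations)
        {
            var symbol = model.GetSymbolInfo(invocation).Symbol;

            // Symbol is null when the call could not be bound to exactly one symbol.
            Console.WriteLine(symbol?.ContainingNamespace.ToDisplayString() ?? "<unresolved>");
        }
    }
}
```

For a whitelist, it is probably safest to treat an unresolved call as forbidden rather than skipping it, since a null symbol means the check could not establish what is being called.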
I pushed for a blacklist, which would be easy enough to do with a simple string search of the code. They want a whitelist which requires full out parsing of the symbols.
From a security perspective they may be right -- it's very difficult to block from a string search. See some thoughts at https://stackoverflow.com/a/66555319/972216 for how difficult that can be.