Virus scanning files with Japanese Characters in C# using nClam and ClamAV


We're implementing virus scanning for files uploaded into our system. It's a C# web app that uses the nClam library to talk to a ClamAV server. A really basic setup (copied from the nClam help) looks like this:

    // Requires: using System; using System.Linq; using nClam;
    string filePath = "C:\\test\\jp TEST 昨夜のコンサート.txt";

    // Connect to the ClamAV daemon and ask it to scan the file by path.
    var clam = new ClamClient("localhost", 3310);
    var scanResult = clam.ScanFileOnServer(filePath);  // any file you would like!

    Console.WriteLine("Japan test");
    switch (scanResult.Result)
    {
        case ClamScanResults.Clean:
            Console.WriteLine("The file is clean!");
            break;
        case ClamScanResults.VirusDetected:
            Console.WriteLine("Virus Found!");
            Console.WriteLine("Virus name: {0}", scanResult.InfectedFiles.First().VirusName);
            break;
        case ClamScanResults.Error:
            Console.WriteLine("Woah an error occurred! Error: {0}", scanResult.RawResult);
            break;
    }

When I run this, the scan always comes back with an error, even though the file is just a text file with some random characters in it. The error is "No such file or directory. ERROR".

If I run the ClamAV console command to scan files in a folder, it seems to work OK. I think the problem is in how the path is encoded or decoded, but I've tried various encoding schemes and none of them work.

Within the nClam method, there's some code that turns the file path into a command for the ClamAV server:

    var commandText = String.Format("z{0}\0", command);
    var commandBytes = Encoding.UTF8.GetBytes(commandText);

Could this be affecting the Japanese characters?
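
One way to probe that suspicion is to look at the bytes the command turns into. The sketch below is not nClam code: `PathEncodingCheck` and its methods are hypothetical helpers that repeat the two lines quoted above, and the `SCAN` command text and the ISO-8859-1 decode are assumptions chosen only to show what a mismatch would look like. It suggests that the UTF-8 step on its own is lossless, so any damage to the path would have to come from the receiving side reading those bytes with a different encoding.

```csharp
using System;
using System.Text;

public static class PathEncodingCheck
{
    // Hypothetical helper that mirrors the two nClam lines quoted above.
    public static byte[] BuildCommandBytes(string command)
    {
        var commandText = String.Format("z{0}\0", command);
        return Encoding.UTF8.GetBytes(commandText);
    }

    // Decoding the bytes as UTF-8 again gives back the original text.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Assumption for illustration only: a receiver that treats the same bytes
    // as a single-byte code page (ISO-8859-1 here) ends up with another path.
    public static string DecodeAsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(bytes);
    }

    public static void Main()
    {
        // "SCAN <path>" is assumed here as the command text.
        var command = "SCAN C:\\test\\jp TEST 昨夜のコンサート.txt";
        var bytes = BuildCommandBytes(command);

        Console.WriteLine("Chars: {0}, bytes: {1}", command.Length + 2, bytes.Length);
        Console.WriteLine("UTF-8 round trip intact: {0}",
            DecodeAsUtf8(bytes) == "z" + command + "\0");
        Console.WriteLine("ISO-8859-1 reading intact: {0}",
            DecodeAsLatin1(bytes) == "z" + command + "\0");
    }
}
```

If the second check reports a mismatch while the first does not, the client-side encoding is probably fine and the question becomes which encoding the ClamAV server expects for paths on its platform.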


Solution

  • The solution I've gone with is to replace all the Japanese (and other unsupported) characters in the filename before scanning, and then use the real filename after the virus scan, i.e.

    // Replaces every character above code point 255 (e.g. Japanese) with '-'.
    // Requires: using System.Linq;
    private string ReplaceUnsupportedCharacters(string fileName)
    {
      const int MaxAnsiCode = 255;
      return new string(fileName.Select(c => c > MaxAnsiCode ? '-' : c).ToArray());
    }
    

    I'd rather not have to do this but right now I can't see a better way!
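
A variation on the same idea, as a rough sketch: instead of renaming the upload itself, copy it to a temporary file with an ASCII-only name, scan the copy, and leave the real filename untouched. `SafeNameScan`, `ScanUnderSafeName` and the `.scan` extension are made-up names for illustration, and the scan is passed in as a delegate so the sketch assumes nothing about nClam beyond the call already shown in the question. It also assumes the ClamAV server can read the temporary directory, and that the temporary directory path itself contains no unsupported characters.

```csharp
using System;
using System.IO;

public static class SafeNameScan
{
    // Copies the file to a temporary ASCII-named file, runs the supplied scan
    // on that copy, and always deletes the copy afterwards. The original file
    // and its real name are never modified.
    public static T ScanUnderSafeName<T>(string originalPath, Func<string, T> scan)
    {
        // A GUID in "N" format is 32 hex digits, so the file name is plain ASCII.
        var safePath = Path.Combine(
            Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".scan");

        File.Copy(originalPath, safePath);
        try
        {
            return scan(safePath);
        }
        finally
        {
            File.Delete(safePath);
        }
    }

    public static void Main()
    {
        var original = Path.Combine(Path.GetTempPath(), "jp TEST 昨夜のコンサート.txt");
        File.WriteAllText(original, "some random characters");

        // Stand-in for the real scan: it only reports the name it was given.
        var scannedName = ScanUnderSafeName(original, path => Path.GetFileName(path));

        Console.WriteLine("Scanned as: {0}", scannedName);
        Console.WriteLine("Stored as:  {0}", Path.GetFileName(original));

        File.Delete(original);
    }
}
```

With the client from the question this would be called as `SafeNameScan.ScanUnderSafeName(filePath, p => clam.ScanFileOnServer(p))`. It costs an extra file copy per upload, but the name mapping disappears because the original name is never changed. If your nClam version can send the file contents to the server rather than a path, that might avoid the path problem altogether, though I haven't verified that here.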