Tags: c#, .net, machine-learning, svm, accord.net

How do I use a one-class SVM for anomaly detection in Accord.NET?


I am trying to implement anomaly detection using OneclassSupportVectorLearning in Accord.NET. I run into a NullReferenceException during training. Below is my test code. I would appreciate any help with this.

 double[][] inputs =
 {
    new double[] { 0, 1, 1, 0 }, //  0 
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 0, 0, 1, 0 }, //  0
    new double[] { 0, 1, 1, 0 }, //  0
    new double[] { 0, 1, 0, 0 }, //  0
 };
 var oteacher = new OneclassSupportVectorLearning<ChiSquare,double[]>();
 var k = oteacher.Learn(inputs); // NullReferenceException occurs here.

EDIT:

Based on Jstreet's comment, I tried the code below. It works for 2 dimensions but fails at higher dimensions.

static void Main(string[] args)
{
    Random r = new Random(DateTime.Now.Millisecond);

    int size = 1000;
    int min = 45;
    int max = 55;

    // Training data: 1000 points, 4 dimensions, integer values in [45, 55).
    double[][] inputs = new double[size][];

    for (int i = 0; i < size; i++)
    {
        double[] d = new double[] { r.Next(min, max), r.Next(min, max), r.Next(min, max), r.Next(min, max) };
        inputs[i] = d;
    }

    var oteacher = new OneclassSupportVectorLearning<ChiSquare>();
    var k = oteacher.Learn(inputs);

    double[][] test =
    {
        // normal
        new double[] { 50, 53, 50, 50 },
        new double[] { 49, 52, 50, 50 },
        new double[] { 48, 51, 50, 50 },
        new double[] { 47, 52, 50, 50 },
        new double[] { 46, 53, 50, 50 },
        // anomalies
        new double[] { 50, 70, 70, 70 },
        new double[] { 51, 69, 70, 70 },
        new double[] { 52, 68, 70, 70 },
        new double[] { 53, 67, 70, 70 },
        new double[] { 54, 66, 70, 70 },
    };

    foreach (double[] d in test)
    {
        if (k.Decide(d))
            Console.WriteLine(" OK = {0}, {1}, {2}, {3}", d[0], d[1], d[2], d[3]);
        else
            Console.WriteLine(" Anomaly = {0}, {1}, {2}, {3}", d[0], d[1], d[2], d[3]);
    }

    Console.ReadLine();
}


Solution

  • I suggest experimenting with a 2-dimensional data set first, so you can visualize the results and get a feel for it:

        static void Main(string[] args)
        {
            Random r = new Random(DateTime.Now.Millisecond);
    
            int size = 100;
            int min = 45;
            int max = 55;
    
            double[][] inputs = new double[size][];
    
            for (int i = 0; i < size; i++)
            {
                double[] d = new double[] { r.Next(min,max), r.Next(min,max) };
                inputs[i] = d;
            }
    
            var oteacher = new OneclassSupportVectorLearning<ChiSquare>();
            var k = oteacher.Learn(inputs);
    
            double[][] test =
             {
                // normal
                new double[] { 50, 53 }, 
                new double[] { 49, 52 },
                new double[] { 48, 51 },
                new double[] { 47, 52 },
                new double[] { 46, 53 },
                // anomalies
                new double[] { 50, 70 }, 
                new double[] { 51, 69 },
                new double[] { 52, 68 },
                new double[] { 53, 67 },
                new double[] { 54, 66 },
             };
    
            foreach (double[] d in test)
            {
                if (k.Decide(d) == true)
                    Console.WriteLine(" OK = {0}, {1}", d[0], d[1]);
                else Console.WriteLine(" Anomaly = {0}, {1}", d[0], d[1]);
            }
    
            Console.ReadLine();
        }
    

    This sample code generated the following output:

     OK = 50, 53 
     OK = 49, 52 
     OK = 48, 51 
     OK = 47, 52 
     OK = 46, 53 
     Anomaly = 50, 70 
     Anomaly = 51, 69 
     Anomaly = 52, 68 
     Anomaly = 53, 67 
     Anomaly = 54, 66
    

    [Figure: graphical view of the same result]
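
    Decide only reports true or false. While experimenting, it can help to print the raw decision value as well, to see how close each point sits to the boundary. A minimal sketch, assuming Accord.NET 3.x, where the learned machine exposes Score alongside Decide. The class name ScoreInspection, the fixed seed and the two test points are chosen here for illustration only:

```csharp
using System;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Kernels;

class ScoreInspection
{
    static void Main()
    {
        // Fixed seed (an assumption) so that repeated runs are comparable.
        var r = new Random(0);

        // Same kind of 2-dimensional training data as in the example above.
        double[][] inputs = new double[100][];
        for (int i = 0; i < inputs.Length; i++)
            inputs[i] = new double[] { r.Next(45, 55), r.Next(45, 55) };

        var svm = new OneclassSupportVectorLearning<ChiSquare>().Learn(inputs);

        double[][] test =
        {
            new double[] { 50, 53 }, // inside the training range
            new double[] { 50, 70 }, // second coordinate far outside it
        };

        // Print the raw score next to the boolean decision for each point.
        foreach (double[] d in test)
            Console.WriteLine("{0}, {1}: score = {2:F4}, decide = {3}",
                d[0], d[1], svm.Score(d), svm.Decide(d));
    }
}
```

    Scores that hover near zero for points you consider normal suggest the boundary is tight for your data, which is a hint to adjust the training set or the learner's settings.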


    EDIT: Like I said, it takes some experimentation. Here's my result for a 4-dimensional input data set. Notice that I narrowed the range of each dimension (max is now 50 instead of 55) and kept the same input size, 100.

        static void Main(string[] args)
        {
            Random r = new Random(DateTime.Now.Millisecond);
    
            int size = 100;
            int min = 45;
            int max = 50;
            int min2 = 60;
            int max2 = 65;
    
            double[][] inputs = new double[size][];
    
            for (int i = 0; i < size; i++)
            {
                double[] d = new double[] { r.Next(min, max), r.Next(min, max), r.Next(min, max), r.Next(min, max) };
                inputs[i] = d;
            }
    
            var oteacher = new OneclassSupportVectorLearning<ChiSquare>();
            var k = oteacher.Learn(inputs);
    
            double[][] test =
             {
                // normal
                new double[] {  r.Next(min, max),  r.Next(min, max), r.Next(min, max),  r.Next(min, max) },
                new double[] {  r.Next(min, max),  r.Next(min, max), r.Next(min, max),  r.Next(min, max) },
                new double[] {  r.Next(min, max),  r.Next(min, max), r.Next(min, max),  r.Next(min, max) },
                new double[] {  r.Next(min, max),  r.Next(min, max), r.Next(min, max),  r.Next(min, max) },
                new double[] {  r.Next(min, max),  r.Next(min, max), r.Next(min, max),  r.Next(min, max) },
                // anomalies
                new double[] {  r.Next(min2, max2),  r.Next(min2, max2), r.Next(min2, max2),  r.Next(min2, max2) },
                new double[] {  r.Next(min2, max2),  r.Next(min2, max2), r.Next(min2, max2),  r.Next(min2, max2) },
                new double[] {  r.Next(min2, max2),  r.Next(min2, max2), r.Next(min2, max2),  r.Next(min2, max2) },
                new double[] {  r.Next(min2, max2),  r.Next(min2, max2), r.Next(min2, max2),  r.Next(min2, max2) },
                new double[] {  r.Next(min2, max2),  r.Next(min2, max2), r.Next(min2, max2),  r.Next(min2, max2) },
             };
    
            foreach (double[] d in test)
            {
                if (k.Decide(d) == true)
                    Console.WriteLine("OK = {0}, {1}, {2}, {3}", d[0], d[1], d[2], d[3]);
                else Console.WriteLine("Anomaly = {0}, {1}, {2}, {3}", d[0], d[1], d[2], d[3]);
            }
    
            Console.ReadLine();
        }
    

    And the result:

    OK = 49, 46, 47, 49
    OK = 49, 45, 45, 47
    OK = 45, 45, 46, 47
    OK = 47, 49, 47, 48
    OK = 45, 45, 47, 48
    Anomaly = 62, 60, 61, 63
    Anomaly = 61, 63, 63, 64
    Anomaly = 64, 60, 60, 64
    Anomaly = 61, 64, 63, 63
    Anomaly = 62, 60, 62, 62
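
    One way to make the experimentation more systematic is to vary the learner's Nu property and count how many training points end up flagged. This is only a sketch under assumptions: it relies on the Nu property of OneclassSupportVectorLearning in Accord.NET 3.x, and the class name NuSweep, the fixed seed and the list of Nu values are arbitrary choices for illustration, not recommended settings:

```csharp
using System;
using System.Linq;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Kernels;

class NuSweep
{
    static void Main()
    {
        // Fixed seed (an assumption) so that the runs below see identical data.
        var r = new Random(0);

        // Same kind of training data as above: 100 points, 4 dimensions, values in [45, 50).
        double[][] inputs = new double[100][];
        for (int i = 0; i < inputs.Length; i++)
            inputs[i] = new double[] { r.Next(45, 50), r.Next(45, 50), r.Next(45, 50), r.Next(45, 50) };

        // One point taken from the anomaly output listed above.
        double[] anomaly = { 62, 60, 61, 63 };

        foreach (double nu in new[] { 0.01, 0.05, 0.1, 0.5 })
        {
            var teacher = new OneclassSupportVectorLearning<ChiSquare>() { Nu = nu };
            var svm = teacher.Learn(inputs);

            // Count how many of the training points the machine itself rejects.
            int rejected = inputs.Count(x => !svm.Decide(x));

            Console.WriteLine("Nu = {0}: {1} of {2} training points rejected, anomaly accepted = {3}",
                nu, rejected, inputs.Length, svm.Decide(anomaly));
        }
    }
}
```

    If nearly all training points are rejected at every setting, the kernel or the scale of the data is probably the thing to look at next, rather than Nu.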