Use a boolean to find standard-deviation outliers and print them in Java


I'm trying to print the outliers of the data set below. An outlier is defined as a value whose variance is greater than twice the standard deviation. I started by creating a boolean flag, but I don't understand how to use it from there. Could someone please help me figure out how to print the outliers? Thanks.

public class Main {

    public static void main(String[] args) {
        Algebra n = new Algebra();
        System.out.println(" ");
        System.out.println("The maximum number is: " + n.max);
        System.out.println("The minimum is: " + n.min);
        System.out.println("The mean is: " + n.avg);
        System.out.println("The standard deviation is " + n.stdev);
    }
}

2nd part:

public class Algebra 
{
static int[] n = createArray();
int max = displayMaximum(n);
int min = displayMinimum(n);
double avg = displayAverage(n);
double stdev = displayStdDev(n);

public boolean outliers() {
    for(int i = 0; i < n.length; i++)
    {
    boolean flag = (n[i] < stdev*2);
    }
    return          
}


public Algebra()
{
this(n);
System.out.println("The numbers that are outliers are ");
for(int i = 0; i < n.length; i++) 
{
System.out.print(" " + (n[i] < stdev*2));

}
}
public Algebra(int[] n)
{
    createArray();
}
public static int[] createArray() 
{
        int[] n = new int[100];
        for(int i = 0; i < n.length; i++)
        n[i] = (int)(Math.random()*100 + 1);
        return n;
}
public int displayMaximum(int[] n)
{
    int maxValue = n[0]; 
    for(int i=1; i < n.length; i++){ 
      if(n[i] > maxValue){ 
         maxValue = n[i]; 
      } 
    } 
    return maxValue;
}
public int displayMinimum(int[] n)
{ 
    int minValue = n[0]; 
    for(int i=1;i<n.length;i++){ 
      if(n[i] < minValue){ 
        minValue = n[i]; 
      } 
    } 
    return minValue; 
}
protected double displayAverage(int[] n) 
{
    double sum = 0;
    for (int i = 0; i < n.length; i++) {
        sum += n[i];
    }
    // Divide as a double so the mean is not truncated by integer division.
    return sum / n.length;
}
protected double displayStdDev(int[] n)
{
    double sum = 0;
    for (int i = 0; i < n.length; i++) {
        sum += n[i];
    }
    // Divide as a double so the mean is not truncated by integer division.
    double mean = sum / n.length;
    double squareSum = 0.0;

    for (int i = 0; i < n.length; i++)
    {
        squareSum += Math.pow(n[i] - mean, 2);
    }
    return Math.sqrt((squareSum) / (n.length - 1));

}
}

Solution

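  • For a single value, the quantity you are calling variance is its squared difference from the mean (the variance of the whole data set is the average of those squared differences). This is a fairly straightforward calculation.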

    public static double variance(double val, double mean) {
       return Math.pow(val - mean, 2);
    }
    

    You define an outlier as a value whose variance is greater than twice the standard deviation.

    public static boolean isOutlier(double val, double mean, double std) {
       return variance(val, mean) > 2*std;
    }
    

    You then just need to iterate through the values and print any that evaluate as outliers.

    public void printOutliers() {
        for (int i : n) {
            if (isOutlier(i, avg, stdev)) {
                ...
            }
        }
    }
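
The three pieces above can be combined into one runnable sketch. This is a minimal illustration, not the asker's class: the names OutlierSketch, squaredDeviation and findOutliers are made up here, the sample array is arbitrary, and the standard deviation uses the n - 1 divisor as in the question's displayStdDev. It keeps the question's rule (squared deviation greater than twice the standard deviation) unchanged. Note that this rule compares a squared quantity with an unsquared one; a common alternative is to test Math.abs(val - mean) > 2 * std instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone class; not part of the question's Algebra class.
class OutlierSketch {

    // Squared deviation of one value from the mean (the answer's "variance").
    static double squaredDeviation(double val, double mean) {
        return Math.pow(val - mean, 2);
    }

    static double mean(int[] values) {
        double sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sample standard deviation (n - 1 divisor), as in the question.
    static double stdDev(int[] values, double mean) {
        double squareSum = 0;
        for (int v : values) {
            squareSum += squaredDeviation(v, mean);
        }
        return Math.sqrt(squareSum / (values.length - 1));
    }

    // The question's rule: squared deviation greater than twice the standard deviation.
    static boolean isOutlier(double val, double mean, double std) {
        return squaredDeviation(val, mean) > 2 * std;
    }

    // Collects every value that the rule above flags, in input order.
    static List<Integer> findOutliers(int[] values) {
        double mean = mean(values);
        double std = stdDev(values, mean);
        List<Integer> outliers = new ArrayList<>();
        for (int v : values) {
            if (isOutlier(v, mean, std)) {
                outliers.add(v);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        int[] sample = {10, 10, 10, 10, 20}; // arbitrary sample data
        System.out.println("Outliers: " + findOutliers(sample)); // prints "Outliers: [20]"
    }
}
```

Returning a list rather than printing inside the loop keeps the boolean test, the search, and the output separate, so the same isOutlier check can be reused elsewhere.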
    

    Note that the mean and standard deviation change whenever a value is removed, so if one outlier is removed, values previously classified as outliers may no longer be. You may also be interested in the extent of an outlier in the current set: one value may be an outlier to a greater extent than another.
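
Both points in the note above can be seen in a small sketch. Everything here is an assumption for illustration: the class name OutlierExtentSketch, the sample data, and in particular the extent method, which measures how far a value lies from the mean in units of standard deviation (one possible measure of "extent", not something the question defines). The outlier rule itself is the same one used in isOutlier above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone class; not part of the question's Algebra class.
class OutlierExtentSketch {

    static double mean(List<Integer> values) {
        double sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Sample standard deviation (n - 1 divisor), matching the question's displayStdDev.
    static double stdDev(List<Integer> values, double mean) {
        double squareSum = 0;
        for (int v : values) {
            squareSum += Math.pow(v - mean, 2);
        }
        return Math.sqrt(squareSum / (values.size() - 1));
    }

    // Assumed measure of extent: distance from the mean in standard deviations.
    static double extent(double val, double mean, double std) {
        return Math.abs(val - mean) / std;
    }

    // Same rule as the answer: squared deviation greater than twice the standard deviation.
    static List<Integer> findOutliers(List<Integer> values) {
        double mean = mean(values);
        double std = stdDev(values, mean);
        List<Integer> outliers = new ArrayList<>();
        for (int v : values) {
            if (Math.pow(v - mean, 2) > 2 * std) {
                outliers.add(v);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        // Arbitrary sample data chosen to show the effect of removing a value.
        List<Integer> data = new ArrayList<>(List.of(5, 5, 5, 5, 5, 5, 5, 5, 12, 100));
        double mean = mean(data);
        double std = stdDev(data, mean);
        for (int v : findOutliers(data)) {
            System.out.printf("%d is %.2f standard deviations from the mean%n",
                    v, extent(v, mean, std));
        }

        // Remove one outlier and re-evaluate with the recomputed mean and deviation.
        data.remove(Integer.valueOf(100));
        System.out.println("Outliers after removing 100: " + findOutliers(data));
    }
}
```

With this sample, the rule initially flags 100 and every 5 but not 12, because 100 pulls the mean up to 15.2. Once 100 is removed, the 5s are no longer flagged and 12 is the only outlier, so the classification has to be recomputed after each removal rather than reused.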