
Creating a User Defined Aggregate function in Apache Derby


I need to create a User Defined Aggregate (UDA) function in Derby (specifically, a Variance function), but I'm kinda stuck with the proper way to write it.

So far, I have the "template" for the UDA:

public class Variance<V extends Comparable<V>> implements Aggregator<V,V,Variance<V>> {
    private ArrayList<V> _values;
    public Variance() {/*Empty constructor*/}
    public void init() {_values = new ArrayList<V>(); }
    public void accumulate(V v) { _values.add(v); }
    public void merge(Variance<V> o) { _values.addAll(o._values); }
    public V terminate() {
        // Here is my issue!!!
    }
}

The issue I'm facing is this: To compute the variance I need to calculate something like this:

V sum, sumSq;
int n = _values.size();
if(n <= 0)
    return null;
// Initialize sum and sumSq to "zero"
for(V v : _values) {
    sum += v; // Or somehow add v to sum
    sumSq += Math.pow(v, 2); // Or somehow add (v^2) to sumSq
}
return (sumSq - n * Math.pow(sum / n, 2)) / n;

... but I don't know how to tell that this is only valid for numeric types (integer, decimal or floating-point values).

I think I'm missing something in my code, but I don't know if there's a way to tell this program that V is numeric, and thus, I can use arithmetic operations on values of type V.

So, the specific questions are:

  • Is there a way to perform these operations (addition, subtraction, product, power) on the values?
  • Should I change the definition of V (somewhat making it extend a "numeric" class, like Double)?

Solution

  • Looking at Derby's data types:

    DECIMAL = java.math.BigDecimal
    INTEGER = java.lang.Integer
    FLOAT = java.lang.Float or java.lang.Double 
    

    FLOAT maps to a different Java type depending on the precision you specify when you declare it; the default is java.lang.Double.

    Adding numbers

    The first problem you have is simply summing the values together (you get lots of "bad operand types for binary operator '+'" errors), because the compiler has no way of knowing that a generic V supports arithmetic. Even if you could get Integer, Float and Double to work, BigDecimal does not map to a Java primitive, so the standard arithmetic operators do not apply to it; it has its own add method instead.
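To make the difference concrete, here is a minimal sketch of summing boxed values. The class and method names are made up for illustration. Integer, Float and Double can all be read through Number.doubleValue(), while BigDecimal has to go through its own add method:

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical helper class, for illustration only.
class SumSketch {
    // Works for Integer, Float and Double: every Number can be read as a double.
    static double sumAsDouble(List<? extends Number> values) {
        double sum = 0.0;
        for (Number v : values) {
            sum += v.doubleValue();
        }
        return sum;
    }

    // BigDecimal has no '+' operator; add() keeps the arithmetic in exact decimal.
    static BigDecimal sumExact(List<BigDecimal> values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v); // BigDecimal is immutable, so reassign the result
        }
        return sum;
    }
}
```

Note that the two methods cannot sensibly share one generic signature: the first loses the exactness of BigDecimal, and the second does not accept the other wrapper types.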

    To quote Mark Peters' answer to a similar issue:

    There are ways you can hack this together but in all honestly, generics is simply not the way to go here. Build a method for each concrete primitive wrapper type and implement them separately. It'll be way too much of a headache to make it generic; arithmetic operations can't happen generically.

    Squaring values

    The second problem you have is that you are using the power method.

    Math.pow() takes double arguments, so Integer or Double calculations should be okay. Float may show unexpected results: widening a float to a double exposes the float's rounding error as extra digits in the converted value.
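A small sketch of the Float caveat, with hypothetical class and method names. Widening 0.1f directly keeps the float's binary value, whereas going through the float's decimal string yields the double closest to what was originally written:

```java
// Hypothetical demo class, for illustration only.
class FloatWideningSketch {
    // Direct widening: the float's rounding error becomes visible in the double.
    static double widen(float f) {
        return (double) f;
    }

    // Round-trip through the decimal string: gives the double nearest to the printed float.
    static double viaString(float f) {
        return Double.parseDouble(Float.toString(f));
    }
}
```

Here widen(0.1f) prints as 0.10000000149011612, while viaString(0.1f) prints as 0.1. Which of the two is appropriate depends on whether you want the value the float actually stores or the value the user typed.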

    Math.pow returns a double, which you could work around by defining your Aggregator so that the result type of the terminate method is always Double, e.g.:

    public class Variance<V extends Number & Comparable<V>> 
        implements Aggregator<V, Double, Variance<V>> {
    

    As we have seen, BigDecimal is a little different from the others: it has its own pow() method, which returns a BigDecimal.
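As a rough sketch of what the BigDecimal arithmetic could look like, the following computes the same formula as in the question (mean of squares minus square of the mean) using add, pow, divide and subtract. The class name and the MathContext parameter are assumptions for illustration; a MathContext is passed to divide because BigDecimal throws an ArithmeticException on non-terminating quotients such as 1/3 when no rounding is specified:

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.List;

// Hypothetical helper class, for illustration only.
class BigDecimalVarianceSketch {
    // Population variance: sumSq / n - (sum / n)^2
    static BigDecimal variance(List<BigDecimal> values, MathContext mc) {
        if (values.isEmpty()) {
            return null; // same convention as the question: no rows, no result
        }
        BigDecimal n = BigDecimal.valueOf(values.size());
        BigDecimal sum = BigDecimal.ZERO;
        BigDecimal sumSq = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
            sumSq = sumSq.add(v.pow(2)); // BigDecimal.pow(int) stays in BigDecimal
        }
        BigDecimal mean = sum.divide(n, mc);
        return sumSq.divide(n, mc).subtract(mean.pow(2), mc);
    }
}
```

For the values 1, 2, 3 and 4 with MathContext.DECIMAL64 this gives 1.25.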

    Conclusion

    Given the above, I would suggest that you don't attempt a generic solution but instead implement a separate variance aggregator for each type that you want to support, e.g. you could implement something like:

    Aggregator<Integer, Double, Variance<Integer>>
    Aggregator<Double, Double, Variance<Double>>
    Aggregator<Float, Double, Variance<Float>>
    Aggregator<BigDecimal, BigDecimal, Variance<BigDecimal>>
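For example, the Double case might be sketched as below. This is an assumption-laden illustration, not a tested Derby deployment: the class name DoubleVariance is made up, and the implements clause for org.apache.derby.agg.Aggregator is left as a comment so that the sketch compiles without derby.jar. It also keeps running totals instead of the ArrayList from the question, which makes merge a matter of adding three fields:

```java
import java.io.Serializable;

// To use this with Derby: make the class public and add
//   implements org.apache.derby.agg.Aggregator<Double, Double, DoubleVariance>
// The clause is omitted here only so the sketch compiles on its own.
class DoubleVariance implements Serializable {
    private long n;        // number of non-null values seen
    private double sum;    // running sum of the values
    private double sumSq;  // running sum of the squared values

    public DoubleVariance() { /* no-arg constructor */ }

    public void init() {
        n = 0;
        sum = 0.0;
        sumSq = 0.0;
    }

    public void accumulate(Double v) {
        if (v == null) {
            return; // defensively skip NULLs
        }
        n++;
        sum += v;
        sumSq += v * v;
    }

    public void merge(DoubleVariance other) {
        n += other.n;
        sum += other.sum;
        sumSq += other.sumSq;
    }

    public Double terminate() {
        if (n == 0) {
            return null; // no rows: no variance
        }
        double mean = sum / n;
        return sumSq / n - mean * mean; // population variance
    }
}
```

Once the class is compiled against Derby, it would be registered with a statement along the lines of CREATE DERBY AGGREGATE variance FOR DOUBLE RETURNS DOUBLE EXTERNAL NAME 'DoubleVariance' (adjust the external name to your package). The sum-of-squares formula can lose precision when the values are large relative to their spread, so a more careful implementation may prefer a numerically stable update.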