I'm recreating something similar to a pandas DataFrame in Java to read CSV files and manipulate data. Everything is coded generically to handle any type of column in a CSV file, and numeric values are automatically stored as wrapper classes such as Integer and Double. The problem is that I'm now writing functions that only concern numeric columns, and I still need a fair amount of casting to get at the actual values. I would like to find a more elegant solution.
Casting within the methods works, but I'm looking for a way to have the column class itself return the numeric value when the column holds numbers, so that I can avoid doing this in future functions:
// the basic structure
public class Column<T> {
    public String type;         // column type
    public String name;         // column name
    public ArrayList<T> values; // array of values
    ...
    public T getValue(int index) {
        return values.get(index);
    }
}
// in another file is the problem
public static double variance(Column c) {
    double mean = mean(c);
    double var = 0;
    for (int i = 0; i < c.getLength(); i++) {
        // here is the problem: the cast to Number on every access
        // accumulate the squared deviation from the mean
        var += Math.pow(((Number) c.getValue(i)).doubleValue() - mean, 2);
    }
    return var / c.getLength();
}
If you have the freedom to modify your Column class or make another, more specific subclass, then instead of doing a cast outside the object you could add methods that return a double internally if it's a double, an int if it's an int, etc., since in your example you know that the Column is a Column<Number>. For example:
public class DoubleColumn extends Column<Number> {
    // Covariant return type: Double is a subtype of Number, so this is a valid override
    @Override
    public Double getValue(int index) {
        return super.getValue(index).doubleValue();
    }
}
Then you can modify your variance method accordingly to take a DoubleColumn instead of a Column.
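As a rough sketch of what that modified variance method might look like, here is a self-contained version. The Column class below is a simplified stand-in for the one in the question, and the wrapper class VarianceSketch, the add helper, and the body of mean are assumptions made only so the example compiles on its own:

```java
import java.util.ArrayList;
import java.util.List;

class VarianceSketch {

    // Simplified stand-in for the question's Column<T> (assumption, not the real class)
    static class Column<T> {
        protected final List<T> values = new ArrayList<>();

        // Hypothetical helper so the sketch can be populated without a CSV reader
        public void add(T value) { values.add(value); }

        public T getValue(int index) { return values.get(index); }

        public int getLength() { return values.size(); }
    }

    // The subclass from the answer: narrows the return type from Number to Double
    static class DoubleColumn extends Column<Number> {
        @Override
        public Double getValue(int index) {
            return super.getValue(index).doubleValue();
        }
    }

    // Assumed implementation of mean; the question does not show its body
    static double mean(DoubleColumn c) {
        double sum = 0;
        for (int i = 0; i < c.getLength(); i++) {
            sum += c.getValue(i); // Double is unboxed automatically, no cast
        }
        return sum / c.getLength();
    }

    // Population variance: the average squared deviation from the mean
    static double variance(DoubleColumn c) {
        double mean = mean(c);
        double sumOfSquares = 0;
        for (int i = 0; i < c.getLength(); i++) {
            double diff = c.getValue(i) - mean; // no (Number) cast needed
            sumOfSquares += diff * diff;
        }
        return sumOfSquares / c.getLength();
    }
}
```

Because getValue on a DoubleColumn is declared to return Double, callers such as variance can use the value directly in arithmetic, and the column can still hold a mix of Integer and Double entries internally.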