I feel like there should be an available library to more simply do two things, A) Find the mode to an array, in the case of doubles and B) gracefully degrade the precision until you reach a particular frequency.
So imagine an array like this:
double[] a = {1.12, 1.15, 1.13, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4};
If I was looking for a frequency of 3 then it would go from 2 decimal positions to 1 decimal, and finally return 1.1 as my mode. If I had a frequency requirement of 4 it would return 4 as my mode.
I do have a set of code that is working the way I want, and returning what I am expecting, but I feel like there should be a more efficient way to accomplish this, or an existing library that would help me do the same. Attached is my code, I'd be interested in thoughts / comments on different approaches I should have taken....I have the iterations listed to limit how far the precision can degrade.
public static double findMode(double[] r, int frequencyReq)
{
double mode = 0d;
int frequency = 0;
int iterations = 4;
HashMap<Double, BigDecimal> counter = new HashMap<Double, BigDecimal>();
while(frequency < frequencyReq && iterations > 0){
String roundFormatString = "#.";
for(int j=0; j<iterations; j++){
roundFormatString += "#";
}
DecimalFormat roundFormat = new DecimalFormat(roundFormatString);
for(int i=0; i<r.length; i++){
double element = Double.valueOf(roundFormat.format(r[i]));
if(!counter.containsKey(element))
counter.put(element, new BigDecimal(0));
counter.put(element,counter.get(element).add(new BigDecimal(1)));
}
for(Double key : counter.keySet()){
if(counter.get(key).compareTo(new BigDecimal(frequency))>0){
mode = key;
frequency = counter.get(key).intValue();
log.debug("key: " + key + " Count: " + counter.get(key));
}
}
iterations--;
}
return mode;
}
Edit
Another way to rephrase the question, per Paulo's comment: the goal is to locate a number where in the neighborhood are at least frequency
array elements, with the radius of the neighborhood being as small as possible.
Here a solution to the reformulated question:
The goal is to locate a number where in the neighborhood are at least
frequency
array elements, with the radius of the neighborhood being as small as possible.
(I took the freedom of switching the order of 1.15
and 1.13
in the input array.)
The basic idea is: We have the input already sorted (i.e. neighboring elements are consecutive), and we know how many elements we want in our neighborhood. So we loop once over this array, measuring the distance between the left element and the element frequency
elements more to the right. Between them are frequency
elements, so this forms a neighbourhood. Then we simply take the minimum such distance. (My method has a complicated way to return the results, you may want to do it better.)
This is not completely equivalent to your original question (does not work by fixed steps of digits), but maybe this is more what you really want :-)
You'll have to find a better way of formatting the results, though.
package de.fencing_game.paul.examples;
import java.util.Arrays;
/**
* searching of dense points in a distribution.
*
* Inspired by http://stackoverflow.com/questions/5329628/finding-a-mode-with-decreasing-precision.
*/
public class InpreciseMode {
/** our input data, should be sorted ascending. */
private double[] data;
public InpreciseMode(double ... data) {
this.data = data;
}
/**
* searchs the smallest neighbourhood (by diameter) which
* contains at least minSize elements.
*
* @return an array of two arrays:
* { { the middle point of the neighborhood,
* the diameter of the neighborhood },
* all the elements of the neigborhood }
*
* TODO: better return an object of a class encapsuling these.
*/
public double[][] findSmallNeighbourhood(int minSize) {
int currentLeft = -1;
int currentRight = -1;
double currentMinDiameter = Double.POSITIVE_INFINITY;
for(int i = 0; i + minSize-1 < data.length; i++) {
double diameter = data[i+minSize-1] - data[i];
if(diameter < currentMinDiameter) {
currentMinDiameter = diameter;
currentLeft = i;
currentRight = i + minSize-1;
}
}
return
new double[][] {
{
(data[currentRight] + data[currentLeft])/2.0,
currentMinDiameter
},
Arrays.copyOfRange(data, currentLeft, currentRight+1)
};
}
public void printSmallNeighbourhoods() {
for(int frequency = 2; frequency <= data.length; frequency++) {
double[][] found = findSmallNeighbourhood(frequency);
System.out.printf("There are %d elements in %f radius "+
"around %f:%n %s.%n",
frequency, found[0][1]/2, found[0][0],
Arrays.toString(found[1]));
}
}
public static void main(String[] params) {
InpreciseMode m =
new InpreciseMode(1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1,
4.2, 4.3, 4.4);
m.printSmallNeighbourhoods();
}
}
The output is
There are 2 elements in 0,005000 radius around 1,125000:
[1.12, 1.13].
There are 3 elements in 0,015000 radius around 1,135000:
[1.12, 1.13, 1.15].
There are 4 elements in 0,150000 radius around 4,250000:
[4.1, 4.2, 4.3, 4.4].
There are 5 elements in 0,450000 radius around 3,850000:
[3.4, 3.44, 4.1, 4.2, 4.3].
There are 6 elements in 0,500000 radius around 3,900000:
[3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
There are 7 elements in 1,200000 radius around 3,200000:
[2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
There are 8 elements in 1,540000 radius around 2,660000:
[1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2].
There are 9 elements in 1,590000 radius around 2,710000:
[1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3].
There are 10 elements in 1,640000 radius around 2,760000:
[1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4].