I am looking at writing an Accumulo iterator that returns a random sample of a given percentage of a table.
I would appreciate any suggestions.
Thanks,
Chris
You can extend org.apache.accumulo.core.iterators.Filter and randomly accept x% of the entries. The following iterator accepts each entry with probability 0.05, so it returns roughly 5 percent of the entries.
import java.util.Random;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.Filter;

public class RandomAcceptFilter extends Filter {
  // Unseeded, so each scan returns a different sample.
  private final Random rand = new Random();

  @Override
  public boolean accept(Key k, Value v) {
    // Keep each entry independently with probability 0.05 (about 5% overall).
    return rand.nextDouble() < 0.05;
  }
}
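One thing to keep in mind: because each entry is accepted independently, the sample size is only approximately 5 percent of the table, not exactly. Here is a minimal plain-Java sketch of that behaviour with no Accumulo dependency. The class name BernoulliSampleSketch, the helper countAccepted, and the fixed seed are all made up for illustration; the accept method just mirrors the comparison used in the filter above.

```java
import java.util.Random;

class BernoulliSampleSketch {

    // Hypothetical stand-in for Filter.accept(Key, Value):
    // keeps an entry with probability `fraction`.
    static boolean accept(Random rand, double fraction) {
        return rand.nextDouble() < fraction;
    }

    // Simulates a scan over `total` entries and counts how many pass.
    // The seed is only here to make the simulation repeatable.
    static int countAccepted(int total, double fraction, long seed) {
        Random rand = new Random(seed);
        int accepted = 0;
        for (int i = 0; i < total; i++) {
            if (accept(rand, fraction)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        int total = 1_000_000;
        int accepted = countAccepted(total, 0.05, 42L);
        // The count lands near, but not exactly on, 5% of the total.
        System.out.printf("accepted %d of %d (%.3f%%)%n",
                accepted, total, 100.0 * accepted / total);
    }
}
```

If you need an exact sample size, or the same sample on every scan, this per-entry coin flip is not enough on its own; it is the simplest approach when an approximate fraction is acceptable.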