I am working on a simulator in which Person objects (stored in an ArrayList) "reproduce" and make babies. The babies inherit "genes", represented as 4-letter strings. At program start, the gene pool for the first people is randomly generated.
At every tick of the timer, I want to calculate what the most common "gene" among all the Person objects is.
Each of the four positions in a gene draws from its own set of letters:
1. G, Z, N, F
2. A, T, C, G
3. B, F, Q, N
4. A, C, T, E
There are 4^4 = 256 possible combinations in this case, and there has to be a more efficient check than 256 if-else statements.
The Person class (minus get/set methods):
import java.util.Random;

public class Person {
    static Random rand = new Random();
    private Person mother;
    private Person father;
    private String genes;
    private char sex;
    private int age, numKids;

    public Person() {
        mother = null;
        father = null;
        genes = createGenes();
        if (rand.nextDouble() <= 0.5)
            sex = 'm';
        else
            sex = 'f';
        age = 18;
        numKids = 0;
    }

    public Person(Person m, Person f) {
        mother = m;
        father = f;
        genes = inheritGenes(m, f);
        if (rand.nextDouble() <= 0.5)
            sex = 'm';
        else
            sex = 'f';
        age = 0;
    }

    //create genes for original Persons
    private String createGenes() {
        String genetics = "";
        double first = rand.nextDouble();
        double second = rand.nextDouble();
        double third = rand.nextDouble();
        double fourth = rand.nextDouble();
        if (first <= 0.25)
            genetics += "G";
        else if (first <= 0.68)
            genetics += "Z";
        else if (first <= 0.9)
            genetics += "N";
        else
            genetics += "F";
        if (second <= 0.65)
            genetics += "A";
        else if (second <= 0.79)
            genetics += "T";
        else if (second <= 0.85)
            genetics += "C";
        else
            genetics += "G";
        if (third <= 0.64)
            genetics += "B";
        else if (third <= 0.95)
            genetics += "F";
        else if (third <= 0.98)
            genetics += "Q";
        else
            genetics += "N";
        if (fourth <= 0.37)
            genetics += "A";
        else if (fourth <= 0.58)
            genetics += "C";
        else if (fourth <= 0.63)
            genetics += "T";
        else
            genetics += "E";
        return genetics;
    }

    //inherit genes from parents for new Persons
    public String inheritGenes(Person m, Person f) {
        String genetics = "";
        double first = rand.nextDouble();
        double second = rand.nextDouble();
        double third = rand.nextDouble();
        double fourth = rand.nextDouble();
        if (first < 0.5) {
            genetics += m.getGenes().charAt(0);
        } else
            genetics += f.getGenes().charAt(0);
        if (second < 0.5) {
            genetics += m.getGenes().charAt(1);
        } else
            genetics += f.getGenes().charAt(1);
        if (third < 0.5) {
            genetics += m.getGenes().charAt(2);
        } else
            genetics += f.getGenes().charAt(2);
        if (fourth < 0.5) {
            genetics += m.getGenes().charAt(3);
        } else
            genetics += f.getGenes().charAt(3);
        return genetics;
    }
}
Here is sample code that finds the most common gene in a List<Person>. I just added a getter for the genes String:
String getGenes() {
    return genes;
}
The code is as follows:
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

List<Person> people = new ArrayList<>();
for (int i = 0; i < 100; i++) {
    people.add(new Person()); // 100 people with random genes
}
String mostCommonGene = people.stream()
        .map(Person::getGenes)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
        .entrySet()
        .stream()
        .max(Comparator.comparingLong(Map.Entry::getValue))
        .get()
        .getKey();
System.out.println("Most common gene: " + mostCommonGene);
We use Java 8 Streams:
1. stream() of the people list.
2. .map() (transform) every Person to a String: their genes.
3. .collect() the stream of genes with groupingBy(), fed by Function.identity() and Collectors.counting(). This step produces a Map<String, Long> that maps each gene to its frequency. Effectively, this counts the occurrences of each gene in the people list.
4. .entrySet() on that map and then stream() again. Now we have a stream of map entries (you can think of them as pairs: the gene and its frequency inside one object. Convenient).
5. .max() to find the entry with the highest value (interpreted as frequency). Comparator.comparingLong() tells the max() algorithm how to compare the pairs, but the pairs are not longs. That's why we have to tell it how to convert an entry to a long: we take the value of that entry.
6. .get(), since max() returns an Optional<T>. We just want the T (the entry).
7. .getKey() on the entry that represents the pair of the most frequent gene and its frequency. The key is the gene and the value is its frequency, as previously mentioned.

If you are unfamiliar with most of the concepts described in this answer, I highly recommend learning about Java 8 Streams. Once you get used to them, you can't stop.
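If streams feel like too much at first, the same counting idea can be sketched with a plain loop and a HashMap. This is only an illustration, not part of the code above: the class name GeneCounter and the method mostCommonGene are made up for this example, and the method takes the gene strings directly (as you would get from getGenes() on each Person) so that it runs on its own. It also returns an Optional, because calling .get() on the result of max() throws a NoSuchElementException when the list is empty.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class GeneCounter {

    // Hypothetical helper (not from the original code): takes gene strings
    // directly so the sketch runs without the Person class.
    static Optional<String> mostCommonGene(List<String> genes) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String gene : genes) {
            // merge() stores 1 for a new key, otherwise adds 1 to the old count,
            // and returns the updated count
            int count = counts.merge(gene, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = gene;
            }
        }
        // An empty list yields Optional.empty() instead of throwing
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Sample genes chosen by hand for illustration
        List<String> genes = Arrays.asList("GABA", "ZTFE", "GABA", "NAQC", "GABA", "ZTFE");
        System.out.println("Most common gene: " + mostCommonGene(genes).orElse("(none)"));
    }
}
```

Either way, the work is one pass over the people plus one map lookup per person, so it stays cheap enough to run on every timer tick and never needs to enumerate the 256 possible combinations.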