I have converted this Java program into a C# program.
using System;
using System.Collections.Generic;
namespace RandomNumberWith_Distribution__Test
{
public class DistributedRandomNumberGenerator
{
private Dictionary<Int32, Double> distribution;
private double distSum;
public DistributedRandomNumberGenerator()
{
distribution = new Dictionary<Int32, Double>();
}
public void addNumber(int val, double dist)
{
distribution.Add(val, dist);// are these two
distSum += dist; // lines correctly translated?
}
public int getDistributedRandomNumber()
{
double rand = new Random().NextDouble();//generate a double random number
double ratio = 1.0f / distSum;//why is ratio needed?
double tempDist = 0;
foreach (Int32 i in distribution.Keys)
{
tempDist += distribution[i];
if (rand / ratio <= tempDist)//what does "rand/ratio" signify? What does this comparison achieve?
{
return i;
}
}
return 0;
}
}
public class MainClass
{
public static void Main(String[] args)
{
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(2, 0.3d);
drng.addNumber(3, 0.5d);
//=================
// start simulation
int testCount = 1000000;
Dictionary<Int32, Double> test = new Dictionary<Int32, Double>();
for (int i = 0; i < testCount; i++)
{
int random = drng.getDistributedRandomNumber();
if (test.ContainsKey(random))
{
double prob = test[random]; // are these
prob = prob + 1.0 / testCount;// three lines
test[random] = prob; // correctly translated?
}
else
{
test.Add(random, 1.0 / testCount);// is this line correctly translated?
}
}
foreach (var item in test.Keys)
{
Console.WriteLine($"{item}, {test[item]}");
}
Console.ReadLine();
}
}
}
I have several questions:
getDistributedRandomNumber()
check if the sum of the distribution 1
before proceeding to further calculations?The method
public void addNumber(int val, double dist)
Is not correctly translated, since you are missing the following lines:
if (this.distribution.get(value) != null) {
distSum -= this.distribution.get(value);
}
Those lines should cover the case when you call the following (based on your example code):
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(1, 0.5d);
So calling the method addNumber
twice with the same first argument, the missing code part looks if the first argument is already present in the dictionary and if so it will remove the "old" value from the dictionary to insert the new value.
Your method should look like this:
public void addNumber(int val, double dist)
{
if (distribution.TryGetValue(val, out var oldDist)) //get the old "dist" value, based on the "val"
{
distribution.Remove(val); //remove the old entry
distSum -= oldDist; //substract "distSum" with the old "dist" value
}
distribution.Add(val, dist); //add the "val" with the current "dist" value to the dictionary
distSum += dist; //add the current "dist" value to "distSum"
}
Now to your second method
public int getDistributedRandomNumber()
Instead of calling initializing a new instance of Random
every time this method is called you should only initialize it once, so change the line
double rand = new Random().NextDouble();
to this
double rand = _random.NextDouble();
and initialize the field _random
outside of a method inside the class declaration like this
public class DistributedRandomNumberGenerator
{
private Dictionary<Int32, Double> distribution;
private double distSum;
private Random _random = new Random();
... rest of your code
}
This will prevent new Random().NextDouble()
from producing the same number over and over again if called in a loop.
You can read about this problem here: Random number generator only generating one random number
As I side note, fields in c# are named with a prefix underscore. You should consider renaming distribution
to _distribution
, same applies for distSum
.
Next:
double ratio = 1.0f / distSum;//why is ratio needed?
Ratio is need because the method tries its best to do its job with the information you have provided, imagine you only call this:
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
int random = drng.getDistributedRandomNumber();
With those lines you told the class you want to have the number 1
in 20%
of the cases, but what about the other 80%
?
And that's where the ratio variable comes in place, it calculates a comparable value based on the sum of probabilities you have given. eg.
double ratio = 1.0f / distSum;
As with the latest example drng.addNumber(1, 0.2d);
distSum
will be 0.2
, which translates to a probability of 20%
.
double ratio = 1.0f / 0.2;
The ratio is 5.0
, with a probability of 20%
the ratio is 5 because 100% / 5 = 20%
.
Now let's have a look at how the code reacts when the ratio is 5
double tempDist = 0;
foreach (Int32 i in distribution.Keys)
{
tempDist += distribution[i];
if (rand / ratio <= tempDist)
{
return i;
}
}
rand
will be to any given time a value that is greater than or equal to 0.0, and less than 1.0., that's how NextDouble
works, so let's assume the following 0.254557522132321
as rand
.
Now let's look what happens step by step
double tempDist = 0; //initialize with 0
foreach (Int32 i in distribution.Keys) //step through the added probabilities
{
tempDist += distribution[i]; //get the probabilities and add it to a temporary probability sum
//as a reminder
//rand = 0.254557522132321
//ratio = 5
//rand / ratio = 0,0509115044264642
//tempDist = 0,2
// if will result in true
if (rand / ratio <= tempDist)
{
return i;
}
}
If we didn't apply the ratio the if would be false, but that would be wrong, since we only have a single value inside our dictionary, so no matter what the rand
value might be the if statement should return true and that's the natur of rand / ratio
.
To "fix" the randomly generated number based on the sum of probabilities we added. The rand / ratio
will only be usefull if you didn't provide probabilites that perfectly sum up to 1 = 100%
.
eg. if your example would be this
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(2, 0.3d);
drng.addNumber(3, 0.5d);
You can see that the provided probabilities equal to 1
=> 0.2 + 0.3 + 0.5
, in this case the line
if (rand / ratio <= tempDist)
Would look like this
if (rand / 1 <= tempDist)
Divding by 1 will never change the value and rand / 1 = rand
, so the only use case for this devision are cases where you didn't provided a perfect 100%
probability, could be either more or less.
As a side note, I would suggest changing your code to this
//call the dictionary distributions (notice the plural)
//dont use .Keys
//var distribution will be a KeyValuePair
foreach (var distribution in distributions)
{
//access the .Value member of the KeyValuePair
tempDist += distribution.Value;
if (rand / ratio <= tempDist)
{
return i;
}
}
Your test routine seems to be correctly translated.