I'm currently trying to run a MonteCarlo simulation without replacement, using np.random.choice, but there are certain items in the list which cannot appear together. For example, if I have a list of five items: RO, MI, VE,NA, SI, and each loop produces a group of four items, RO can appear with VE, MI, or NA, but it cannot appear with SI, on each iteration. So a loop like: [RO,MI,VE,NA] is correct But not: [RO,MI,SI,NA] as SI appears in the same group as RO. This is the code I'm currently using (which is producing the incorrect grouping):
np.random.seed(42)
portfolio=[]
for i in range(0,10000):
port_gar = list(np.random.choice(cities, size=4, replace=False, p=cities_prob))
portfolio.append(port_gar)
print(portfolio)
I'm at a loss as to what would do and any help would be appreciated. Thanks!
I present two solutions, both of which make use of the random
module from the standard library. I recommend this approach over using numpy since you can't really take advantage of numpy here in any way, and especially because you're not even ending up with a numpy array at the end (you're only using numpy for random number generation).
Something you could do is have your cities which don't play nice in a separate group from the others:
import random
random.seed(42)
n = 10 # whatever total number of items you want per sample
num_samples = 10000 # total number of samples
free_cities = [...] # all cities in this group play nicely together
exclusive_cities = [...] # cities in this group cannot be grouped in a sample, so we will only ever choose one city at a time from this group
portfolio = []
for _ in range(num_samples):
free_sample = random.sample(free_cities, k=(n-1))
exclusive_sample = random.sample(exclusive_cities, k=1)
portfolio.append(random.sample(free_sample + exclusive_sample, k=n))
Here's a contrived example:
n = 6
num_samples = 10
free_cities = ["AW", "JB", "EX", "HZ", "MZ", "XQ", "KA", "MW", "WZ", "UD"]
exclusive_cities = ["RO", "VE", "SI", "NA"]
portfolio = []
for _ in range(num_samples):
free_sample = random.sample(free_cities, k=(n-1))
exclusive_sample = random.sample(exclusive_cities, k=1)
portfolio.append(random.sample(free_sample + exclusive_sample, k=n))
Output:
>>> portfolio
[['AW', 'EX', 'WZ', 'RO', 'MZ', 'JB'],
['RO', 'MZ', 'KA', 'WZ', 'XQ', 'EX'],
['AW', 'MW', 'VE', 'JB', 'WZ', 'XQ'],
['NA', 'WZ', 'JB', 'MW', 'EX', 'MZ'],
['SI', 'UD', 'AW', 'JB', 'WZ', 'MW'],
['MZ', 'JB', 'UD', 'VE', 'WZ', 'HZ'],
['NA', 'KA', 'UD', 'AW', 'MW', 'WZ'],
['JB', 'NA', 'UD', 'KA', 'WZ', 'MW'],
['MZ', 'RO', 'JB', 'HZ', 'AW', 'XQ'],
['WZ', 'HZ', 'UD', 'JB', 'VE', 'XQ']]
Now something to be aware of is that this will guarantee that every sample has exactly one city from the exclusive_cities
group:
for sample in portfolio:
print(set(sample) & set(exclusive_cities))
Output:
# Every sample has one item from the exclusive_cities group
{'RO'}
{'RO'}
{'VE'}
{'NA'}
{'SI'}
{'VE'}
{'NA'}
{'NA'}
{'RO'}
{'VE'}
If that's not your desired behavior, you can use an additional "coin flip" to determine if 1 or 0 exclusive cities show up in a given sample with this minor modification to the above logic:
n = 6
num_samples = 10
portfolio = []
for _ in range(num_samples):
include = random.choice((0, 1))
free_sample = random.sample(free_cities, k=(n - include))
exclusive_sample = random.sample(exclusive_cities, k=include)
portfolio.append(random.sample(free_sample + exclusive_sample, k=n))
Now your samples may or may not include an exclusive city, but still only ever one max:
# An empty set means an exclusive city isn't present in that sample
set()
set()
{'SI'}
{'NA'}
set()
{'RO'}
{'SI'}
{'VE'}
set()
{'RO'}