I'm tryng to print out the states with highest population in each region.
Code Sample:
# all unique regions
region_unique = data['Region'].unique()
# highest population
max_pop = data['population'].max()
How can I chain the above lines of code and bring in the 'States'
column to achieve my result?
Dataset:
Considering you haven't mentioned any library...
You could first create a helper dict, mapping each region to an array of states. Each state is a tuple: (state, pop)
(name and population count):
regions = {}
for state, pop, region in zip(data['States'], data['population'], data['Region']):
res.setdefault(region, []).append((state, pop))
Then for each region you can pull out the most inhabited state:
for region, states in regions.items():
print(region, max(states, key=lambda _, pop: pop))
To states under each region with a population less than 100, you can do:
for region, states in regions.items():
print(region, list(filter(lambda state: state[1] > 100, states)))