How can I make it display the max and min with its relevant value? I saw an answer with lambda but I don't understand it. Please help. Thank you!
with open('life-expectancy.csv', "r") as life_expectancy:
    next(life_expectancy)
    country = []
    codes = []
    years = []
    expectancies = []
    for data in life_expectancy:
        clean_data = data.strip()
        split_data = clean_data.split(',')
        entity = split_data[0]
        code = split_data[1]
        year = (split_data[2])
        expectancy = float(split_data[3])
        country.append(split_data[0])
        codes.append(split_data[1])
        years.append(int(split_data[2]))
        expectancies.append(float(split_data[3]))
This part displays the individual max and min, but the values (expectancies, entity, and year) are not related to each other.
print(f'The overall max life expectancy is: {max(expectancies):.2f} from {max(entity)} in {max(years)}')
print(f'The overall min life expectancy is: {min(expectancies)} from {min(entity)} in {min(years)}')
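Indeed, the problem is that when you call min(entity) and max(entity), Python has no idea that you're talking about life expectancy. The minimum or maximum is based purely on the lexicographic ordering of what you pass in. (Note also that in your loop entity is reassigned on every row, so after the loop it holds only the last row's name; the list of names is country.)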
min and max offer an optional key parameter which lets you define the ordering that should be used to determine the minimum or maximum element.
As a simple example, if we had a list of strings called strings and wanted to get the longest string, we could do:
max(strings, key=lambda x: len(x))
(Yes, I know we could do key=len here, but I'm trying to keep things simple and consistent with the rest of my answer.)
This tells Python that we want the maximum to be based on the length of each string. The lambda tells Python what value to compute for each element of the list in order to rank it.
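To see what key changes, here is a minimal sketch. The words in the list are made up purely for illustration; only min, max, and len are real built-ins.

```python
# Hypothetical sample data, used only to illustrate the key parameter.
strings = ["fig", "banana", "kiwi"]

# Without key, strings are compared lexicographically.
alphabetical_max = max(strings)

# With key, each element is ranked by the value the lambda returns.
longest = max(strings, key=lambda x: len(x))
shortest = min(strings, key=lambda x: len(x))

print(alphabetical_max)  # kiwi
print(longest)           # banana
print(shortest)          # fig
```

Note that min and max still return the original element, not the value computed by the lambda.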
Because you have a separate list for each column of your data, the only association we have between entities and life expectancies is their shared index, i.e. the life expectancy of country[i] (the list of names your loop builds) is expectancies[i]. We will therefore need to find the index with the minimum and maximum life expectancy.
We can do this by:
# country is the list of names built in the question's loop
# find the index of the row with the minimum life expectancy
min_idx = min(range(len(expectancies)), key=lambda i: expectancies[i])
min_entity = country[min_idx]
min_expectancy = expectancies[min_idx]
# same for maximum
max_idx = max(range(len(expectancies)), key=lambda i: expectancies[i])
max_entity = country[max_idx]
max_expectancy = expectancies[max_idx]
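Putting it together with the year, a self-contained sketch might look like the following. The three rows are made up and stand in for the lists your loop builds; it assumes the names live in the country list, as in the question's code.

```python
# Made-up values standing in for the columns parsed from the CSV file.
country = ["Country A", "Country B", "Country C"]
years = [1990, 2005, 2019]
expectancies = [61.5, 48.25, 83.0]

# Rank the indices by the life expectancy stored at each index.
max_idx = max(range(len(expectancies)), key=lambda i: expectancies[i])
min_idx = min(range(len(expectancies)), key=lambda i: expectancies[i])

# The same index selects the matching name and year from the other lists.
max_line = (f"The overall max life expectancy is: {expectancies[max_idx]:.2f} "
            f"from {country[max_idx]} in {years[max_idx]}")
min_line = (f"The overall min life expectancy is: {expectancies[min_idx]:.2f} "
            f"from {country[min_idx]} in {years[min_idx]}")

print(max_line)
print(min_line)
```

If several rows tie for the highest or lowest value, min and max return the first one they encounter.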
As others have alluded to, it may be best to restructure your code so that you're storing related data together, or use a library such as Pandas.
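As a rough sketch of the "store related data together" idea, each row can be kept as one tuple so that min and max return the whole record at once. The sample text and its header names are made up for illustration; the column order (entity, code, year, expectancy) is assumed from the question's code, and io.StringIO stands in for the real file.

```python
import csv
import io

# Made-up sample standing in for life-expectancy.csv.
sample = io.StringIO(
    "Entity,Code,Year,Life expectancy\n"
    "Country A,AAA,1990,61.5\n"
    "Country B,BBB,2005,48.25\n"
    "Country C,CCC,2019,83.0\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row

# Keep every row together as (entity, code, year, expectancy).
rows = [(entity, code, int(year), float(expectancy))
        for entity, code, year, expectancy in reader]

# Rank whole rows by their expectancy field (position 3).
highest = max(rows, key=lambda row: row[3])
lowest = min(rows, key=lambda row: row[3])

print(f"The overall max life expectancy is: {highest[3]:.2f} from {highest[0]} in {highest[2]}")
print(f"The overall min life expectancy is: {lowest[3]:.2f} from {lowest[0]} in {lowest[2]}")
```

With a real file you would pass the object returned by open('life-expectancy.csv', newline='') to csv.reader instead of the StringIO sample.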