python, max, min

How do I find the Max or Min value from this dataset in Python?


I am using an online dataset for global life expectancy, and I am trying to find the maximum and minimum values in the life_expectancy column.

Here is the dataset: https://ourworldindata.org/spanish-flu-largest-influenza-pandemic-in-history

This is what I have after trying math equations and max() and min() as suggested in other posts.

with open('data/life-expectancy.csv') as life_expectancy:
    next(life_expectancy)
    for data in life_expectancy:
        clean_data = data.strip()
        split_data = clean_data.split(',')

        entity = split_data[0]
        code = split_data[1]
        year = split_data[2]
        expectancy = float(split_data[3])
              
print(f'The overall max life expectancy is: {max(split_data[3])}')
print(f'The overall min life expectancy is: {min(split_data[3])}')

What else should I add to actually get proper results?

Current Output:

The overall max life expectancy is: 9
The overall min life expectancy is: .

Solution

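  • You are not doing anything with the data you are iterating over. The loop overwrites split_data on every pass and never stores the rows, so after the loop only the last row is left. Calling max() and min() on split_data[3] then compares the characters of that one string, which is why you get 9 and . instead of numbers.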
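
The output in the question can be reproduced in isolation. The sketch below uses a made-up value, "61.49", as a stand-in for whatever split_data[3] held on the last row of the file; the variable names are hypothetical.

```python
# Hypothetical value standing in for split_data[3] from the last row of the file.
last_field = "61.49"

# On a string, max() and min() compare individual characters by code point.
largest_char = max(last_field)   # '9'
smallest_char = min(last_field)  # '.' sorts before every digit

print(largest_char, smallest_char)
```

Converting each field with float() and collecting the numbers in a list before calling max() and min() avoids this character-by-character comparison.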

    Once the rows are stored in a list, you can call min and max on the whole dataset. Passing a key function (here a lambda) makes the comparison use the expectancy value while still returning the entire row, so the result includes the entity and year rather than just the number.

    with open('life-expectancy.csv') as life_expectancy:
        next(life_expectancy)
        
        ## Create an empty list
        output = []
        
        for data in life_expectancy:
            clean_data = data.strip()
            split_data = clean_data.split(',')
    
            entity = split_data[0]
            code = split_data[1]
            year = split_data[2]
            expectancy = float(split_data[3])
          
            ## Append to the list
            output.append([entity, code, year, expectancy])
    
    max_life = max(output, key=lambda x: x[3])
    min_life = min(output, key=lambda x: x[3])
    
    # max_life: ['Monaco', 'MCO', '2019', 86.751]
    # min_life: ['Iceland', 'ISL', '1882', 17.76]
    
    print(f'The overall max life expectancy is {max_life[3]} in {max_life[0]}')    
    print(f'The overall min life expectancy is {min_life[3]} in {min_life[0]}')
    
    #The overall max life expectancy is 86.751 in Monaco
    #The overall min life expectancy is 17.76 in Iceland
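
The key argument matters because, without it, Python compares lists element by element, so max(output) would likely pick the row whose entity name sorts last rather than the one with the highest expectancy. Here is a small sketch of the difference; the rows are made up for illustration and are not taken from the dataset.

```python
# Hypothetical rows in the same [entity, code, year, expectancy] layout.
rows = [
    ["Aland", "ALA", "2000", 80.0],
    ["Borduria", "BRD", "2000", 55.5],
    ["Zubrowka", "ZUB", "2000", 70.2],
]

# No key: lists are compared lexicographically, so the entity name decides.
by_name = max(rows)

# With key: only the expectancy (index 3) is compared,
# but the whole row is still returned.
by_expectancy = max(rows, key=lambda row: row[3])

print(by_name[0])        # Zubrowka
print(by_expectancy[0])  # Aland
```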
    

    To improve readability, you can store the data as a list of dicts by changing the following lines:

    output.append({'entity': entity, 'code': code, 'year': year, 'expectancy': expectancy})
    
    max_life = max(output, key=lambda x: x['expectancy'])
    min_life = min(output, key=lambda x: x['expectancy'])
    
    print(f'The overall max life expectancy is {max_life["expectancy"]} in {max_life["entity"]}')
    print(f'The overall min life expectancy is {min_life["expectancy"]} in {min_life["entity"]}')
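
As a possible alternative to splitting each line by hand, the standard library's csv.DictReader builds one dict per row from the header line and also copes with quoted fields that contain commas. This is a swapped-in technique, not the approach used above. The sketch reads from an in-memory sample so it runs on its own; the header names and all values are assumptions, not copied from the real file.

```python
import csv
import io

# Made-up sample in the same four-column layout; the header names and
# values are assumptions for illustration only.
sample = io.StringIO(
    "Entity,Code,Year,Life expectancy\n"
    "Aland,ALA,2000,80.0\n"
    '"Borduria, Republic of",BRD,2000,55.5\n'
    "Zubrowka,ZUB,2000,70.2\n"
)

reader = csv.DictReader(sample)  # uses the header row as dict keys
output = []
for row in reader:
    # Convert the text field to a number so comparisons are numeric.
    row["Life expectancy"] = float(row["Life expectancy"])
    output.append(row)

max_life = max(output, key=lambda r: r["Life expectancy"])
min_life = min(output, key=lambda r: r["Life expectancy"])

print(f"Max: {max_life['Life expectancy']} in {max_life['Entity']}")
print(f"Min: {min_life['Life expectancy']} in {min_life['Entity']}")
```

With the real file, you would replace sample with open('life-expectancy.csv', newline='') and use whatever column names its header row actually contains.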