
In Python, metpy.units does not recognize the degC unit for temperature (calculating Heat Index)


I am calculating the Heat Index (from temperature and relative humidity) using Python. The code raises an error about units, so I am not able to calculate a simple heat index.

import pandas as pd
from metpy.calc import heat_index
from metpy import units

wd = pd.read_csv("D:/my data.csv")

def calculate_heat_index(temprature, R_humidity):
    return heat_index(temprature * units.degC, R_humidity * units.percent)

t = wd['T2M_MAX']
r = wd['RH2M']

# calculate heat index

wd['HI'] = calculate_heat_index(t, r)
print(wd.head())

My data structure:

         LON        LAT  YEAR  MM  ...  T2M_MIN  T2MDEW WS2M_MAX    RH2M
0  41.165321  82.919199  2002   1  ...   -19.74  -19.19     5.84   98.31
1  41.165321  82.919199  2002   1  ...   -19.67  -16.95     7.89  100.00
2  41.165321  82.919199  2002   1  ...   -13.06  -12.41     8.36   98.50
3  41.165321  82.919199  2002   1  ...   -11.19   -7.88    11.70   96.69
4  41.165321  82.919199  2002   1  ...    -7.26   -6.13    10.59   98.88

I want to add an HI column to the same data frame and calculate the heat index for each row.


Solution

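  • Your units error is caused by the import statement on line 3: from metpy import units imports the metpy.units module itself, whereas degC and percent are attributes of the units registry defined inside that module. The imports should be: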

    import pandas as pd 
    from metpy.calc import heat_index
    from metpy.units import units
    

    There are two additional factors you'll need to take into account with your calculate_heat_index function after you import units:

    1. You are passing a pd.Series (essentially, a column of your data frame) as the temperature and R_humidity arguments, but the heat_index function you're calling in metpy.calc operates on pint.Quantity objects.
    2. heat_index returns a pint.Quantity, not the pd.Series you expect to assign to your data frame as the new column 'HI'. Your data frame holds plain floating-point values, so you'll need to use .magnitude on the returned quantity to strip it of its units.

    There are a couple of ways to fix this: either unpack the Series outside the function and pass it individual temperatures and relative humidities, or put that logic inside the function so that it operates on a whole Series at a time, like this:

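    def calculate_heat_index(temperatures, rel_humidities):
        # Pre-allocate a float Series that shares the index of the input.
        result = pd.Series(index=temperatures.keys(), dtype=float)
        for k in temperatures.keys():
            # Attach units to each scalar value, then strip them from the result.
            result[k] = heat_index(temperatures[k] * units.degC,
                rel_humidities[k] * units.percent, True).magnitude
        return result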
    

    Renaming the parameters to be plural helps emphasize the point that these are Series of temperatures and relative humidities. Another good idea would be to use type annotations to reinforce that the function takes Series arguments and returns a Series result.

    The last bool argument passed to heat_index (mask_undefined) controls whether the result is masked where the heat index calculation is not applicable for the given temperature, which MetPy documents as temperatures below 80 °F. Masked values show up as NaN in the resulting Series. Passing False prevents these NaN values from appearing for this reason. Whether that applies to your data set I can't tell, because the sample you posted does not show the T2M_MAX column.