I am calculating the heat index (from temperature and relative humidity) using Python. It raises an error about units, so I am not able to calculate even a simple heat index.
import pandas as pd
from metpy.calc import heat_index
from metpy import units
wd = pd.read_csv("D:/my data.csv")
def calculate_heat_index(temprature, R_humidity):
    return heat_index(temprature * units.degC, R_humidity * units.percent)
t = wd['T2M_MAX']
r = wd['RH2M']
# calculate heat index
wd['HI'] = calculate_heat_index(t, r)
print(wd.head())
#my data structure
LON LAT YEAR MM ... T2M_MIN T2MDEW WS2M_MAX RH2M
0 41.165321 82.919199 2002 1 ... -19.74 -19.19 5.84 98.31
1 41.165321 82.919199 2002 1 ... -19.67 -16.95 7.89 100.00
2 41.165321 82.919199 2002 1 ... -13.06 -12.41 8.36 98.50
3 41.165321 82.919199 2002 1 ... -11.19 -7.88 11.70 96.69
4 41.165321 82.919199 2002 1 ... -7.26 -6.13 10.59 98.88
I want to add an HI column to the same data frame and calculate the heat index for each row.
Your units error is caused by the import statement on line 3 of your code: it imports the metpy.units module rather than the units registry inside it. The imports should be:
import pandas as pd
from metpy.calc import heat_index
from metpy.units import units
There are two additional factors you'll need to take into account in your calculate_heat_index function after you import units:

1. You are passing a pd.Series (essentially, a column from your data frame) as your temprature and R_humidity arguments, but the heat_index function you're calling in metpy.calc operates on pint.Quantity objects.
2. heat_index will return a pint.Quantity, not the pd.Series you are expecting to append onto your data frame as the new column 'HI'. Your .csv appears to want floating-point values, so you'll need to use .magnitude on the returned quantity to strip it of its units.

There are a couple of ways to go about fixing this: either pack and unpack the Series outside of your function and pass it individual temperatures and relative humidities, or put that logic inside the function so that it operates on whole Series at a time, like this:
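def calculate_heat_index(temperatures, rel_humidities):
    # Pre-allocate a float Series that shares the input's index
    result = pd.Series(index=temperatures.keys(), dtype=float)
    for k in temperatures.keys():
        # Attach units to each scalar, then strip them from the result
        result[k] = heat_index(temperatures[k] * units.degC,
                               rel_humidities[k] * units.percent, True).magnitude
    return result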
Renaming the parameters to be plural helps emphasize the point that these are Series of temperatures and relative humidities. Another good idea would be to use type annotations to reinforce that the function takes Series arguments and returns a Series result.
The last bool argument passed to heat_index (its mask_undefined parameter) controls whether the calculated heat index is masked out (substituting NaN in the resulting Series) when the heat index calculation is not applicable for the given temperature. Passing False will prevent these NaN values from appearing for this reason, although I can't tell whether that applies to your data set, because the excerpt you posted doesn't show any max temperatures.