Could someone please show me how to group an hourly-indexed DataFrame by month, so that I can count how many hours of null values there are in each month? I would like to end up with a DataFrame that has a monthly index.
Below is the DataFrame, which has the timestamp as its index and a column that occasionally contains null values.
timestamp | rel_humidity |
---|---|
1999-09-27 05:00:00 | 82.875 |
1999-09-27 06:00:00 | 83.5 |
1999-09-27 07:00:00 | 83.0 |
1999-09-27 08:00:00 | 80.6 |
1999-09-27 09:00:00 | nan |
1999-09-27 10:00:00 | nan |
1999-09-27 11:00:00 | nan |
1999-09-27 12:00:00 | nan |
I tried this but the resulting dataframe is not what I expected.
gap_in_month = OG_1998_2022_gaps.groupby(OG_1998_2022_gaps.index.month, OG_1998_2022_gaps.index.year).count()
I always struggle with the groupby function, so I would highly appreciate any help. Thanks in advance!
If you need 0 for months with no missing values, create a boolean mask with Series.isna, convert the DatetimeIndex to monthly periods with DatetimeIndex.to_period, and aggregate with sum (each True in the mask is counted as 1). Alternatively, group with Grouper:
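import pandas as pd
# Option 1: group the mask by monthly periods
gap_in_month = (OG_1998_2022_gaps['rel_humidity'].isna()
                .groupby(OG_1998_2022_gaps.index.to_period('M')).sum())
# Option 2: group the mask with Grouper (pandas 2.2+ prefers the alias 'ME' over 'M')
gap_in_month = (OG_1998_2022_gaps['rel_humidity'].isna()
                .groupby(pd.Grouper(freq='M')).sum())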
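To see the mask-and-sum approach on something small, here is a minimal sketch. The frame `df` and its values are made up for illustration (they are not the asker's data); only the column name `rel_humidity` is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 8 hourly readings, 4 at the end of September
# and 4 at the start of October 1999.
idx = pd.date_range("1999-09-30 20:00", periods=8, freq=pd.Timedelta(hours=1))
values = [82.875, np.nan, np.nan, 80.6, 83.0, 83.5, 81.0, 82.0]
df = pd.DataFrame({"rel_humidity": values}, index=idx)

# Boolean mask: True where the reading is missing.
mask = df["rel_humidity"].isna()

# Group the mask by calendar month; summing counts each True as 1,
# so months without gaps still appear with a count of 0.
missing_per_month = mask.groupby(df.index.to_period("M")).sum()
print(missing_per_month)
```

With this sample, September should show 2 missing hours and October 0, because both months are present in the grouping keys regardless of whether they contain gaps.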
If you need only the months that contain missing values, the solution is similar, but first filter the rows with boolean indexing and then count them with GroupBy.size:
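# Keep only the rows where rel_humidity is missing
missing = OG_1998_2022_gaps[OG_1998_2022_gaps['rel_humidity'].isna()]
# Group by the filtered index so the keys match the filtered rows in length
gap_in_month = missing.groupby(missing.index.to_period('M')).size()
# Grouper alternative (pandas 2.2+ prefers the alias 'ME' over 'M')
gap_in_month = missing.groupby(pd.Grouper(freq='M')).size()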
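The filter-then-size variant behaves differently for months without gaps, which a small sketch can make visible. The data and the names `df`, `missing_rows` and `counts` below are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample spanning three months; October has no missing value.
idx = pd.to_datetime([
    "1999-09-27 09:00", "1999-09-27 10:00",
    "1999-10-01 00:00",
    "1999-11-01 00:00", "1999-11-01 01:00",
])
df = pd.DataFrame({"rel_humidity": [np.nan, np.nan, 80.0, np.nan, 81.0]}, index=idx)

# Keep only the rows with a missing reading.
missing_rows = df[df["rel_humidity"].isna()]

# Group by the month of the *filtered* index and count rows per group.
counts = missing_rows.groupby(missing_rows.index.to_period("M")).size()
print(counts)
```

Here October does not appear in the result at all, since no October row survives the filter. If zero counts are wanted for such months, the mask-and-sum approach is probably the simpler choice.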