I can create quarterly and monthly PeriodIndex objects like so:
idx = pd.PeriodIndex(year=[2000, 2001], quarter=[1,2], freq="Q") # quarterly
idx = pd.PeriodIndex(year=[2000, 2001], month=[1,2], freq="M") # monthly
I would expect to be able to create a yearly PeriodIndex like so:
idx = pd.PeriodIndex(year=[2000, 2001], freq="Y")
Instead this throws the following error:
Traceback (most recent call last):
  File ".../script.py", line 3, in <module>
    idx = pd.PeriodIndex(year=[2000, 2001], freq="Y")
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/indexes/period.py", line 250, in __new__
    data, freq2 = PeriodArray._generate_range(None, None, None, freq, fields)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/arrays/period.py", line 316, in _generate_range
    subarr, freq = _range_from_fields(freq=freq, **fields)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/arrays/period.py", line 1160, in _range_from_fields
    ordinals.append(libperiod.period_ordinal(y, mth, d, h, mn, s, 0, 0, base))
  File "pandas/_libs/tslibs/period.pyx", line 1109, in pandas._libs.tslibs.period.period_ordinal
TypeError: an integer is required
This seems like it should be very easy to do, yet I cannot work out what is going wrong. Can anybody help?
month and year are both required "fields" in the current implementation (through pandas 1.5.1 at least). Most other fields are given a default value when omitted, but neither month nor year is. In this case, month therefore remains None, which causes the error shown:
TypeError: an integer is required
The default values are defined in _range_from_fields in pandas/core/arrays/period.py, the function named in the traceback. Omitting the month field results in [None, None] (in this case), which cannot be converted to a PeriodIndex.
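The defaulting behaviour described above can be sketched in plain Python. This is a simplified illustration, not the pandas source: the function name broadcast_fields and its structure are hypothetical, and only the default values (day 1; hour, minute and second 0; no default for year or month) are taken from the explanation above.

```python
def broadcast_fields(year=None, month=None, day=None,
                     hour=None, minute=None, second=None):
    """Hypothetical, simplified stand-in for the defaulting step (not pandas code)."""
    # These fields fall back to a default when omitted.
    day = 1 if day is None else day
    hour = 0 if hour is None else hour
    minute = 0 if minute is None else minute
    second = 0 if second is None else second
    # year and month get no default, so an omitted value stays None.
    fields = {"year": year, "month": month, "day": day,
              "hour": hour, "minute": minute, "second": second}
    # Length of the longest list-like field (1 if every field is a scalar).
    length = max(
        (len(v) for v in fields.values() if isinstance(v, (list, tuple))),
        default=1,
    )
    # Scalars, including None, are repeated to that length.
    return {
        name: list(v) if isinstance(v, (list, tuple)) else [v] * length
        for name, v in fields.items()
    }


fields = broadcast_fields(year=[2000, 2001])
print(fields["month"])  # [None, None]
print(fields["day"])    # [1, 1]
```

Each None in the month list is what eventually reaches period_ordinal, which expects an integer.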
A correct index can be built as follows.
idx = pd.PeriodIndex(year=[2000, 2001], month=[1, 1], freq='Y')
Resulting in:
PeriodIndex(['2000', '2001'], dtype='period[A-DEC]')
Depending on the number of years, it may also make sense to programmatically generate the list of months:
years = [2000, 2001]
idx = pd.PeriodIndex(year=years, month=[1] * len(years), freq='Y')
Alternatively, it may be easier to use to_datetime and to_period to create the PeriodIndex from a DatetimeIndex instead, since the data is already in a compatible form:
pd.to_datetime([2000, 2001], format='%Y').to_period('Y')
Resulting in the same PeriodIndex:
PeriodIndex(['2000', '2001'], dtype='period[A-DEC]')
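If the years come from a variable, the to_datetime + to_period route can be wrapped in a small helper. This is a minimal sketch: the name years_to_period_index is hypothetical, and the years are converted to strings first on the assumption that this makes the '%Y' parsing explicit. It only works for years inside the range pandas timestamps can represent.

```python
import pandas as pd


def years_to_period_index(years):
    """Hypothetical helper: build a yearly PeriodIndex from integer years."""
    # Convert to strings so the '%Y' format is applied to text input.
    stamps = pd.to_datetime([str(y) for y in years], format="%Y")
    # Each timestamp (January 1 of its year) becomes a yearly period.
    return stamps.to_period("Y")


idx = years_to_period_index([2000, 2001])
print(idx.year.tolist())         # [2000, 2001]
print(idx.astype(str).tolist())  # ['2000', '2001']
```

The dtype string shown in the repr may differ between pandas versions, so the checks above look at the year values and labels instead.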