Search code examples
pythonpython-3.xpandasdatetimedatetimeoffset

Using Pandas Date Offset with Offset Aliases Directly


I'm sure I'm just not searching the correct phrase to answer this, but in Pandas, one can take any DateTime object and add or subtract from it using DateOffset like:

pd.datetime.now() + pd.DateOffset(months=2)

pd.datetime.now() + pd.DateOffset(weeks=4)

pd.datetime.now() + pd.DateOffset(days=2)

etc.

But in different context, Offset Aliases are also used: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases

My question is is there a way to use DateOffset and pass an Offset Alias to determine the Offset basis?

Something like: pd.datetime.now() + pd.DateOffset(n=some_int, freq='W') for some_int weeks as an example?

If that isn't possible, where should I dig in to get something to that effect?

(The reason is because there is a function I'm utilizing the Offset Alias and I don't want to have to create a long if...else statement to convert the freq string to different instances of pd.DateOffset(weeks=n | years=n | months=n | etc.) I rather the freq string dictate the DateOffset in one line of code and still be dynamic to handle the different frequencies of time.)

Edited: to add a custom function that takes care of what I need, but it would be nice to have a solution in to_timedelta or DateOffset so that the solution is upstream and more efficient. For example, I'd love to utilize the Business versions of all the freq parameters I'm using so that the n can take in information more natural and raw to the source it came from.

def datedelta(date, n=0, freq='M'):
    from pandas import to_datetime, DateOffset
    
    if n == 0:
        date_sign = 1
    else:
        date_sign = np.abs(n)/n
        
    freq = freq.lower()
    
    if freq == 'y':
        dtOff = DateOffset(years=abs(n))
    elif freq == 'q':
        dtOff = DateOffset(quarters=abs(n))
    elif freq == 'm':
        dtOff = DateOffset(months=abs(n))
    elif freq == 'w':
        dtOff = DateOffset(weeks=abs(n))
    elif freq == 'd':
        dtOff = DateOffset(days=abs(n))
    else:
        raise ValueError("The freq parameter not one of the following: {'Y', 'Q', 'M', 'W', 'D'}")
        
    return to_datetime(date) + date_sign * dtOff

Solution

  • Maybe this function would be helpful?

    def offset(freq:str, n:int = 0):
        """
        Pandas DateOffset wrapper
        import pandas as pd
        """
        # =============================================================================
        # offsets_df = pd.read_html('https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html')[2]
        # offsets_df.drop(columns = ['Description'], inplace = True)
        # offsets_df.replace('None', np.NaN, inplace = True)
        # offsets_df.dropna(subset = ['Frequency String'], inplace = True)
        # offsets_df['Frequency String'] = offsets_df['Frequency String'].str.replace("\'", '')
        # for x in ['Date Offset', 'Frequency String']:
        #    offsets_df[x] =  offsets_df[x].str.split(' or ')
        # offsets_df['Date Offset'] = offsets_df['Date Offset'].map(lambda x: x[0])
        # offsets_df = explode_rows(offsets_df, 'Frequency String', fill_value = '')
        # offsets_df.drop_duplicates(subset = ['Frequency String'], inplace = True)
        # offsets_d = dict(zip(offsets_df['Frequency String'], offsets_df['Date Offset']))
        # =============================================================================
        offsets_d = {'B': 'BDay',
                     'C': 'CDay',
                     'W': 'Week',
                     'WOM': 'WeekOfMonth',
                     'LWOM': 'LastWeekOfMonth',
                     'M': 'MonthEnd',
                     'MS': 'MonthBegin',
                     'BM': 'BMonthEnd',
                     'BMS': 'BMonthBegin',
                     'CBM': 'CBMonthEnd',
                     'CBMS': 'CBMonthBegin',
                     'SM': 'SemiMonthEnd',
                     'SMS': 'SemiMonthBegin',
                     'Q': 'QuarterEnd',
                     'QS': 'QuarterBegin',
                     'BQ': 'BQuarterEnd',
                     'BQS': 'BQuarterBegin',
                     'REQ': 'FY5253Quarter',
                     'A': 'YearEnd',
                     'AS': 'YearBegin',
                     'BYS': 'YearBegin',
                     'BA': 'BYearEnd',
                     'BAS': 'BYearBegin',
                     'RE': 'FY5253',
                     'BH': 'BusinessHour',
                     'CBH': 'CustomBusinessHour',
                     'D': 'Day',
                     'H': 'Hour',
                     'T': 'Minute',
                     'min': 'Minute',
                     'S': 'Second',
                     'L': 'Milli',
                     'ms': 'Milli',
                     'U': 'Micro',
                     'us': 'Micro',
                     'N': 'Nano'}
        return eval(f'pd.offsets.{offsets_d[freq]}({n})')