Search code examples
pythonpandasquantitative-finance

start dates across multiple timeframe to collecting market data


I'm writing a trading algorithm that incorporates multiple timeframe analysis by comparing yearly OHLC to quarterly OHLC to monthly OHLC to weekly OHLC to daily OHLC.

For each of these different timeframe, I need the start date for the previous 4 periods of OHLC data, which includes the current period.

Examples, relative to 2022-01-09, the date of this post:

Yearly timeframe 4 year start date: 2019-01-02

Quarterly timeframe 4 quarter start date: 2021-03-01

Monthly timeframe 4 month start date: 2021-10-01

Weekly timeframe 4 week start date: 2021-12-13

Daily timeframe 4 day start date: 2022-01-04

Is there a finance markets-centric module for python that allows easy referencing to these kinds of date stamps that I can ultimately output as strings in the format of the aforementioned examples to insert into my polygon REST calls?

I have looked at various datetime and dateutil classes and methods, but they're all rather complex, notwithstanding they don't appear to account for nuances like days that the North American equities markets are open.

(It'd be super cool if something existed that could understand English language expression; something like: 'first market day of the year four years ago', 'first market day of the quarter, four quarters ago', 'first market day of the month, four months ago', 'first market day of the week, four weeks ago', 'day of the week, four market days ago').

Solution

pandas has a very comprehensive list of features that provides what I was looking for. Here's what I put together based on those findings:

# timefunctions.py
import pandas
import time
import pandas_market_calendars as mcal

now = pandas.Timestamp.now()
tz = 'America/New York'
nyse = mcal.get_calendar('NYSE')

def start_of_year(multiplier, format='ms'):
    obj = now + pandas.tseries.offsets.BYearBegin(multiplier)
    obj = pandas.Timestamp(obj).floor(freq='D')
    if format == 'ms':
        obj = int(time.mktime(obj.timetuple()) * 1000)
    return(obj)

def start_of_quarter(multiplier, format='ms'):
    obj = now + pandas.tseries.offsets.BQuarterBegin(multiplier, startingMonth=1)
    obj = pandas.Timestamp(obj).floor(freq='D')
    if format == 'ms':
        obj = int(time.mktime(obj.timetuple()) * 1000)
    return(obj)

def start_of_month(multiplier, format='ms'):
    obj = now + pandas.tseries.offsets.BusinessMonthBegin(multiplier)
    obj = pandas.Timestamp(obj).floor(freq='D')
    if format == 'ms':
        obj = int(time.mktime(obj.timetuple()) * 1000)
    return(obj)

def start_of_week(multiplier, format='ms'):
    obj = now + pandas.tseries.offsets.Week(multiplier, weekday=0)
    obj = pandas.Timestamp(obj).floor(freq='D')
    if format == 'ms':
        obj = int(time.mktime(obj.timetuple()) * 1000)
    return(obj)

def start_of_day(multiplier, format='ms', action='signal'):
    """
    : action = 'signal' [default] (return timestamps of all trading days with market open times)
    : action = 'trigger' (return timestamps of today only) (ignores multiplier. hard coded to -1)
    : format = 'ms' [default] (return object in millisecond timestamps)
    """
    market_days = nyse.schedule(start_date=start_of_week(-1, None), end_date=now)
    if action == 'trigger':
        obj = market_days.index.to_list()[-1:][0]
        obj = obj + pandas.DateOffset(hour=nyse.open_time.hour, minute=nyse.open_time.minute, second=nyse.open_time.second)
        if format == 'ms':
            obj = int(time.mktime(obj.timetuple()) * 1000)
    elif action == 'signal':
        date_list = []
        date_list_ms = []
        for date_item in market_days.index.to_list()[multiplier:]:
            date_list.append(date_item + pandas.DateOffset(hour=nyse.open_time.hour, minute=nyse.open_time.minute, second=nyse.open_time.second))
        if format == 'ms':
            for date in date_list:
                date_list_ms.append(int(time.mktime(date.timetuple()) * 1000))
        if date_list:
            obj = date_list
        if date_list_ms:
            obj = date_list_ms
    return(obj)

output:

In [1]: from modules import timefunctions

In [2]: timefunctions.start_of_year(-4, None)
Out[2]: Timestamp('2019-01-01 00:00:00')

In [3]: timefunctions.start_of_year(-4)
Out[3]: 1546318800000

In [4]: timefunctions.start_of_quarter(-4, None)
Out[4]: Timestamp('2021-04-01 00:00:00')

In [5]: timefunctions.start_of_quarter(-4)
Out[5]: 1617249600000

In [6]: timefunctions.start_of_month(-4, None)
Out[6]: Timestamp('2021-10-01 00:00:00')

In [7]: timefunctions.start_of_month(-4)
Out[7]: 1633060800000

In [8]: timefunctions.start_of_week(-4, None)
Out[8]: Timestamp('2021-12-27 00:00:00')

In [9]: timefunctions.start_of_week(-4)
Out[9]: 1640581200000

In [10]: timefunctions.start_of_day(-4, None)
Out[10]: 
[Timestamp('2022-01-18 09:30:00'),
 Timestamp('2022-01-19 09:30:00'),
 Timestamp('2022-01-20 09:30:00'),
 Timestamp('2022-01-21 09:30:00')]

In [11]: timefunctions.start_of_day(-4)
Out[11]: [1642516200000, 1642602600000, 1642689000000, 1642775400000]

In [12]: 

Solution

  • Updated original question with solution using pandas