Tags: python, pandas, class, object, python-dataclasses

Create a class that loads and preps data once so that all objects can then use it


Forgive my terminology; I am still learning Python and OOP.

I have two .py files:

data.py

main.py

Within main.py, I want to create n objects using something like:

USD_ON_SOFR = rate_interp('USD', 'ON', 'SOFR', '2023-01-26')

The above is for USD, but there could be n other currencies and interest rate curves, potentially 15-20 individual objects.

The objective is to return an interpolated value using (for example):

ans = USD_ON_SOFR.interp(45)

The idea is that data.py is in charge of creating the objects. I want it to pull the data ONCE from a SQL database, e.g.:

prices_df = pd.read_sql('select * from rates_prices', db_conn)

I then want each object to use that initial data [prices_df] without re-querying the SQL database. I know how to have each object slice the DataFrame as necessary.

My question is: how do I set up the class so that it pulls the data once, when the first object is created, and makes it available to every object created afterwards?

Here is some of the code I tried. I had thought to use __new__:

In data.py:

class rate_interp():
        def __new__(cls, *args, **kwargs):
            db_conn = db.connect(r'C:\Users\user\DataGripProjects\default\identifier.sqlite')
            prices = pd.read_sql('select * from rates_prices', db_conn)

        def __init__(self,cls,ccy,type1,type2,pricing_date):
            self.curve_ID = ccy + '_' + type1 + '_' + type2
            self.pricing_date = pricing_date
            self.df_cut = cls.new_query.loc[cls.new_query['curve_family']==type2+type1+ccy+pricing_date]
            self.interp = interpolate.interp1d(self.df_cut['days'],self.df_cut['LAST_PRICE'])

In main.py:

USD_ON_SOFR = rate_interp('USD', 'ON', 'SOFR', '2023-01-26')

ans = USD_ON_SOFR.interp(45).item() #45 is just an example

The error:

ans = USD_ON_SOFR.interp(45).item()
AttributeError: 'NoneType' object has no attribute 'interp'

Solution

  • You could make a cached function in data.py that performs the db call:

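    import functools
    import sqlite3 as db  # assumption: the question's `db` is sqlite3, as the .sqlite path suggests
    import pandas as pd

    @functools.cache  # available from Python 3.9
    def get_data_from_db():
        # Runs the query on the first call only; later calls return the cached DataFrame
        db_conn = db.connect(r'C:\Users\user\DataGripProjects\default\identifier.sqlite')
        return pd.read_sql('select * from rates_prices', db_conn)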
    

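    This way, the db call is made only the first time you call the function; subsequent calls return the cached DataFrame without touching the database. Every caller receives the same DataFrame object, so slice it rather than modifying it in place. Then you can call the function in your constructor: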

    class rate_interp:
        def __init__(self, *args, **kwargs):
            price_data = get_data_from_db()  # cached: the query runs only once
            ...
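To see the load-once behaviour end to end, here is a minimal sketch of the pattern above. It is an illustration under stated assumptions, not the asker's actual setup: to stay runnable without a real database or third-party packages, it builds an in-memory SQLite table with made-up prices and uses a small hand-written linear interpolation in place of pd.read_sql and interpolate.interp1d. The names RateInterp, get_price_rows and LOAD_COUNT are hypothetical; only the column names and the curve_family key layout follow the question.

```python
import functools
import sqlite3
from bisect import bisect_left

LOAD_COUNT = 0  # counts how often the database is actually queried


@functools.cache
def get_price_rows():
    """Query the database on the first call; later calls reuse the cached rows."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    conn = sqlite3.connect(":memory:")  # stand-in for the real database file
    conn.execute(
        "CREATE TABLE rates_prices (curve_family TEXT, days INTEGER, LAST_PRICE REAL)"
    )
    conn.executemany(
        "INSERT INTO rates_prices VALUES (?, ?, ?)",
        [  # made-up numbers, for illustration only
            ("SOFRONUSD2023-01-26", 30, 4.0),
            ("SOFRONUSD2023-01-26", 60, 5.0),
            ("ESTRONEUR2023-01-26", 30, 2.0),
            ("ESTRONEUR2023-01-26", 60, 3.0),
        ],
    )
    rows = conn.execute(
        "SELECT curve_family, days, LAST_PRICE FROM rates_prices ORDER BY days"
    ).fetchall()
    conn.close()
    return tuple(rows)  # immutable, so no object can alter the shared data


class RateInterp:
    def __init__(self, ccy, type1, type2, pricing_date):
        self.curve_id = f"{ccy}_{type1}_{type2}"
        family = type2 + type1 + ccy + pricing_date  # same key layout as the question
        # Each object slices its own curve out of the shared, cached data
        points = [(d, p) for fam, d, p in get_price_rows() if fam == family]
        if not points:
            raise ValueError(f"no data for curve family {family!r}")
        self.days = [d for d, _ in points]
        self.prices = [p for _, p in points]

    def interp(self, day):
        """Linearly interpolate the price at `day` (no extrapolation)."""
        if not self.days[0] <= day <= self.days[-1]:
            raise ValueError("day is outside the curve's range")
        i = bisect_left(self.days, day)
        if self.days[i] == day:
            return self.prices[i]
        x0, x1 = self.days[i - 1], self.days[i]
        y0, y1 = self.prices[i - 1], self.prices[i]
        return y0 + (y1 - y0) * (day - x0) / (x1 - x0)


usd = RateInterp("USD", "ON", "SOFR", "2023-01-26")
eur = RateInterp("EUR", "ON", "ESTR", "2023-01-26")
print(usd.interp(45), eur.interp(45), LOAD_COUNT)  # 4.5 2.5 1
```

Two objects are created, but LOAD_COUNT stays at 1 because the second constructor call gets the cached rows. As for the original AttributeError: the __new__ in the question does not return an instance, so rate_interp(...) evaluates to None and __init__ never runs. With the cached function, __new__ is not needed at all.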