
How to apply asynchronous calls to API with Pandas apply() function


I have a ~14,000-row dataframe and am trying to fill a new column with data retrieved from an API. The code below returns the expected response; however, each iteration seems to wait for its response before moving on to the next row.

Here is the function:

def market_sector_des(isin):
    isin = '/isin/' + isin
    return blp.bdp(tickers = isin, flds = ['market_sector_des']).iloc[0]

I am using xbbg to call the Bloomberg API.

The .apply() function returns the expected response,

df['new_column'] = df['ISIN'].apply(market_sector_des)

but each response takes around 2 seconds, which at 14,000 rows is roughly 8 hours.

Is there any way to make this apply function asynchronous so that all requests are sent in parallel? I have seen dask suggested as an alternative, but I am running into issues with that as well.


Solution

  • If the above is exactly what you want to do, you can achieve it by creating a column that contains the ticker syntax to be sent, and then passing the unique values of that column to blp.bdp in one call:

    df['ISIN_NEW'] = '/isin/' + df['ISIN']
    isin_new = pd.unique(df['ISIN_NEW'].dropna())
    mktsec_df = blp.bdp(tickers = isin_new, flds = ['market_sector_des'])
    

    You can then join the newly created df to your existing df with a left merge, so that every original row is kept and the new field appears as an extra column.

    newdf = pd.merge(df, mktsec_df, how='left', left_on = 'ISIN_NEW', right_index = True )
    

    This should result in a single call, which would ideally cut the run time to less than a minute. Do let me know if this works out.
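
If one request with every unique ticker turns out to be too large (I have not verified any limit, so treat that as an assumption), the same idea can be applied in chunks. Below is a minimal sketch in plain Python: `unique_tickers`, `chunked`, `lookup_in_chunks` and `fake_lookup` are hypothetical names introduced here for illustration, and `fake_lookup` with its made-up data stands in for the real API call so that the example runs without a Bloomberg session.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional


def unique_tickers(values: Iterable[Optional[str]]) -> List[str]:
    """Return the distinct values in first-seen order, skipping None and NaN."""
    seen = set()
    ordered = []
    for value in values:
        # value != value is True only for NaN, which pandas uses for missing data.
        if value is None or value != value or value in seen:
            continue
        seen.add(value)
        ordered.append(value)
    return ordered


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lookup_in_chunks(
    values: Iterable[Optional[str]],
    lookup: Callable[[List[str]], Dict[str, str]],
    chunk_size: int = 500,
) -> Dict[str, str]:
    """Call `lookup` once per chunk of unique tickers and combine the results."""
    results: Dict[str, str] = {}
    for chunk in chunked(unique_tickers(values), chunk_size):
        results.update(lookup(chunk))
    return results


# Made-up sample data and a stand-in lookup (NOT the real API).
FAKE_SECTORS = {"/isin/AAA": "Equity", "/isin/BBB": "Govt"}
calls: List[List[str]] = []


def fake_lookup(chunk: List[str]) -> Dict[str, str]:
    calls.append(list(chunk))  # record each request so the batching is visible
    return {ticker: FAKE_SECTORS[ticker] for ticker in chunk}


tickers = ["/isin/AAA", "/isin/BBB", "/isin/AAA", None]
sectors = lookup_in_chunks(tickers, fake_lookup, chunk_size=1)
print(sectors)     # one entry per unique ticker
print(len(calls))  # number of requests actually sent
```

With the real API, `lookup` would presumably wrap blp.bdp and convert the returned frame into a ticker-to-value dict. The result could then be written back with `df['ISIN_NEW'].map(sectors)` instead of a merge. The number of requests is the number of unique tickers divided by `chunk_size`, rounded up, rather than one per row.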