python, dataframe, merge

Horizontally merge dataframes while making the column names unique


I use a for loop to fetch dataframes: I provide a list of stocks and it fetches a dataframe of data for each stock. I store each dataframe in a dictionary:

d = {}
fullset = pd.DataFrame()
for name in stocklist:
    d[name] = ...  # fetch the dataframe for this stock (fetch code not shown)

Next, I am able to merge the data set horizontally using the following:


fullset = pd.concat([fullset, d[name]], axis=1)

The issue I have is that the data sets all have the same column headers. Is there something I can do in this merge so that the "name" variable is added to each header for its particular dataframe? For example: SPX-High, SPX-Low, SPX-Close, ..., Dow-High, Dow-Low, Dow-Close.

*Note: I know there is a hierarchical option (basically adding a second level to the column index), but I was not able to get it working. Even if I had, my preference is to add the stock name to each column header, because I am running ML models on top of the data and I don't know how they will handle hierarchical headers.

Among other things, I tried passing a keys parameter as keys=d[name], but that did not work. I also tried passing just the variable "name", though I didn't expect that to work unless concat runs some sort of for loop behind the scenes.


Solution

  • We can add a prefix to the column names using the pandas.DataFrame.add_prefix method.

    fullset = pd.DataFrame({})
    for name in stocklist:
        fullset = pd.concat([
            fullset, 
            d[name].add_prefix(f"{name}-")
        ], axis=1)
    
    print(fullset)
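
  • As a variation on the loop above, the prefixed frames can be collected in a list and concatenated in a single call, which avoids rebuilding fullset on every iteration. The sketch below is self-contained, so the contents of d and stocklist are made-up sample data (an assumption, not the asker's real data). It also shows the keys route mentioned in the question: keys expects one label per dataframe, so the whole list of names is passed, and the resulting two-level header is then flattened to the same SPX-High style.

```python
import pandas as pd

# Hypothetical sample data standing in for the fetched stock dataframes.
dates = pd.date_range("2024-01-01", periods=3, freq="D")
d = {
    "SPX": pd.DataFrame(
        {"High": [3.0, 3.1, 3.2], "Low": [2.0, 2.1, 2.2], "Close": [2.5, 2.6, 2.7]},
        index=dates,
    ),
    "Dow": pd.DataFrame(
        {"High": [9.0, 9.1, 9.2], "Low": [8.0, 8.1, 8.2], "Close": [8.5, 8.6, 8.7]},
        index=dates,
    ),
}
stocklist = list(d)

# Option 1: prefix each frame's columns, then concatenate once.
fullset = pd.concat(
    [d[name].add_prefix(f"{name}-") for name in stocklist],
    axis=1,
)

# Option 2: pass one key per frame, which builds a two-level column header,
# then flatten that header into single "name-column" strings.
wide = pd.concat([d[name] for name in stocklist], axis=1, keys=stocklist)
wide.columns = [f"{name}-{col}" for name, col in wide.columns]

print(fullset.columns.tolist())
print(wide.columns.tolist())
```

    Both options should give the same flat column names, so the choice is mostly a matter of taste. add_prefix is the more direct of the two.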