Search code examples
pythonpandasdataframeindexingmulti-index

Reindex a dataframe using tickers from another dataframe


I read csv files into dataframe using

from glob import glob
import pandas as pd

def read_file(f):
    df = pd.read_csv(f)
    df['ticker'] = f.split('.')[0]
    return df


df = pd.concat([read_file(f) for f in glob('*.csv')])
df = df.set_index(['Date','ticker'])[['Close']].unstack()

And got the following dataframe:

                 Close                                                    
ticker            AAPL        AMD        BIDU        GOOGL          IXIC   
Date                                                                       
2011-06-01   12.339643   8.370000  132.470001   263.063049   2769.189941
.
.
.   

now I would like to use the 'ticker' to reindex another random dataframe created by

data = np.random.random((df.shape[1], 100))
df1 = pd.DataFrame(data)

which looks like:

          0         1         2         3         4         5         6   \...
0   0.493036  0.114539  0.862388  0.156381  0.030477  0.094902  0.132268   
1   0.486184  0.483585  0.090874  0.751288  0.042761  0.150361  0.781567   
2   0.318586  0.078662  0.238091  0.963334  0.815566  0.274273  0.320380   
3   0.708489  0.354177  0.285239  0.565553  0.212956  0.275228  0.597578   
4   0.150210  0.423037  0.785664  0.956781  0.894701  0.707344  0.883821   
5   0.005920  0.115123  0.334728  0.874415  0.537229  0.557406  0.338663   
6   0.066458  0.189493  0.887536  0.915425  0.513706  0.628737  0.132074   
7   0.729326  0.241142  0.574517  0.784602  0.287874  0.402234  0.926567   
8   0.284867  0.996575  0.002095  0.325658  0.525330  0.493434  0.701801   
9   0.355176  0.365045  0.270155  0.681947  0.153718  0.644909  0.952764   
10  0.352828  0.557434  0.919820  0.952302  0.941161  0.246068  0.538714   
11  0.465394  0.101752  0.746205  0.897994  0.528437  0.001023  0.979411   

I tried

df1 = df1.set_index(df.columns.values)

but it seems my df only has one level of index since the error says

IndexError: Too many levels: Index has only 1 level, not 2

But if I check the index by df.index it gives me the Date, can someone help me solve this problem?


Solution

  • You can get the column labels of a particular level of the MultiIndex in df by MultiIndex.get_level_values, as follows:

    df_ticker = df.columns.get_level_values('ticker')
    

    Then, if df1 has the same number of columns, you can copy the labels extracted to df1 by:

    df1.columns = df_ticker