python pandas dataframe multi-index

How can I return only the first product when using pd.MultiIndex.from_product()? Or is there a better option?


I am using the code below to make a new index for a dataframe.

pd.DataFrame(pd.MultiIndex.from_product([df['Key'],pd.date_range(start='20160101', end='20160301',freq='MS')],names=['key','year_month']))

Here is the current output:

0 (A, 2016-01-01 00:00:00)
1 (A, 2016-02-01 00:00:00)
2 (A, 2016-03-01 00:00:00)
3 (A, 2016-01-01 00:00:00)
4 (A, 2016-02-01 00:00:00)
5 (A, 2016-03-01 00:00:00)
6 (A, 2016-01-01 00:00:00)
7 (A, 2016-02-01 00:00:00)
8 (A, 2016-03-01 00:00:00)
9 (B, 2016-01-01 00:00:00)
10 (B, 2016-02-01 00:00:00)
11 (B, 2016-03-01 00:00:00)
12 (B, 2016-01-01 00:00:00)
13 (B, 2016-02-01 00:00:00)
14 (B, 2016-03-01 00:00:00)
15 (B, 2016-01-01 00:00:00)
16 (B, 2016-02-01 00:00:00)
17 (B, 2016-03-01 00:00:00)

How can I change this code so that I only return the first product? Is there a separate function or an option for from_product?

Desired output:

0 (A, 2016-01-01 00:00:00)
1 (A, 2016-02-01 00:00:00)
2 (A, 2016-03-01 00:00:00)
3 (B, 2016-01-01 00:00:00)
4 (B, 2016-02-01 00:00:00)
5 (B, 2016-03-01 00:00:00)


Solution

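  • Try calling unique() on the Key column. from_product takes the Cartesian product of its inputs exactly as given, so every repeated key in df['Key'] produces a repeated set of (key, month) pairs. Deduplicating the keys first leaves one set per key: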

    pd.DataFrame(pd.MultiIndex.from_product([df['Key'].unique(),pd.date_range(start='20160101', end='20160301',freq='MS')],names=['key','year_month']))
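A minimal sketch of the difference, assuming a hypothetical df whose Key column holds three A rows and three B rows (the question does not show the real data). It also uses to_frame(index=False) as one possible way to get key and year_month as separate columns rather than a single column of tuples; that is an optional variation, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data: the real df is not shown in the question.
df = pd.DataFrame({"Key": ["A", "A", "A", "B", "B", "B"]})

# Month-start dates from January to March 2016 (3 values).
months = pd.date_range(start="20160101", end="20160301", freq="MS")

# Raw column: duplicates are kept, so 6 keys x 3 months = 18 pairs.
with_dupes = pd.MultiIndex.from_product(
    [df["Key"], months], names=["key", "year_month"]
)

# Unique keys: 2 keys x 3 months = 6 pairs.
deduped = pd.MultiIndex.from_product(
    [df["Key"].unique(), months], names=["key", "year_month"]
)

# Turn the index levels into ordinary columns instead of a column of tuples.
result = deduped.to_frame(index=False)

print(len(with_dupes), len(deduped))  # 18 6
print(result)
```

Note that unique() keeps the keys in order of first appearance, so the result follows the order of the original column rather than sorted order.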