Search code examples
pythonpython-3.xpandasdateseq

Creating a sequence of dates for each group in Python 3


I have a dataset that consists of multiple stores/retailers:

RETAILER    VOLUME  DISP    PRICE
store1      12      15      10
store1      10      8       17
store1      12      13      12
...
store2      22      22      30
store2      17      14      22
store2      23      18      18
...
store3      11      13      10
store3      12      13      13
store3      14      12      11

Unfortunately this data set does not contain dates so I need to generate date sequences for each store within this table. Frequency isn't really that important - days or months will be ok.

To make it easier let's assume that it starts on 2000-01-01 and number of dates for each RETAILER depends on number of observations for this RETAILER.

I'm able to handle this in R:

df <- df %>% 
  arrange(RETAILER) %>% 
  group_by(RETAILER) %>%  
  mutate(dates = seq(as.Date("2000-01-01"), by = "month", length.out = n()))

However, I'm unable to replicate this in Python. I've been trying with pd.date_range(), but can't find the solution.

Can anyone please suggest some way to perform this task in Python?

The desired output:

RETAILER    VOLUME  DISP    PRICE   DATE
store1      12      15      10      2000-01-01
store1      10      8       17      2000-02-01
store1      12      13      12      2000-03-01
...
store2      22      22      30      2000-01-01
store2      17      14      22      2000-02-01
store2      23      18      18      2000-03-01
...
store3      11      13      10      2000-01-01
store3      12      13      13      2000-02-01
store3      14      12      11      2000-03-01

Solution

  • Let us group the dataframe on RETAILER and use cumcount to create sequential counter per RETAILER then map this counter to MonthBegin offset and add a Timestamp('2000-01-01'):

    c = df.groupby('RETAILER').cumcount()
    df['DATE'] = pd.Timestamp('2000-01-01') + c.map(pd.offsets.MonthBegin)
    

      RETAILER  VOLUME  DISP  PRICE       DATE
    0   store1      12    15     10 2000-01-01
    1   store1      10     8     17 2000-02-01
    2   store1      12    13     12 2000-03-01
    3   store2      22    22     30 2000-01-01
    4   store2      17    14     22 2000-02-01
    5   store2      23    18     18 2000-03-01
    6   store3      11    13     10 2000-01-01
    7   store3      12    13     13 2000-02-01
    8   store3      14    12     11 2000-03-01