python pandas time-series data-cleaning

Convert 2010 Q1 to a datetime such as 2010-3-31



Is there a smart way to convert the Year_Q column to datetime? I tried

pd.to_datetime(working_visa_nationality['Year_Q'])

but got an error saying that the format cannot be recognized. So I tried a clumsy workaround:

working_visa_nationality['Year'] = working_visa_nationality.Year_Q.str.slice(0,4)
working_visa_nationality['Quarter'] = working_visa_nationality.Year_Q.str.slice(6,8)


This exposed a problem: I can group the data by year, but it is difficult to include the quarter in my line plot.

So how can I convert 2010 Q1 into a date like 2010-3-31?


Solution

  • I slightly modified MaxU's answer:

    df = pd.DataFrame({'Year_Q': ['2010 Q1', '2015 Q2']})
    
    df['Dates']  = pd.PeriodIndex(df['Year_Q'].str.replace(' ', ''), freq='Q').to_timestamp()
    print (df)
        Year_Q      Dates
    0  2010 Q1 2010-01-01
    1  2015 Q2 2015-04-01
    

    EDIT: to get the last day of each quarter instead of the first, pass how='e' to to_timestamp:

    df['Dates']  = pd.PeriodIndex(df['Year_Q'].str.replace(' ', ''), freq='Q').to_timestamp(how='e')
    print (df)
        Year_Q      Dates
    0  2010 Q1 2010-03-31
    1  2015 Q2 2015-06-30
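
A note for newer pandas versions: there, as far as I know, to_timestamp(how='e') returns the last instant of the quarter (for example 2010-03-31 23:59:59.999999999) rather than midnight, so the output may differ from the one shown above. A minimal sketch, assuming that behavior, which appends normalize() to reset the time part to midnight. The sample frame is the same one used above, the variable name quarters is only for illustration, and regex=False is added so the space is treated as a literal character:

```python
import pandas as pd

df = pd.DataFrame({'Year_Q': ['2010 Q1', '2015 Q2']})

# Strip the space and parse '2010Q1'-style strings as quarterly periods
quarters = pd.PeriodIndex(df['Year_Q'].str.replace(' ', '', regex=False), freq='Q')

# how='end' takes the end of each quarter; normalize() sets the time to 00:00:00
df['Dates'] = quarters.to_timestamp(how='end').normalize()
print(df)
```

With real dates in the Dates column, df.set_index('Dates') should give a time axis that carries both year and quarter, which may help with the line plot from the question.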