python python-3.x pandas python-datetime

Convert a datetime column to zero-based months (0 is Jan and 11 is Dec)


I have a datetime column that I want to 'stringify' using strftime. The problem is that I want the months to be zero-based, i.e. 0 = January and 11 = December.

What I've tried: after stringifying the column, I call str.replace on it with a regex and a callable that converts the matched month to a number, subtracts one from it, and converts it back to a string.

Why do I want it to be zero-based? Because this data is going to be consumed by Google Charts, which requires the month in a date represented as a string to be zero-based.

Here is the code. Is there a better solution?

month_regex = r",(0[1-9]|1[0-2])"
#                vvv  ->  month_regex
format = "Date(%Y,%m,%d,%H,%M,%S)"

print(df["start"].dtype)  # float64, an epoch timestamp in milliseconds

# convert epoch to datetime and then to string with the given format
df["start"] = pd.to_datetime(df["start"]//1000, unit="s").dt.strftime(format)

print(df["start"])  # Date(2022,05,24,00,00,00)

df["start"] = df["start"].str.replace(
    month_regex,
    lambda match: "," + str(int(match[0][1:]) - 1),
    1,  # first occurrence only
    regex=True)

print(df["start"])  # Date(2022,4,24,00,00,00)


Solution

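  • Use string formatting to achieve the same result: let strftime fill in every field except the month, then insert the zero-based month (x.month - 1) with the % operator.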

    import pandas as pd

    df = pd.to_datetime(pd.Series(["2022-01-01"]))

    # strftime fills in every field except the month: "%%d" comes out as a literal "%d".
    # The % operator then substitutes the zero-based month (x.month - 1) for it.
    df.apply(lambda x: x.strftime("Date(%Y,%%d,%d,%H,%M,%S)") % (x.month - 1))
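
If the row-wise apply turns out to be slow on a large column, the same string can probably be built from whole-column operations instead. The following is only a sketch of that idea, not the answer's method: the helper name to_gcharts_date is made up for illustration, and the sample epoch value is an assumption chosen to match the date printed in the question.

```python
import pandas as pd


def to_gcharts_date(timestamps: pd.Series) -> pd.Series:
    """Format a datetime Series as Date(Y,M,D,h,m,s) strings with a zero-based month."""
    # Zero-based month as text: January -> "0", December -> "11".
    month0 = (timestamps.dt.month - 1).astype(str)
    # strftime renders the fields on either side of the month.
    head = timestamps.dt.strftime("Date(%Y,")
    tail = timestamps.dt.strftime(",%d,%H,%M,%S)")
    return head + month0 + tail


# Hypothetical sample: epoch in milliseconds stored as float64, as in the question.
df = pd.DataFrame({"start": [1653350400000.0]})
start = pd.to_datetime(df["start"] // 1000, unit="s")
df["start"] = to_gcharts_date(start)
print(df["start"].iloc[0])  # Date(2022,4,24,00,00,00)
```

Splitting the format string around the month keeps every other field under strftime's control, so no regex pass over the finished string is needed. Missing timestamps (NaT) are not handled here and would need their own treatment.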