
Why is str.split not working as expected?


I have a DataFrame column like this:

Date
2025-01-21 00:00:00
2021-12-05 00:00:00
12-MAY-2020/18-SEP-2020
15-JUN-2021/20-JUL-2021
2020-12-05 00:00:00

I am using the following code to extract the first date from the values that contain two dates separated by "/":

df["Date2"] = df["Date"].str.split('/', expand=True)[0]

I am expecting this output:

Date2
2025-01-21 00:00:00
2021-12-05 00:00:00
12-MAY-2020
15-JUN-2021
2020-12-05 00:00:00

But the actual output is:

Date2
nan
nan
12-MAY-2020
15-JUN-2021
nan

Why is this happening?


Solution

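  • This is most likely caused by mixed data types in the 'Date' column. The .str accessor returns NaN for any element that is not a string (for example, datetime objects), so only the rows that actually hold strings get split.

    Use astype(str) to convert every value to a string first: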

    df["Date2"] = df["Date"].astype(str).str.split('/', expand=True)[0]
    #OR
    df["Date2"] = df["Date"].astype(str).str.split('/').str[0]
    

    Output of df:

        Date                        Date2
    0   2025-01-21 00:00:00         2025-01-21 00:00:00
    1   2021-12-05 00:00:00         2021-12-05 00:00:00
    2   12-MAY-2020/18-SEP-2020     12-MAY-2020
    3   15-JUN-2021/20-JUL-2021     15-JUN-2021
    4   2020-12-05 00:00:00         2020-12-05 00:00:00
    

    Note: to verify this, check the output of print(df['Date'].map(type).value_counts()), which counts how many values of each Python type the column holds.
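
For reference, here is a minimal sketch that reproduces the behaviour end to end. The sample data is an assumption: the question does not show how the column was built, so this version mixes datetime.datetime objects and plain strings in an object column to mimic it.

```python
import datetime as dt

import pandas as pd

# Assumed sample data: an object column that mixes datetime objects and strings.
dates = pd.Series(
    [
        dt.datetime(2025, 1, 21),
        dt.datetime(2021, 12, 5),
        "12-MAY-2020/18-SEP-2020",
        "15-JUN-2021/20-JUL-2021",
        dt.datetime(2020, 12, 5),
    ],
    dtype=object,
)
df = pd.DataFrame({"Date": dates})

# Count the Python types stored in the column (two distinct types here).
type_counts = df["Date"].map(type).value_counts()
print(type_counts)

# Without the cast, .str.split leaves NaN wherever the element is not a string.
broken = df["Date"].str.split("/", expand=True)[0]
print(broken)

# Casting to str first makes every element splittable.
fixed = df["Date"].astype(str).str.split("/").str[0]
print(fixed)
```

If the type count shows more than one type, the NaN rows in the original result should line up with the non-string entries.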