python pandas dataframe conditional-operator

python pandas dataframe select rows with ternary operator


I have what is probably a syntax question. I am doing some data scrubbing, and my data contains date-time fields. The date is always present, but the time is sometimes missing, like this:

3/1/2023 11:03 AM
4/8/2022 10:21 AM
7/19/2012
7/12/2021
4/16/2024 7:02 AM

So, if the colon (:) is missing, I want to append a midnight time (12:00 AM).

I expected this to work:

dfAssetMeter_a['lastReadingDTFixed'] = dfAssetMeter_a['LASTREADINGDATE'].astype(str) if dfAssetMeter_a['LASTREADINGDATE'].astype(str).find(':') > 0 else dfAssetMeter_a['LASTREADINGDATE'].astype(str) + ' 00:00 AM'  

but it didn't work.

This works, however:

def fixDateTime(dateStr_in):
    return dateStr_in if dateStr_in.find(':') > 0 else dateStr_in + ' 12:00 AM'

dfAssetMeter_a['lastReadingDTFixed'] = dfAssetMeter_a['LASTREADINGDATE'].apply(fixDateTime)

Questions:

  • What would be the more Pythonic way to do this?
  • Could someone provide the correct syntax for the first method?

Solution

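  • You can use pandas.to_datetime with format='mixed' (available in pandas 2.0 and later). It infers the format of each element individually, so rows without a time are parsed as midnight. Note that the result is a datetime column rather than a string column.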

    import pandas as pd
    
    df = pd.DataFrame(
        {
            "LASTREADINGDATE": [
                "3/1/2023 11:03 AM",
                "4/8/2022 10:21 AM",
                "7/19/2012",
                "7/12/2021",
                "4/16/2024 7:02 AM",
            ]
        }
    )
    
    df["lastReadingDTFixed"] = pd.to_datetime(df["LASTREADINGDATE"], format="mixed")
    print(df)
    
    Output:
    
         LASTREADINGDATE  lastReadingDTFixed
    0  3/1/2023 11:03 AM 2023-03-01 11:03:00
    1  4/8/2022 10:21 AM 2022-04-08 10:21:00
    2          7/19/2012 2012-07-19 00:00:00
    3          7/12/2021 2021-07-12 00:00:00
    4  4/16/2024 7:02 AM 2024-04-16 07:02:00
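
  • As for the first method: a conditional expression (x if cond else y) needs a single True/False, while a Series holds one value per row, and a Series has no .find method (string methods live under .str). A minimal sketch of an element-wise equivalent, assuming the same sample data as above, builds a boolean mask with Series.str.contains and picks a value per row with Series.where. The column name lastReadingParsed and the explicit format string are assumptions added for illustration; they are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "LASTREADINGDATE": [
            "3/1/2023 11:03 AM",
            "4/8/2022 10:21 AM",
            "7/19/2012",
            "7/12/2021",
            "4/16/2024 7:02 AM",
        ]
    }
)

dates = df["LASTREADINGDATE"].astype(str)

# Element-wise test: True where the string already contains a time part.
has_time = dates.str.contains(":", regex=False)

# Element-wise "ternary": keep the value where has_time is True,
# otherwise use the same value with a midnight time appended.
df["lastReadingDTFixed"] = dates.where(has_time, dates + " 12:00 AM")

# Optional (assumed follow-up step): the strings now share one layout,
# so they can be parsed with a single explicit format.
df["lastReadingParsed"] = pd.to_datetime(
    df["lastReadingDTFixed"], format="%m/%d/%Y %I:%M %p"
)
```

    Unlike format='mixed', this keeps lastReadingDTFixed as strings, which matches what the apply-based function in the question returns; the final to_datetime call is only needed if you want real datetimes.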