
python pandas modify column


I have what is probably a syntax question. I am doing some data scrubbing on numeric fields that need to be tidied up.

Like this:

20,956
1,006,486
0
0
85,186
77,573

Sometimes there is one comma, sometimes several, and sometimes none.

So my simple mind says, find and replace will fix this.

This works:

def fixComma(strIn):
    return strIn.replace(",", "")


dfAssetMeter_a['lastReadingflt'] = dfAssetMeter_a['LASTREADING'].apply(fixComma).astype(float)

Questions: Is this the “Pythonic” way to do this?

Would somebody be kind enough to provide the correct syntax for removing the commas inline?

I thought something like this would work.

     =  dfAssetMeter_a['LASTREADING']. .replace(","  , ""   ) .astype(float)

Solution

  • You can use the str accessor along with replace, and pandas.to_numeric:

    import pandas as pd
    
    df = pd.DataFrame(
        {
            "LASTREADING": [
                "20,956",
                "1,006,486",
                "0",
                "0",
                "85,186",
                "77,573",
            ]
        }
    )
    
    df["lastReadingflt"] = pd.to_numeric(df["LASTREADING"].str.replace(",", ""))
    print(df)
    
      LASTREADING  lastReadingflt
    0      20,956           20956
    1   1,006,486         1006486
    2           0               0
    3           0               0
    4      85,186           85186
    5      77,573           77573
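
  • As for the inline attempt in the question: Series.replace matches whole cell values by default, so it leaves "20,956" untouched, whereas .str.replace works on substrings. If you want a float column as in your original code, something like the sketch below should work. The shortened sample data and the "n/a" value are made up for illustration, and regex=False just makes it explicit that the comma is a literal string:

```python
import pandas as pd

# Shortened, illustrative sample of the column from the question.
df = pd.DataFrame({"LASTREADING": ["20,956", "1,006,486", "0", "85,186"]})

# Strip the commas and cast to float in one chained expression.
df["lastReadingflt"] = (
    df["LASTREADING"].str.replace(",", "", regex=False).astype(float)
)

# If some cells might not be numeric at all (hypothetical value "n/a"),
# errors="coerce" turns them into NaN instead of raising an error.
dirty = pd.Series(["1,234", "n/a"])
cleaned = pd.to_numeric(dirty.str.replace(",", "", regex=False), errors="coerce")
```

    Note that astype(float) raises on any value it cannot parse, so the to_numeric variant is probably the safer choice if the data is not fully clean.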