python pandas valueerror

How should I convert numbers in a column to the same format style in Python?


I have two columns, Lat and Long, in a DataFrame. When I try to convert these strings to float, I get the following error:

ValueError: Unable to parse string "𝟒" at position 61754

I've noticed that the numbers in my DataFrame are written in different styles, some even in bold text. Is there a way to convert them all to the same style?


Solution

  • Suppose you have this DataFrame:

      funny_numbers
    0             𝟒
    1             𝟝
    2            𝟳𝟨
    

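    You can use .str.normalize with a compatibility form such as "NFKD" to convert the stylized Unicode digits to their plain ASCII equivalents, then parse the result with pd.to_numeric: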

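    import pandas as pd
    df = pd.DataFrame({"funny_numbers": ["𝟒", "𝟝", "𝟳𝟨"]})
    # Fold stylized digits (bold, double-struck, sans-serif) to plain ASCII digits
    df["funny_numbers"] = df["funny_numbers"].str.normalize("NFKD")
    df["funny_numbers"] = pd.to_numeric(df["funny_numbers"])
    print(df)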
    

    Prints:

      funny_numbers
    0             4
    1             5
    2            76
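
To see what the normalization step does to each string on its own, here is a minimal sketch using only the standard library's unicodedata module. The helper name fold_digits and the sample list (including the fullwidth "４.５" entry, which is not from the question) are assumptions made for illustration.

```python
import unicodedata


def fold_digits(text: str) -> str:
    """Hypothetical helper: apply NFKD compatibility normalization,
    which maps stylized digits to their plain ASCII counterparts."""
    return unicodedata.normalize("NFKD", text)


# Sample strings: the three from the answer plus an assumed fullwidth value
samples = ["𝟒", "𝟝", "𝟳𝟨", "４.５"]

# Normalize each string, then parse the cleaned text as a float
cleaned = [fold_digits(s) for s in samples]
values = [float(s) for s in cleaned]

print(cleaned)  # ['4', '5', '76', '4.5']
print(values)   # [4.0, 5.0, 76.0, 4.5]
```

If some rows in the real Lat and Long columns still fail to parse after normalization, passing errors="coerce" to pd.to_numeric turns them into NaN so they can be located and inspected instead of raising ValueError.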