python pandas string type-conversion typeerror

Convert Float to String in Pandas


I have a dataframe with the following dtypes.

> df.dtypes
    Col1         float64
    Col2          object
    dtype: object

When I do the following:

df['Col3']  = df['Col2'].apply(lambda s: len(s) >= 2  and s[0].isalpha())

I get:

TypeError: object of type 'float' has no len()

I believe if I convert "object" to "String", I will get to do what I want. However, when I do the following:

df['Col2'] = df['Col2'].astype(str)

the dtype of Col2 doesn't change. I am a little confused with datatype "object" in Pandas. What exactly is "object"?

More info: This is what Col2 looks like:

               Col2
1                F5
2               K3V
3                B9
4               F0V
5             G8III
6              M0V:
7                G0
8      M6e-M8.5e Tc

Solution

  • If a column contains strings, or is treated as strings, it will have a dtype of object (the converse is not necessarily true; more below). Here is a simple example:

    import pandas as pd
    df = pd.DataFrame({'SpT': ['string1', 'string2', 'string3'],
                       'num': ['0.1', '0.2', '0.3'],
                       'strange': ['0.1', '0.2', 0.3]})
    print(df.dtypes)
    #SpT        object
    #num        object
    #strange    object
    #dtype: object
    

    If a column contains only strings, applying len to it, as you did, should work fine:

    print(df['num'].apply(lambda x: len(x)))
    #0    3
    #1    3
    #2    3
    

    However, a dtype of object does not mean the column contains only strings. For example, the column strange contains objects of mixed types: two str values and one float. Applying len to it raises an error similar to the one you saw:

    print(df['strange'].apply(lambda x: len(x)))
    # TypeError: object of type 'float' has no len()
    

    Thus, the problem could be that you have not properly converted the column to string, and the column still contains mixed object types.

    Continuing the above example, let us convert strange to strings and check if apply works:

    df['strange'] = df['strange'].astype(str)
    print(df['strange'].apply(lambda x: len(x)))
    #0    3
    #1    3
    #2    3
    

    (There is a suspicious discrepancy between df_cleaned and df_clean there in your question, is it a typo or a mistake in the code that causes the problem?)
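
To see which Python types an object column actually holds, and to run the original check without converting the column first, something like the following sketch may help. The sample data is made up, and it assumes the floats in Col2 are missing values (NaN), which the question does not confirm. The helper name starts_with_letter is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up sample: spectral types plus one missing value.
# NaN is a float, which is what makes len() fail.
df = pd.DataFrame({'Col2': pd.Series(['F5', 'K3V', np.nan, 'M0V:'], dtype=object)})

# Count the Python types stored in the column.
type_counts = df['Col2'].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Hypothetical helper: same test as in the question, but it
# returns False for anything that is not a string.
def starts_with_letter(s):
    return isinstance(s, str) and len(s) >= 2 and s[0].isalpha()

df['Col3'] = df['Col2'].apply(starts_with_letter)
print(df)
```

With the guard in place, non-string entries yield False instead of raising a TypeError, so the column does not need to be converted beforehand.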