python pandas numpy dataframe data-analysis

Get numeric part of string column and cast to integer


The question is pretty simple, but I couldn't figure it out. I'm using a FIFA dataset and I'd like to convert the whole Weight column to integers: first drop the "lbs" suffix, then cast to integer.

fifa["Weight"].head()
           
    0    159lbs
    1    183lbs
    2    150lbs
    3    168lbs
    4    154lbs
    Name: Weight, dtype: object


fifa.Weight = [int(x.strip("lbs")) if type(x)==str else x for x in fifa.Weight] 

I know I could use the list comprehension above, but I'd rather not.

fifa_weight = []
for i in fifa["Weight"]:
    # Some values in the Weight column are missing, which is why I check type(i) == str.
    if type(i) == str:
        fifa_weight.append(int(i.strip("lbs")))

The loop takes the values from fifa["Weight"] and puts them into the fifa_weight list, but because the missing values are skipped, the list is shorter than the column and I can't assign it back. How can I do this with a for loop? I want my fifa["Weight"] column to be full of integers.


Solution

  • Given

    >>> df
       Weight
    0  159lbs
    1  183lbs
    2  150lbs
    3  168lbs
    4  154lbs
    

    you can shave off the last three characters and then convert the strings to integers via

    >>> df['Weight'] = df['Weight'].str[:-3].astype(int)
    >>> df
       Weight
    0     159
    1     183
    2     150
    3     168
    4     154
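
Note that the question mentions missing values in the column, and `astype(int)` raises on those. A minimal sketch of one way to handle them, assuming a reasonably recent pandas with the nullable `Int64` dtype; the sample frame and the `Weight_int` / `Weight_loop` column names are made up for illustration. It shows a vectorized version and the for-loop version the question asked about:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the question's column plus one missing value.
df = pd.DataFrame({"Weight": ["159lbs", "183lbs", np.nan, "168lbs", "154lbs"]})

# Vectorized: extract the digits, convert to numbers, then cast to the
# nullable Int64 dtype so missing entries stay <NA> instead of raising.
digits = df["Weight"].str.extract(r"(\d+)", expand=False)
df["Weight_int"] = pd.to_numeric(digits).astype("Int64")

# Explicit loop: append a placeholder for missing values so the list
# keeps the same length as the column and can be assigned back.
weights = []
for w in df["Weight"]:
    if isinstance(w, str):
        weights.append(int(w.strip("lbs")))
    else:
        weights.append(pd.NA)
df["Weight_loop"] = pd.array(weights, dtype="Int64")

print(df)
```

The key point in the loop version is appending something for every row, including the missing ones; otherwise the list and the column have different lengths. If the missing rows are not needed at all, dropping them first with `df.dropna(subset=["Weight"])` and then using plain `astype(int)` is another option.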