
How to change a string value with comma (343,543) to int in a dataframe with Python


I have a CSV file containing an IMDb list. The number of votes is written like 345,545, so Python reads it as a string. I want to convert it to a numeric value so I can use operations like <, +, and %, and I want to store the converted values in a new column.

def change_int(x):
    y = x.split(",")
    z = int(y[0] + y[1])
    return z

df["imdbVotes2"] = df.imdbVotes.apply(change_int(df["imdbVotes"]))

I tried to use a function like this.

I expect:

0    343,564    343564
1    676,565    676565

Solution

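  • Your code has two problems. First, y[0] + y[1] only works when the value contains exactly one comma: a value like '1,234,567' silently loses its last group, and a value without a comma raises an IndexError. Joining all the pieces with "".join(y) handles any number of commas. Second, apply expects the function itself, not the result of calling it, so pass change_int rather than change_int(df["imdbVotes"]).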

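    def change_int(x):
        # Split on the thousands separators and join the digit groups back together.
        y = x.split(",")
        z = int("".join(y))
        return z

    # Pass the function itself to apply; do not call it on the column.
    df["imdbVotes2"] = df["imdbVotes"].apply(change_int)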
    

    This function now takes an input string like '345,545' and returns the corresponding integer 345545.
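
As an alternative to a row-by-row function, pandas can remove the commas for the whole column at once, or parse them while reading the file. The sketch below is only an illustration: the sample data and the title column are made up to stand in for the real IMDb file, and it assumes the votes column has no missing values (otherwise astype(int) would fail and something like pd.to_numeric would be needed instead).

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the IMDb CSV file.
csv_text = 'title,imdbVotes\nMovie A,"343,564"\nMovie B,"676,565"\n'

# Option 1: strip the commas from the whole column at once, then convert.
df = pd.read_csv(io.StringIO(csv_text))
df["imdbVotes2"] = df["imdbVotes"].str.replace(",", "", regex=False).astype(int)

# Option 2: let read_csv handle the thousands separator while loading.
df_parsed = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(df)
print(df_parsed.dtypes)
```

Option 1 keeps the original string column next to the new numeric one, which matches the expected output in the question. Option 2 is simpler if the string form is not needed at all.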