Tags: python, pandas, dataframe, type-conversion, integer

Extract the ranking position number from a pandas DataFrame column


I have a pandas dataframe with a column named ranking_pos. All the rows of this column look like this: #123 of 12,216.

The output I need is only the number of the ranking, so for this example: 123 (as an integer).

How do I extract the number after the # and drop the trailing "of 12,216"?

The column's dtype is currently object, and converting it directly to integer with .astype() fails because of the non-numeric characters.


Solution

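  • You can use .str.extract with a capture group for the digits after the #. Pass expand=False so the result is a Series rather than a one-column DataFrame, and assign it back to the column:

    df['ranking_pos'] = df['ranking_pos'].str.extract(r'#(\d+)', expand=False).astype(int)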
    

    Alternatively, use .str.split() to keep the part before ' of ' and then strip the #:

    df['ranking_pos'] = df['ranking_pos'].str.split(' of ').str[0].str.replace('#', '', regex=False).astype(int)
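
    If some ranks might themselves contain a thousands separator (e.g. #1,024 of 12,216) or some rows might be missing, a slightly more defensive variant could look like the sketch below. The sample rows and the new column name rank are made up for illustration; only the "#123 of 12,216" shape comes from the question, and the comma and missing-value cases are assumptions about what the data might contain.

```python
import pandas as pd

# Hypothetical sample data; only the "#123 of 12,216" shape comes from the question.
df = pd.DataFrame(
    {"ranking_pos": ["#123 of 12,216", "#1,024 of 12,216", "#7 of 950", None]}
)

# Capture digits and any thousands separators right after '#'.
# expand=False returns a Series instead of a one-column DataFrame.
rank = df["ranking_pos"].str.extract(r"#([\d,]+)", expand=False)

# Remove the commas literally (regex=False), then convert to numbers.
# errors="coerce" turns anything unparseable into NaN, and the nullable
# "Int64" dtype keeps the column integer-typed even with missing values.
df["rank"] = pd.to_numeric(
    rank.str.replace(",", "", regex=False), errors="coerce"
).astype("Int64")

print(df)
```

    If every row is guaranteed to match and no rank contains a comma, the shorter one-liners above should be enough; the nullable Int64 dtype only matters when a plain int conversion would fail on missing values.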