
Numbering rows in Pandas without order


I have a table Mapping that looks like this:

Mapping
+---------+---------+
| user_id | prod_id |
+---------+---------+
|     41  |  N12ad  |
|      9  |  nd5Ed  |
|     73  |  bDe4d  |
|     25  |  M87he  |
+---------+---------+

I would like to add a column ranking that numbers the rows in the order in which they are stored in the database, so that the result looks like this:

+---------+---------+---------+
| user_id | prod_id | ranking |
+---------+---------+---------+
|     41  |  N12ad  |      1  |
|      9  |  nd5Ed  |      2  |
|     73  |  bDe4d  |      3  |
|     25  |  M87he  |      4  |
+---------+---------+---------+

In Microsoft SQL Server, I can do

SELECT user_id, prod_id, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ranking
FROM Mapping

What would be the pandas equivalent of ROW_NUMBER() OVER(ORDER BY (SELECT NULL))? Thank you for your help.


Solution

  • You can do this by assigning a new column built from range and the length of the DataFrame. With no partitioning and no meaningful ordering, ROW_NUMBER() simply numbers the rows in the order they are returned, and range(1, len(df) + 1) does the same for the DataFrame's current row order.

    import pandas as pd

    data = {
        "user_id": [41, 9, 73, 25],
        "prod_id": ["N12ad", "nd5Ed", "bDe4d", "M87he"]
    }
    df = pd.DataFrame(data)

    # Add the ranking column: 1-based row numbers in the current row order
    df['ranking'] = range(1, len(df) + 1)
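
As a rough sketch of two related cases: the numbering above is positional, so it should keep working when the index is no longer 0..n-1 (for example after sorting), and if you do need a per-partition row number later, groupby(...).cumcount() + 1 is one way to get it. The names by_user and orders, and the repeated user_id values in orders, are made up for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "user_id": [41, 9, 73, 25],
        "prod_id": ["N12ad", "nd5Ed", "bDe4d", "M87he"],
    }
)

# Sorting reorders the rows, so the index is no longer 0..n-1.
by_user = df.sort_values("user_id")

# np.arange numbers rows by position, so the reordered index does not matter.
by_user = by_user.assign(ranking=np.arange(1, len(by_user) + 1))

# Hypothetical data with repeated user_id values, to illustrate partitioning.
orders = pd.DataFrame(
    {
        "user_id": [41, 41, 9, 41],
        "prod_id": ["N12ad", "nd5Ed", "bDe4d", "M87he"],
    }
)

# cumcount() counts rows within each group starting at 0; adding 1 gives a
# 1-based row number per user_id, in the order the rows appear.
orders["ranking"] = orders.groupby("user_id").cumcount() + 1
```

Using assign here returns a new DataFrame rather than modifying the sorted one in place, which keeps the original df unchanged.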