Remove duplicate rows in DataFrame based on one column containing a substring

I have a dataframe like the following:

ID, Components
1,  "Room 1, ABC"
2,  "Room 2, ABC"
3,  "Room 3, DEF"
4,  "Room 1, DEF"
5,  "Room 3, DEF"

I need to filter the dataframe so that there is only one row per room and the first occurrence of a given room is kept:

ID, Components
1,  "Room 1, ABC"
2,  "Room 2, ABC"
3,  "Room 3, DEF"

As shown above, we can see the rows with ID 4&5 have been removed as "Room 1" and "Room 3" is used in row's with IDs 1 and 3.

Alternatively, a count of unique rooms would also work, however the remainder of the string for Components can repeat, e.g. there can be numerous ABCs and DEFs but only 1 Room 1/2/3...

Therefore counting unique entries in the Components column will not work. It must be unique only for "Room n".

Solution

Input data:

df = pd.DataFrame({'ID': np.arange(1, 6), 'Components': ['Room 1, ABC', 'Room 2, ABC', 'Room 3, DEF', 'Room 1, DEF', 'Room 3, DEF']})

Use str.extract:

df.loc[~df['Components'].str.extract('(\d+)').duplicated()]

Output:

   ID   Components
0   1  Room 1, ABC
1   2  Room 2, ABC
2   3  Room 3, DEF