
Polars - drop duplicate rows based on a column subset but keep the first


Given the following table, I'd like to remove the duplicates based on the column subset col1,col2. I'd like to keep the first row of the duplicates though:

import polars as pl

data = {
    'col1': [1, 2, 3, 1, 1],
    'col2': [7, 8, 9, 7, 7],
    'col3': [3, 4, 5, 6, 8]
}
tmp = pl.DataFrame(data)
print(tmp)
┌──────┬──────┬──────┐
│ col1 ┆ col2 ┆ col3 │
│ ---  ┆ ---  ┆ ---  │
│ i64  ┆ i64  ┆ i64  │
╞══════╪══════╪══════╡
│ 1    ┆ 7    ┆ 3    │
│ 2    ┆ 8    ┆ 4    │
│ 3    ┆ 9    ┆ 5    │
│ 1    ┆ 7    ┆ 6    │
│ 1    ┆ 7    ┆ 8    │
└──────┴──────┴──────┘

The result should be

┌──────┬──────┬──────┐
│ col1 ┆ col2 ┆ col3 │
│ ---  ┆ ---  ┆ ---  │
│ i64  ┆ i64  ┆ i64  │
╞══════╪══════╪══════╡
│ 1    ┆ 7    ┆ 3    │
│ 2    ┆ 8    ┆ 4    │
│ 3    ┆ 9    ┆ 5    │
└──────┴──────┴──────┘

Usually I'd do this in pandas with df.duplicated(subset=["col1", "col2"], keep='first'), but Polars' DataFrame.is_duplicated() has no keep argument and marks all rows that have a duplicate, including the first occurrence.
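
For context, the pandas version I have in mind is roughly this (just a sketch, with df built from the same data dict as above):

import pandas as pd

df = pd.DataFrame(data)
# duplicated() marks every repeat after the first occurrence,
# so negating it keeps the first row of each (col1, col2) pair
df[~df.duplicated(subset=['col1', 'col2'], keep='first')]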


Solution

  • You can use DataFrame.unique with a column subset; its keep keyword argument ('first', 'last', 'any', 'none') gives you the flexibility that is_duplicated lacks, and maintain_order=True preserves the original row order:

    tmp.unique(subset=('col1', 'col2'), keep='first', maintain_order=True)
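
  • As an alternative sketch, you can filter on the first occurrence of each (col1, col2) pair with an expression (on Polars versions before the rename, the method is called is_first instead of is_first_distinct):

    # keep only the first row for each (col1, col2) combination
    tmp.filter(pl.struct(['col1', 'col2']).is_first_distinct())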