I have this code:

from dataclasses import dataclass
from typing import List

@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: List[str]

def remove_duplicates(duplicate_list: List[TestClass]) -> List[TestClass]:
    return list(set(duplicate_list))

duplicate_list = [TestClass("foo", ["bar", "cat"]), TestClass("foo", ["bar", "cat"]), TestClass("foo", ["bar", "caz"])]
unique_list = remove_duplicates(duplicate_list)
Now I want to remove the duplicates from the list. I tried converting the list to a set, as shown above. I also tried using

return list(dict.fromkeys(duplicate_list))

Neither approach works, because my class contains a list. The __hash__ function generated by the dataclass module therefore fails with the error: TypeError: unhashable type: 'list'.

What would the correct approach be to remove the duplicate dataclass elements? Would I need to write a custom __hash__ function? Or could I replace the list with some form of immutable list?
You can replace list with tuple, Python's immutable sequence type:
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: Tuple[str, ...]  # Tuple[str, ...] accepts any number of strings

duplicate_list = [TestClass("foo", ("bar", "cat")), TestClass("foo", ("bar", "cat")), TestClass("foo", ("bar", "caz"))]
Then your original remove_duplicates implementation will work correctly:

def remove_duplicates(duplicate_list: List[TestClass]) -> List[TestClass]:
    return list(set(duplicate_list))
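One caveat: set does not guarantee the original order of the elements. Since dicts preserve insertion order (Python 3.7+), your dict.fromkeys variant also works once the instances are hashable, and it keeps the first occurrence of each element in place. A runnable sketch of the full tuple-based approach:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: Tuple[str, ...]  # tuples are hashable, so the generated __hash__ works

def remove_duplicates(duplicate_list: List[TestClass]) -> List[TestClass]:
    # dict.fromkeys deduplicates while preserving insertion order
    return list(dict.fromkeys(duplicate_list))

duplicate_list = [
    TestClass("foo", ("bar", "cat")),
    TestClass("foo", ("bar", "cat")),
    TestClass("foo", ("bar", "caz")),
]
unique_list = remove_duplicates(duplicate_list)
print(unique_list)  # the duplicate second element is dropped, order preserved
```

If the calling code hands you plain lists, you can convert them at construction time with tuple(field_list) before creating each instance.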