I hate writing things twice, so I came up with a decent way to not have to write things twice. However, this seems to break my type-hinting:
from enum import Enum
from dataclasses import make_dataclass, field, dataclass

class DatasetNames(Enum):
    test1 = "test1_string"
    test2 = "test2_string"
    test3 = "test3_string"

def get_path(s: str) -> str:
    return s + "_path"

# the normal way to do this, but I have to type every new dataset name twice
# and there's a lot of duplicate code
@dataclass(frozen=True)
class StaticDatasetPaths:
    test1 = get_path("test1_string")
    test2 = get_path("test2_string")
    test3 = get_path("test3_string")

# mypy recognizes that `StaticDatasetPaths` is a class
# mypy recognizes that `StaticDatasetPaths.test2` is a string
print(StaticDatasetPaths.test2) # 'test2_string_path'
# this is my way of doing it, without having to type every new dataset name twice and no duplicate code
DynamicDatasetPaths = make_dataclass(
    'DynamicDatasetPaths',
    [
        (
            name.name,
            str,
            field(default=get_path(name.value))
        )
        for name in DatasetNames
    ],
    frozen=True
)

# mypy thinks `DynamicDatasetPaths` is a `variable` of type `type`
# mypy thinks that `DynamicDatasetPaths.test2` is a `function` of type `Unknown`
print(DynamicDatasetPaths.test2) # 'test2_string_path'
How can I let mypy know that DynamicDatasetPaths is a frozen dataclass whose attributes are strings?
Normally when I run into cases like this, I'm able to just use a cast and tell mypy what the right type is, but I don't know the correct type for "frozen dataclass whose attributes are strings".
(Also, if there's a better way in general to not have the duplicate code, I'd be happy to hear about that as well.)
A data class is meant to create instances. Since you are not instantiating the data class but instead accessing test1, test2, etc. as class attributes, you don't really need a data class at all. You can simply make path a property of the Enum class instead, which mypy can type-check because the property has an explicit return annotation. And since all the members of your Enum class have string values, you can make it a StrEnum class (available in Python 3.11 and later) for easier string operations:
from enum import StrEnum

class DatasetNames(StrEnum):
    test1 = "test1_string"
    test2 = "test2_string"
    test3 = "test3_string"

    @property
    def path(self) -> str:
        return self + '_path'

print(DatasetNames.test2.path) # outputs test2_string_path
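To illustrate how this plays out in annotated code, here is a minimal sketch. The helper describe and the all_paths dictionary are hypothetical names added only for illustration, and the sketch uses the older str/Enum mixin instead of StrEnum purely so that it also runs on Python versions before 3.11; the idea is otherwise the same.

```python
from enum import Enum


class DatasetNames(str, Enum):
    # str/Enum mixin is an assumption made for portability;
    # on Python 3.11+ StrEnum can be used instead.
    test1 = "test1_string"
    test2 = "test2_string"
    test3 = "test3_string"

    @property
    def path(self) -> str:
        return self.value + "_path"


def describe(dataset: DatasetNames) -> str:
    # Hypothetical helper: the parameter is typed as the enum itself,
    # so a type checker knows `dataset.path` is a str.
    return f"{dataset.name}: {dataset.path}"


# Each dataset name is written exactly once (in the enum);
# every path is derived by iterating over the members.
all_paths: dict[str, str] = {d.name: d.path for d in DatasetNames}

print(describe(DatasetNames.test2))
print(all_paths["test3"])
```

Because path is declared with a str return annotation, a type checker should be able to treat DatasetNames.test2.path as a plain string without any cast.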
If the get_path function is expensive in your actual use case, consider making path a cached_property instead.