I have this dataclass:
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str | None
    version: str = Info.Versions.schema_cache
and I would like to initialize it the same way pathlib.Path does: either pass the required arguments, or pass an already initialized CacheSchema object. From my limited understanding I figured I would have to customize __new__(), so I did. However, I can't do it exactly as Path does, because the dataclass is frozen and I'm not supposed to change the values after creation. So I came up with this:
def __new__(cls, *args, **kwargs):
    if len(args) == 1 and type(args[0]) is cls:
        return cls.__new__(cls, args[0].id_key, args[0].snapshot_key, args[0].version)
    return super(CacheSchema, cls).__new__(cls)
My logic was: if normal arguments are passed, __init__() gets called with the given args; otherwise, unpack the existing object and call __new__() again with the unpacked values. My issue is that cls.__new__(cls, args[0].id_key, args[0].snapshot_key, args[0].version) doesn't do what I thought it would do (call __new__ recursively but with different args).
Running this
    schema = CacheSchema('a', 'b', 'c')
    schema2 = CacheSchema(schema)
raises
      File "D:/x/y/z/main.py", line 10, in <module>
        schema2 = CacheSchema(schema)
    TypeError: CacheSchema.__init__() missing 1 required positional argument: 'snapshot_key'
__init__ is called after __new__.
Creation is essentially:
@classmethod
def __call__(cls, *args, **kwargs):
    self = cls.__new__(cls, *args, **kwargs)
    cls.__init__(self, *args, **kwargs)
    return self
Therefore, you cannot repack args.
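You can watch this happen by instrumenting a small class (the `Demo` class here is hypothetical, just to show the call order):

```python
from dataclasses import dataclass

calls = []

@dataclass(frozen=True)
class Demo:
    x: int

    def __new__(cls, *args, **kwargs):
        calls.append(('new', args))
        # Even if we "repacked" args here, type.__call__ would still
        # invoke __init__ with the ORIGINAL args afterwards.
        return super().__new__(cls)

d = Demo(1)
```

After this runs, `calls` records that __new__ saw `(1,)`, and __init__ was then called with the very same `(1,)` regardless of anything __new__ did internally.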
Also see: Python: override __init__ args in __new__
Unlike your dataclass, pathlib.Path doesn't have an __init__ method with required parameters. Its __new__ method calls another class method, _from_parts¹, that does self = object.__new__(cls) and then sets attributes on self directly.
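A rough, simplified sketch of that pattern (this is not pathlib's actual code, which has changed across Python versions; the `_parts` attribute is a stand-in for Path's internals):

```python
class PathLike:
    """Hypothetical stand-in for pathlib.Path's construction pattern."""

    def __new__(cls, *args):
        if len(args) == 1 and isinstance(args[0], cls):
            args = args[0]._parts      # unpack an existing instance
        self = object.__new__(cls)
        self._parts = tuple(args)      # set attributes directly,
        return self                    # no __init__ involved

p = PathLike('a', 'b')
q = PathLike(p)                        # accepts an existing instance
```

Because there is no __init__ with required parameters, type.__call__ has nothing to re-invoke with the original arguments, so __new__ is free to unpack them however it likes.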
Doing something like this on your dataclass with only the __new__ method would look like:
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""

    def __new__(cls, *args, **kwargs):
        if not hasattr(cls, '_init'):
            setattr(cls, '_init', cls.__init__)
            delattr(cls, '__init__')
        self = object.__new__(cls)
        if len(args) == 1 and type(args[0]) is cls:
            cls._init(self, args[0].id_key, args[0].snapshot_key, args[0].version)
            return self
        cls._init(self, *args, **kwargs)
        return self
¹ This has other issues: https://github.com/python/cpython/issues/85281
You need to wrap the __init__ function added by @dataclass. You can do this with another decorator:
def init_accept_instance(cls):
    init = cls.__init__

    def __init__(self, *args, **kwargs):
        if len(args) == 1 and type(args[0]) is cls:
            args, kwargs = (), args[0].__dict__
        init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls
Usage:
@init_accept_instance
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = Info.Versions.schema_cache
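A self-contained run of the decorator approach (`Info.Versions.schema_cache` comes from your code and isn't defined here, so a plain string default stands in for it):

```python
from dataclasses import dataclass

def init_accept_instance(cls):
    init = cls.__init__

    def __init__(self, *args, **kwargs):
        if len(args) == 1 and type(args[0]) is cls:
            # Re-run the generated __init__ with the instance's fields;
            # a frozen dataclass still stores them in __dict__.
            args, kwargs = (), args[0].__dict__
        init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls

@init_accept_instance
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""   # stands in for Info.Versions.schema_cache

schema = CacheSchema('a', 'b', 'c')
schema2 = CacheSchema(schema)
```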
Or you can wrap __init__ in __new__:
from functools import wraps

@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""

    def __new__(cls, *_, **__):
        if not hasattr(cls.__init__, '__wrapped__'):
            @wraps(cls.__init__)
            def __init__(self, *args, **kwargs):
                if len(args) == 1 and type(args[0]) is cls:
                    args = (args[0].id_key, args[0].snapshot_key, args[0].version)
                cls.__init__.__wrapped__(self, *args, **kwargs)
            setattr(cls, '__init__', __init__)
        return super().__new__(cls)
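As with the decorator version, both call forms then work; a self-contained run (repeating the class so the snippet executes standalone — @wraps stores the generated __init__ on the wrapper as __wrapped__, which is what the wrapper calls through):

```python
from dataclasses import dataclass
from functools import wraps

@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""

    def __new__(cls, *_, **__):
        # On the first construction, replace the generated __init__
        # with a wrapper that unpacks an existing instance.
        if not hasattr(cls.__init__, '__wrapped__'):
            @wraps(cls.__init__)
            def __init__(self, *args, **kwargs):
                if len(args) == 1 and type(args[0]) is cls:
                    args = (args[0].id_key, args[0].snapshot_key, args[0].version)
                # @wraps stored the original generated __init__ here
                cls.__init__.__wrapped__(self, *args, **kwargs)
            setattr(cls, '__init__', __init__)
        return super().__new__(cls)

schema = CacheSchema('a', 'b', 'c')
schema2 = CacheSchema(schema)
```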