Frozen dataclass pathlib.Path-like initialization

I have this dataclass:

@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str | None
    version: str = Info.Versions.schema_cache

and I would like to initialize it the same way pathlib.Path does, by passing either the required arguments or an already initialized CacheSchema object. From my limited understanding, I figured I would have to customize __new__(), so I did. However, I can't do it exactly as Path does, because the dataclass is frozen and I'm not supposed to change the values after creation. So I came up with this:

def __new__(cls, *args, **kwargs):
    if len(args) == 1 and type(args[0]) is cls:
        return cls.__new__(cls, args[0].id_key, args[0].snapshot_key, args[0].version)

    return super(CacheSchema, cls).__new__(cls)

My logic was: if normal arguments are passed, call __init__() with the given args; otherwise, unpack the existing object and call __new__() again with the unpacked values.

My issue is that cls.__new__(cls, args[0].id_key, args[0].snapshot_key, args[0].version) doesn't do what I thought it would do (call __new__ recursively but with different args).

Running this

schema = CacheSchema('a', 'b', 'c')
schema2 = CacheSchema(schema)

raises

File "D:/x/y/z/main.py", line 10, in <module>
    schema2 = CacheSchema(schema)
TypeError: CacheSchema.__init__() missing 1 required positional argument: 'snapshot_key'

Solution

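__init__ is called after __new__, and it receives the same arguments that were passed to the class call, not whatever __new__ passed along internally.

Instance creation goes through the metaclass, type.__call__, which behaves essentially like this:

def __call__(cls, *args, **kwargs):
    # Simplified sketch of type.__call__
    self = cls.__new__(cls, *args, **kwargs)
    if isinstance(self, cls):
        # __init__ gets the original args, not anything __new__ changed
        self.__init__(*args, **kwargs)
    return self

Therefore, you cannot repack args in __new__: in your example, __init__ is still called with the single CacheSchema argument, which is why 'snapshot_key' is reported missing.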
Also see: Python: override __init__ args in __new__
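
To see the call order in isolation, here is a minimal sketch. Demo and its log attribute are hypothetical names used only for tracing; the class mirrors the question's attempt to re-enter __new__ with different arguments.

```python
class Demo:
    """Hypothetical class used only to trace the call order."""
    log = []

    def __new__(cls, *args):
        Demo.log.append(("__new__", args))
        if len(args) == 1:
            # Attempt to "repack": re-enter __new__ with different arguments.
            return cls.__new__(cls, "repacked", "args")
        return super().__new__(cls)

    def __init__(self, *args):
        # type.__call__ passes the arguments of the original call here.
        Demo.log.append(("__init__", args))


Demo("original")
for entry in Demo.log:
    print(entry)
# ('__new__', ('original',))
# ('__new__', ('repacked', 'args'))
# ('__init__', ('original',))
```

The recursive __new__ call does run with the repacked values, but __init__ still sees only the original argument.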

pathlib.Path-like instantiation

Unlike your dataclass, pathlib.Path doesn't have an __init__ method with required parameters.

In Python 3.11 and earlier, its __new__ method calls another class method, _from_parts¹, which does self = object.__new__(cls) and then sets attributes on self directly.
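
The general shape of that pattern can be sketched as follows. This is not pathlib's actual code: Point and _from_values are hypothetical names, and the class is deliberately not a dataclass, so it has no generated __init__ with required parameters.

```python
class Point:
    """Hypothetical stand-in for a Path-like class; not pathlib's code."""
    __slots__ = ("x", "y")

    def __new__(cls, *args):
        # Accept either (x, y) or an existing Point.
        if len(args) == 1 and isinstance(args[0], Point):
            args = (args[0].x, args[0].y)
        return cls._from_values(*args)

    @classmethod
    def _from_values(cls, x, y):
        # Create the bare object and set its attributes directly.
        self = object.__new__(cls)
        self.x = x
        self.y = y
        return self


p = Point(1, 2)
q = Point(p)  # builds a new Point from an existing one
```

This works because the inherited object.__init__ tolerates the extra arguments when only __new__ is overridden, so nothing re-validates the original call arguments.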

Doing something similar on your dataclass using only the __new__ method would look like this:

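from dataclasses import dataclass


@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""

    def __new__(cls, *args, **kwargs):
        # On first use, move the generated __init__ aside so that
        # type.__call__ no longer runs it with the original arguments.
        if not hasattr(cls, '_init'):
            setattr(cls, '_init', cls.__init__)
            delattr(cls, '__init__')

        self = object.__new__(cls)

        # Copy construction: unpack the fields of the existing instance.
        if len(args) == 1 and type(args[0]) is cls:
            cls._init(self, args[0].id_key, args[0].snapshot_key, args[0].version)
            return self

        cls._init(self, *args, **kwargs)
        return self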

¹ This has other issues: https://github.com/python/cpython/issues/85281

pathlib.Path-like initialisation

You need to wrap the __init__ function added by @dataclass.

You can do this with another decorator:

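def init_accept_instance(cls):
    init = cls.__init__  # the __init__ generated by @dataclass

    def __init__(self, *args, **kwargs):
        # Copy construction: pass the existing instance's fields as keyword
        # arguments (relies on __dict__, so the dataclass must not use slots=True).
        if len(args) == 1 and type(args[0]) is cls:
            args, kwargs = (), args[0].__dict__
        init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls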

Usage:

from dataclasses import dataclass


@init_accept_instance
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = Info.Versions.schema_cache
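
For a quick self-contained check of the decorator approach, here is a sketch under two assumptions: the default "1" is a placeholder for Info.Versions.schema_cache, which lives in your project, and the fields are read with dataclasses.fields() rather than __dict__. That second choice is a variation on the decorator above, not a requirement.

```python
from dataclasses import dataclass, fields


def init_accept_instance(cls):
    init = cls.__init__  # the __init__ generated by @dataclass

    def __init__(self, *args, **kwargs):
        if len(args) == 1 and not kwargs and type(args[0]) is cls:
            other = args[0]
            # Variation: collect the init fields by name instead of using __dict__.
            args = ()
            kwargs = {f.name: getattr(other, f.name) for f in fields(cls) if f.init}
        init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


@init_accept_instance
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = "1"  # placeholder for Info.Versions.schema_cache


schema = CacheSchema("a", "b", "c")
schema2 = CacheSchema(schema)  # no TypeError: fields are copied from schema
print(schema2 == schema, schema2 is schema)  # True False
```

The copy compares equal to the original because @dataclass generates __eq__ from the fields, but it is a distinct object, and both remain frozen.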

Or you can wrap __init__ in __new__:

from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""

    def __new__(cls, *_, **__):
        # Wrap the generated __init__ once; @wraps sets __wrapped__,
        # which doubles as the "already wrapped" marker.
        if not hasattr(cls.__init__, '__wrapped__'):
            @wraps(cls.__init__)
            def __init__(self, *args, **kwargs):
                # Copy construction: unpack the fields of the existing instance.
                if len(args) == 1 and type(args[0]) is cls:
                    args = (args[0].id_key, args[0].snapshot_key, args[0].version)
                cls.__init__.__wrapped__(self, *args, **kwargs)

            setattr(cls, '__init__', __init__)

        return super().__new__(cls)