Tags: python, init, python-dataclasses

Frozen dataclass pathlib.Path-like initialization


I have this dataclass:

@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str | None
    version: str = Info.Versions.schema_cache

and I would like to initialize it the same way pathlib.Path does: by passing either the required arguments or an already initialized CacheSchema object. From my limited understanding, I figured I would have to customize __new__(), so I did. However, I can't do it exactly as Path does, because the dataclass is frozen and I'm not supposed to change the values after creation. So I came up with this:

def __new__(cls, *args, **kwargs):
    if len(args) == 1 and type(args[0]) is cls:
        return cls.__new__(cls, args[0].id_key, args[0].snapshot_key, args[0].version)

    return super(CacheSchema, cls).__new__(cls)

My logic was: if normal arguments are passed, call __init__() with the given args; otherwise unpack the existing object and call __new__() again with the unpacked values.

My issue is that cls.__new__(cls, args[0].id_key, args[0].snapshot_key, args[0].version) doesn't do what I thought it would do (call __new__ recursively but with different args).

Running this

schema = CacheSchema('a', 'b', 'c')
schema2 = CacheSchema(schema)

raises

File "D:/x/y/z/main.py", line 10, in <module>
    schema2 = CacheSchema(schema)
TypeError: CacheSchema.__init__() missing 1 required positional argument: 'snapshot_key'

Solution

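  • __init__ is called after __new__, and with the same arguments.

    Instance creation is essentially what type.__call__ does (simplified):

    def __call__(cls, *args, **kwargs):
        self = cls.__new__(cls, *args, **kwargs)
        if isinstance(self, cls):
            cls.__init__(self, *args, **kwargs)  # same args that __new__ received
        return self
    

    Therefore, you cannot repack args in __new__: whatever __new__ does, __init__ still receives the original arguments. In your case, CacheSchema(schema) ends up calling __init__(self, schema), which binds schema to id_key and leaves snapshot_key missing.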
    Also see: Python: override __init__ args in __new__
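
The call order described above can be observed directly. This is a minimal sketch; the class name `Demo` and its `calls` list are made up for illustration:

```python
class Demo:
    calls = []  # records which hook ran and with which positional args

    def __new__(cls, *args, **kwargs):
        cls.calls.append(("__new__", args))
        # Nothing done here changes what __init__ will receive.
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        type(self).calls.append(("__init__", args))


Demo(1, 2)
print(Demo.calls)  # [('__new__', (1, 2)), ('__init__', (1, 2))]
```

Both hooks see `(1, 2)`, which is why the recursive `cls.__new__(...)` call in the question has no effect on the arguments that reach `__init__`.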

    pathlib.Path-like instantiation

    Unlike your dataclass, pathlib.Path doesn't have an __init__ method with required parameters.

    Its __new__ method calls another class method, _from_parts [1], which does self = object.__new__(cls) and then sets attributes on self directly. (This describes pathlib as implemented before Python 3.12.)

    To do something similar on your dataclass using only the __new__ method:

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class CacheSchema:
        id_key: str
        snapshot_key: str
        version: str = ""
    
        def __new__(cls, *args, **kwargs):
            # On first use, move the generated __init__ aside so that
            # type.__call__ no longer runs it with the original arguments.
            if not hasattr(cls, '_init'):
                cls._init = cls.__init__
                del cls.__init__
    
            self = object.__new__(cls)
        
            # Copy from an existing instance, otherwise initialise normally.
            if len(args) == 1 and type(args[0]) is cls:
                args = (args[0].id_key, args[0].snapshot_key, args[0].version)
            cls._init(self, *args, **kwargs)
            return self
    

    [1] This has other issues: https://github.com/python/cpython/issues/85281

    pathlib.Path-like initialisation

    You need to wrap the __init__ function added by @dataclass.

    You can do this with another decorator:

    def init_accept_instance(cls):
        init = cls.__init__
    
        def __init__(self, *args, **kwargs):
            if len(args) == 1 and type(args[0]) is cls:
                args, kwargs = (), args[0].__dict__
            init(self, *args, **kwargs)
    
        cls.__init__ = __init__
        return cls
    

    Usage:

    @init_accept_instance
    @dataclass(frozen=True)
    class CacheSchema:
        id_key: str
        snapshot_key: str
        version: str = Info.Versions.schema_cache
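
As a variant of the decorator approach, the field values can be collected with `dataclasses.fields` instead of `__dict__`, so the copy does not rely on the instance having a `__dict__`. This is only a sketch: the default `"1"` is a placeholder for `Info.Versions.schema_cache`, which is not defined here, and the extra `not kwargs` check is an added assumption about the desired behaviour.

```python
from dataclasses import dataclass, fields


def init_accept_instance(cls):
    init = cls.__init__  # the __init__ generated by @dataclass

    def __init__(self, *args, **kwargs):
        # A single positional argument of exactly this class means "copy it".
        if len(args) == 1 and not kwargs and type(args[0]) is cls:
            other = args[0]
            args = ()
            kwargs = {f.name: getattr(other, f.name) for f in fields(cls) if f.init}
        init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


@init_accept_instance
@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = "1"  # placeholder for Info.Versions.schema_cache


schema = CacheSchema("a", "b", "c")
schema2 = CacheSchema(schema)
print(schema2 == schema, schema2 is schema)  # True False
```

The copy compares equal to the original but is a distinct object, and normal construction with positional or keyword arguments is unchanged.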
    

    Or you can wrap __init__ in __new__:

    from dataclasses import dataclass
    from functools import wraps
    
    
    @dataclass(frozen=True)
    class CacheSchema:
        id_key: str
        snapshot_key: str
        version: str = ""
    
        def __new__(cls, *_, **__):
            if not hasattr(cls.__init__, '__wrapped__'):
                @wraps(cls.__init__)
                def __init__(self, *args, **kwargs):
                    if len(args) == 1 and type(args[0]) is cls:
                        args = (args[0].id_key, args[0].snapshot_key, args[0].version)
                    cls.__init__.__wrapped__(self, *args, **kwargs)
    
                setattr(cls, '__init__', __init__)
    
            return super().__new__(cls)
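
A quick check of this last approach, written as a self-contained sketch. It restates the class with the original `__init__` kept in a local variable (`original`, a name introduced here) rather than read back through `__wrapped__`; the behaviour is intended to be the same.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class CacheSchema:
    id_key: str
    snapshot_key: str
    version: str = ""

    def __new__(cls, *_, **__):
        # Patch __init__ once, the first time the class is instantiated.
        if not hasattr(cls.__init__, '__wrapped__'):
            original = cls.__init__

            @wraps(original)
            def __init__(self, *args, **kwargs):
                if len(args) == 1 and type(args[0]) is cls:
                    args = (args[0].id_key, args[0].snapshot_key, args[0].version)
                original(self, *args, **kwargs)

            cls.__init__ = __init__

        return super().__new__(cls)


schema = CacheSchema('a', 'b', 'c')
schema2 = CacheSchema(schema)
print(schema2 == schema, schema2 is schema)  # True False
```

Keyword construction such as `CacheSchema(id_key='a', snapshot_key='b')` still goes through the generated `__init__` unchanged.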