
Class inheritance where the children are simple variable-only classes


I am working with some YAML patches. The patches share the same structure but contain different values. The values are often difficult to remember, so I want to abstract them away into class instances that I can reference by name.

Here is the approach I have taken so far:

class YamlPatch:
    def __init__(self, kind, name, namespace, op, path, value):
        # Who the patch applies to (no trailing comma, so this stays a dict).
        target = {
            "kind": kind,
            "name": name,
            "namespace": namespace
        }
        # The patch operation itself.
        scalar = [{
            "op": op,
            "path": path,
            "value": value
        }]

        self.yaml = (target, scalar)

class PatchA(YamlPatch):
    def __init__(self, name):
        namespace = "my-namespace"
        kind = "test"
        op = "replace"
        path = "/test"
        value = "hello"

        super().__init__(kind, name, namespace, op, path, value)

class PatchB(YamlPatch):
    def __init__(self, path):
        namespace = "my-namespace"
        name = "my-name"
        kind = "test"
        op = "replace"
        value = "hello"

        super().__init__(kind, name, namespace, op, path, value)

### Insert 4 or 5 other types of patches here...

patches = []
patches.append(PatchA("hello").yaml)
for app in ["app1", "app2"]:
    patches.append(PatchB(f"/{app}").yaml)

print(patches)

### output (wrapped here for readability):
# [({'kind': 'test', 'name': 'hello', 'namespace': 'my-namespace'}, [{'op': 'replace', 'path': '/test', 'value': 'hello'}]),
#  ({'kind': 'test', 'name': 'my-name', 'namespace': 'my-namespace'}, [{'op': 'replace', 'path': '/app1', 'value': 'hello'}]),
#  ({'kind': 'test', 'name': 'my-name', 'namespace': 'my-namespace'}, [{'op': 'replace', 'path': '/app2', 'value': 'hello'}])]

This feels messy and repetitive, especially once you add type hints and comments. It is not very DRY. Some of the values are fairly common defaults, and having to define __init__ and call super().__init__() in every child class (patch) doesn't feel good.

I tried using dataclasses, but because the required "input" arguments for the child classes are different, I would have to use the kw_only argument which can be tricky to remember with so many different patches (e.g. PatchA(value="blah") or was it PatchA(name="blah"), I can't remember?).

In short, I'm looking for the quickest and most efficient way to write code which allows me to reference a memorable, simple name (I've called them PatchA and PatchB here but in real code they'll be something unique and obvious to the maintainers) and return the correctly-formatted YAML patch. E.g. print(PatchA).

I am using Python 3.11.

--- EDIT FOR CLARIFICATION

I want to avoid keyword arguments because each patch has a different set of values, and they're not simple to remember. Also, if the values change, I want to be able to change them in one place rather than everywhere they're referenced in my code.

Here's a realistic (albeit abridged) example:

class YamlPatch:
  yaml = ...

class PackageApplicationFromGit(YamlPatch):
  path = "/spec/path"
  name = f"application-{application}"
  value = f"/some/applications/path/{application}"

class AppsGitRepoBranchPatch(YamlPatch):
  kind = "GitRepository"
  path = "/spec/ref/branch"
  value = "my-branch-name"

The two patches have the same structure, but wildly different values. These values are all static apart from a single argument, e.g. a branch name or an application name.


Solution

  • You could use a dataclass and eliminate the subclasses.

    from dataclasses import dataclass
    
    
    @dataclass
    class YamlPatch:
      namespace :str = "my-namespace"
      kind      :str = "test"
      name      :str = "my-name"
      op        :str = "replace"
      path      :str = "/test"
      value     :str = "hello"
    
      def __post_init__(self) -> None:
        target={
          "namespace": self.namespace ,
          "kind"     : self.kind      ,
          "name"     : self.name      ,
        }
        
        scalar=[{
          "op"   : self.op    ,
          "path" : self.path  ,
          "value": self.value ,
        }]
        
        self.yaml = (target, scalar)
        
      def __str__(self) -> str:
        return str(self.yaml)
    
    
    if __name__ == "__main__":
      patches = [YamlPatch(name="hello").yaml]
      
      for app in ("app1", "app2"):
        patches.append(YamlPatch(path=f"/{app}").yaml)
      
      print(*patches, sep="\n")
    

    To be honest, the only thing the dataclass really does for you here is remove the boilerplate in the __init__ method that turns constructor arguments into attributes. It also makes the class a lot cleaner, since you won't need a constructor with six typed and defaulted arguments in it. If your YamlPatch instances need to be more dynamic (for example, if fields can change after construction), you could refactor the __post_init__ into a property.

    from dataclasses import dataclass
    
    
    @dataclass
    class YamlPatch:
      namespace :str = "my-namespace"
      kind      :str = "test"
      name      :str = "my-name"
      op        :str = "replace"
      path      :str = "/test"
      value     :str = "hello"
    
      @property
      def yaml(self) -> tuple:
        target={
          "namespace": self.namespace ,
          "kind"     : self.kind      ,
          "name"     : self.name      ,
        }
        
        scalar=[{
          "op"   : self.op    ,
          "path" : self.path  ,
          "value": self.value ,
        }]
        
        return (target, scalar)
        
      def __str__(self) -> str:
        return str(self.yaml)
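
The practical difference between the two versions above can be sketched with two cut-down classes. `EagerPatch` and `LazyPatch` are hypothetical names with a single field each, used here only to keep the example short. A value built in `__post_init__` is computed once, so a later attribute change does not reach it, while a property is rebuilt on every access.

```python
from dataclasses import dataclass


@dataclass
class EagerPatch:
    """Builds `yaml` once, in __post_init__ (hypothetical cut-down class)."""
    path: str = "/test"

    def __post_init__(self) -> None:
        self.yaml = {"path": self.path}


@dataclass
class LazyPatch:
    """Rebuilds `yaml` on every access via a property (hypothetical cut-down class)."""
    path: str = "/test"

    @property
    def yaml(self) -> dict:
        return {"path": self.path}


eager = EagerPatch()
lazy = LazyPatch()

# Change the field after construction.
eager.path = "/app1"
lazy.path = "/app1"

print(eager.yaml)  # still built from the original path
print(lazy.yaml)   # reflects the updated path
```

If the patches are created once and never modified, either version should behave the same.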
    
    

    Regarding this part of the question:

        I tried using dataclasses, but because the required "input" arguments for the child classes are different, I would have to use the kw_only argument which can be tricky to remember with so many different patches (e.g. PatchA(value="blah") or was it PatchA(name="blah"), I can't remember?).

    In your example all instances have the same attributes, regardless of how you subclassed the parent, so the kw_only argument doesn't gain you anything. As for remembering the individual patch values: you should definitely know those. You seem to be looking for a system where you arbitrarily smack things together and/or have numerous classes, each named after the one value it accepts. That's unnecessarily bloated, and you'll have just as much trouble keeping track of all the classes as you have keeping track of the necessary values. In short, the lazy approach is more cumbersome than simply not being lazy. Your own attempt already shows this, as you admit.


    Example from my comment. Not a lookup table, but essentially the same results.

    # You have to figure out where `application` comes from;
    # "app1" is only a placeholder so the snippet runs.
    application = "app1"
    PackageApplicationFromGit = dict(
      path = "/spec/path",
      name = f"application-{application}",
      value = f"/some/applications/path/{application}",
    )
    
    AppsGitRepoBranchPatch = dict(
      kind = "GitRepository",
      path = "/spec/ref/branch",
      value = "my-branch-name",
    )
    
    patch = YamlPatch(**AppsGitRepoBranchPatch)
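
If each preset really varies by a single value, one possible sketch is to wrap the preset in a small function that takes that value positionally, so callers have no keyword to remember and the static values live in one place. The function names below are hypothetical, treating the branch name as the variable part of the second patch is an assumption based on the question, and `YamlPatch` is repeated in compact form only so the block runs on its own.

```python
from dataclasses import dataclass


@dataclass
class YamlPatch:
    # Same fields and defaults as the dataclass above.
    namespace: str = "my-namespace"
    kind: str = "test"
    name: str = "my-name"
    op: str = "replace"
    path: str = "/test"
    value: str = "hello"

    @property
    def yaml(self) -> tuple:
        target = {"namespace": self.namespace, "kind": self.kind, "name": self.name}
        scalar = [{"op": self.op, "path": self.path, "value": self.value}]
        return (target, scalar)


def package_application_from_git(application: str) -> YamlPatch:
    """Hypothetical preset: only the application name varies."""
    return YamlPatch(
        path="/spec/path",
        name=f"application-{application}",
        value=f"/some/applications/path/{application}",
    )


def apps_git_repo_branch_patch(branch: str) -> YamlPatch:
    """Hypothetical preset: only the branch name varies (an assumption)."""
    return YamlPatch(kind="GitRepository", path="/spec/ref/branch", value=branch)


# Each preset is called by a memorable name with one positional argument.
patches = [package_application_from_git(app).yaml for app in ("app1", "app2")]
patches.append(apps_git_repo_branch_patch("my-branch-name").yaml)

print(*patches, sep="\n")
```

This keeps a single `YamlPatch` class while still giving each patch a descriptive name. Whether functions or dicts read better is a judgment call for the maintainers.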