python, python-dataclasses, repr

Can a dataclass field format its value for the repr?


I have a Node class holding RGB data in both hex and HSV form. I'll be using it to sort colors in various ways, and I'd prefer the HSV tuple to stay in float form for comparisons rather than converting it from a string on every use. Is there a way to tell a dataclass field to format its value in a specific way for the repr, similar to how default_factory supplies default values, i.e. something like a repr_factory?

def RGB2HSV(r, g, b):
    '''Returns HSV values in the range H = [0, 360], S = [0, 100], V = [0, 100]'''
    r, g, b = r / 255, g / 255, b / 255
    maxRGB = max(r, g, b)
    minRGB = min(r, g, b)
    delta = maxRGB - minRGB

    V = maxRGB
    if V == 0:
        return 0, 0, V
    
    S = delta / V * 100
    if S == 0:
        return 0, S, V * 100
    
    if V == r:
        H = (g - b) / delta
    elif V == g:
        H = 2 + (b - r) / delta
    else:
        H = 4 + (r - g) / delta
    H *= 60
    if H < 0:
        H += 360
    
    return H, S, V * 100

from dataclasses import dataclass, field

@dataclass
class Node:
    r: int = field(repr=False)
    g: int = field(repr=False)
    b: int = field(repr=False)
    hex: tuple[int, int, int] = field(init=False)
    hsv: tuple[float, float, float] = field(init=False)

    def __post_init__(self):
        self.hex = self.r, self.g, self.b  # Pack the r, g, b components into a tuple
        self.hsv = RGB2HSV(*self.hex)  # Unpack the tuple; RGB2HSV returns a tuple of floats

While I'm working out the different sorts, I'm printing out the Nodes, and seeing 10 unnecessary digits of a float is distracting. Would I be better off implementing my own __repr__ for the class instead of relying on the dataclass-generated one?

The reason I'm looking at the __repr__ value is that it's automatically generated by the dataclass and makes it easier to distinguish between nearly identical colors than just looking at the visual output. It'll be easier to work out what to change or do next if I know what a color's actual numbers are. A portion of the end of the output:

Node(hex=(238, 0, 0), hsv=(0.0, 100.0, 93.33333333333333))
Node(hex=(238, 17, 0), hsv=(4.285714285714286, 100.0, 93.33333333333333))
Node(hex=(238, 34, 0), hsv=(8.571428571428571, 100.0, 93.33333333333333))
Node(hex=(238, 51, 0), hsv=(12.857142857142858, 100.0, 93.33333333333333))
Node(hex=(255, 0, 0), hsv=(0.0, 100.0, 100.0))
Node(hex=(255, 17, 0), hsv=(4.0, 100.0, 100.0))
Node(hex=(255, 34, 0), hsv=(8.0, 100.0, 100.0))
Node(hex=(255, 51, 0), hsv=(12.0, 100.0, 100.0))

Basically, can a format be specified to a dataclass field, similar to how a function can be specified to default_factory, in order for the generated __repr__ to format the field for me so I don't have to write my own?

...
    hsv: tuple[float, float, float] = field(init=False, repr_factory=lambda hsv: "({})".format(", ".join("{:.3f}".format(x) for x in hsv)))
...
Node(hex=(238, 51, 0), hsv=(12.857, 100.000, 93.333))

Solution

  • The dataclasses library currently does not support formatting fields like that. The code generated in the default __repr__ for each included field is always of the form f'field={self.field!r}'. You will have to write your own __repr__.
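
A minimal sketch of what writing your own __repr__ could look like. It relies on the documented behavior that @dataclass leaves an explicitly defined __repr__ in place. To keep the example self-contained, it uses the standard library's colorsys.rgb_to_hsv as a stand-in for the question's RGB2HSV, which is an assumption; the scaling to [0, 360] and [0, 100] is done by hand. The three-decimal format is only one possible choice.

```python
import colorsys
from dataclasses import dataclass, field


@dataclass
class Node:
    r: int = field(repr=False)
    g: int = field(repr=False)
    b: int = field(repr=False)
    hex: tuple[int, int, int] = field(init=False)
    hsv: tuple[float, float, float] = field(init=False)

    def __post_init__(self):
        self.hex = (self.r, self.g, self.b)
        # Stand-in for the question's RGB2HSV (assumption): colorsys works
        # on values in [0, 1], so scale the inputs and the results.
        h, s, v = colorsys.rgb_to_hsv(self.r / 255, self.g / 255, self.b / 255)
        self.hsv = (h * 360, s * 100, v * 100)

    def __repr__(self):
        # Only the displayed text is rounded; self.hsv keeps full precision.
        hsv = ", ".join(f"{x:.3f}" for x in self.hsv)
        return f"{type(self).__name__}(hex={self.hex}, hsv=({hsv}))"


print(Node(238, 51, 0))
```

Because the formatting happens only inside __repr__, hsv stays a tuple of floats, so sorting with something like sorted(nodes, key=lambda n: n.hsv) compares the unrounded values.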