Correct way of updating __repr__ in Python using dataclasses and inheritance


I have the following code:

from dataclasses import MISSING, asdict, dataclass
from typing import Any
from datetime import datetime


@dataclass()
class BookMetadata():
    '''Parent class.'''
    
    isbn: str 
    title: str 
    author: str
    publisher: str
    date_published: int
    
    
    def format_time(self, unix: int) -> str:
        '''Convert unix time to %Y-%m-%d.'''
        
        return datetime.fromtimestamp(int(str(unix)[0:10])).strftime('%Y-%m-%d')
    

    def __post_init__(self):
        '''Change attributes after assignment.'''
            
        # Change date from UNIX to YYYY-MM-DD
        self.date_published = self.format_time(self.date_published)
      

@dataclass()
class RetailPrice(BookMetadata):
    '''Child class.'''
    
    def __init__(self, 
                 isbn, title, author, publisher, date_published,
                 price_usd, price_aud, price_eur, price_gbp) -> None:

        self.price_usd: float = price_usd
        self.price_aud: float = price_aud
        self.price_eur: float = price_eur
        self.price_gbp: float = price_gbp
        
        BookMetadata.__init__(self, isbn, title, author, publisher, date_published)
        # Or: super(RetailPrice, self).__init__(isbn, title, author, publisher, date_published)
        
        
    def stringify(self, obj: Any) -> str:
        '''Turn object into string.'''
        
        return str(obj)
        
        
    def __post_init__(self):
        '''Change attribute values after assignment.'''
        self.price_usd = self.stringify(self.price_usd)
        

    def __repr__(self) -> str:
        '''Update representation including parent and child class attributes.'''
        
        return f'Retailprice(isbn={super().isbn}, title={super().title}, author={super().author}, publisher={super().publisher}, date_published={super().date_published}, price_usd={self.price_usd}, price_aud={self.price_aud}, price_eur={self.price_eur}, price_gbp={self.price_gbp})'

My __repr__ method is failing with the following message: AttributeError: 'super' object has no attribute 'isbn', so I am referencing the attributes of the parent class all wrong here.

As it's possible to call the parent dataclass under the __init__ method of the child dataclass, (BookMetadata.__init__(self, isbn, title, author, publisher, date_published)), I thought that trying with super(BookMetadata, self) would work, but it failed with the same message.

How should I reference the attributes of the parent class in __repr__ within the child dataclass?


Solution

  • There's a lot wrong with that code. The field values are inconsistent with their declared types, the use of int(str(unix)[0:10]) is bizarre and likely to lead to wrong dates, RetailPrice largely abandons the use of dataclass fields in favor of writing out a bunch of __init__ arguments manually...
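
    To make the date problem concrete, here is a small sketch of what the slicing expression does. It assumes the slice was meant to trim millisecond timestamps down to seconds, which the question doesn't actually say; the helper name first_ten_digits and the sample values are made up for illustration.

```python
def first_ten_digits(unix: int) -> int:
    # Same expression as in the question's format_time
    return int(str(unix)[0:10])


ms_2020 = 1_600_000_000_000  # 13-digit millisecond timestamp (September 2020)
ms_2001 = 999_999_999_999    # 12-digit millisecond timestamp (September 2001)

print(first_ten_digits(ms_2020))  # 1600000000, same as ms_2020 // 1000
print(first_ten_digits(ms_2001))  # 9999999999, but ms_2001 // 1000 is 999999999
```

    The slice only happens to work when the input has exactly 13 digits. A 12-digit millisecond value keeps one digit too many and lands over two centuries in the future. If the input really is in milliseconds, integer division by 1000 says what you mean.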

    We'll go over this in two parts. One that just fixes the immediate issues, and one that shows what this should look like.


    Part 1: Fixing the immediate issues.

    For the __repr__ method, the most immediate issue is the attempt to access instance attributes through super(). Attributes don't work that way. An instance doesn't have separate "superclass attributes" and "child attributes"; it just has attributes, and they're all accessed through self, regardless of what class they were set in.
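
    A minimal sketch of that point, using hypothetical Parent and Child classes that aren't part of your code:

```python
from dataclasses import dataclass


@dataclass
class Parent:
    name: str  # hypothetical field, for illustration only


@dataclass
class Child(Parent):
    extra: int = 0

    def via_self(self) -> str:
        # Instance attributes live on the instance, whichever class declared them.
        return self.name

    def via_super(self) -> str:
        # super() only finds class-level attributes (methods, class variables),
        # so looking up a plain instance attribute raises AttributeError.
        return super().name


child = Child(name="Dune", extra=1)
print(vars(child))       # {'name': 'Dune', 'extra': 1}
print(child.via_self())  # Dune

try:
    child.via_super()
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

    Both attributes sit in the same instance dictionary, which is why self can see them and super() can't.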

    Your instance already has all the attributes you need. Contrary to your self-answer, you don't need to (and absolutely shouldn't) call super().__init__ or super().__post_init__ inside __repr__ to make the attributes available. You can just access them:

    def __repr__(self) -> str:
        return f'RetailPrice(isbn={self.isbn}, title={self.title}, author={self.author}, publisher={self.publisher}, date_published={self.date_published}, price_usd={self.price_usd}, price_aud={self.price_aud}, price_eur={self.price_eur}, price_gbp={self.price_gbp})'
    

    You'll see date_published as a Unix timestamp instead of a YYYY-MM-DD string, but that's because your subclass's __post_init__ doesn't call the superclass's __post_init__. Fix that:

    def __post_init__(self):
        super().__post_init__()
        self.price_usd = self.stringify(self.price_usd)
    

    and your __repr__ output will be okay.
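
    Here is a trimmed-down sketch of how those two fixes fit together. It keeps only a few of your fields, and the string built in BookMetadata.__post_init__ is a stand-in for your date conversion (an assumption, so the output doesn't depend on the local time zone).

```python
from dataclasses import dataclass


@dataclass
class BookMetadata:
    title: str
    date_published: int

    def __post_init__(self):
        # Stand-in for the Unix-to-date conversion in the question.
        self.date_published = f"converted({self.date_published})"


@dataclass
class RetailPrice(BookMetadata):
    def __init__(self, title, date_published, price_usd) -> None:
        self.price_usd = price_usd
        # The generated BookMetadata.__init__ ends by calling self.__post_init__(),
        # which resolves to RetailPrice.__post_init__ for this instance.
        super().__init__(title, date_published)

    def __post_init__(self):
        super().__post_init__()  # without this line the parent hook never runs
        self.price_usd = str(self.price_usd)

    def __repr__(self) -> str:
        return (f'RetailPrice(title={self.title}, '
                f'date_published={self.date_published}, price_usd={self.price_usd})')


book = RetailPrice('Dune', 0, 9.99)
print(book)  # RetailPrice(title=Dune, date_published=converted(0), price_usd=9.99)
```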


    Part 2: How the code should really be written.

    You didn't need to write your own __repr__. That's one of the things that the @dataclass decorator is designed to handle for you. You didn't let @dataclass do its job, though. Instead of declaring fields and letting @dataclass generate code based on the fields, you wrote things manually.

    If you just declared fields:

    @dataclass
    class BookWithRetailPrice(BookMetadata):
        price_usd: float
        price_aud: float
        price_eur: float
        price_gbp: float
    

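    @dataclass could have handled almost everything, generating __init__, a sensible __repr__, and __eq__. (With the default settings it also sets __hash__ to None, so instances are unhashable; pass frozen=True or unsafe_hash=True if you need hashing.)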
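
    As a rough illustration, here is what the generated methods give you with a cut-down parent class. The two-field BookMetadata and the sample values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class BookMetadata:
    isbn: str
    title: str


@dataclass
class BookWithRetailPrice(BookMetadata):
    price_usd: float
    price_aud: float


# Sample values are invented for illustration.
a = BookWithRetailPrice('000-0', 'Example Title', 9.99, 14.99)
b = BookWithRetailPrice(isbn='000-0', title='Example Title',
                        price_usd=9.99, price_aud=14.99)

# The generated __repr__ lists parent fields first, then child fields:
print(a)
# BookWithRetailPrice(isbn='000-0', title='Example Title', price_usd=9.99, price_aud=14.99)

# The generated __eq__ compares field values:
print(a == b)  # True
```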

    I left out the part about stringifying price_usd, because that was weird, inconsistent with the other price fields, and inconsistent with the declared type of price_usd. You can do it in __post_init__ if you really want, but it's a bad idea.

    Similarly, converting date_published to a YYYY-MM-DD string when the field is declared as an int is a bad idea. If you want a YYYY-MM-DD string, a property would probably make more sense:

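    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class BookMetadata:
        isbn: str
        title: str
        author: str
        publisher: str
        publication_timestamp: int

        @property
        def publication_date_string(self) -> str:
            # Derived on access; the stored field stays an int.
            return datetime.fromtimestamp(self.publication_timestamp).strftime('%Y-%m-%d')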
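
    A short usage sketch of the property approach, with fewer fields. One assumption that differs from the code above: it passes tz=timezone.utc so the result is the same on every machine, whereas a bare fromtimestamp call uses the local time zone.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class BookMetadata:
    title: str
    publication_timestamp: int

    @property
    def publication_date_string(self) -> str:
        # Assumption: interpret the timestamp as UTC so the output does not
        # depend on the machine's local time zone.
        return datetime.fromtimestamp(
            self.publication_timestamp, tz=timezone.utc
        ).strftime('%Y-%m-%d')


book = BookMetadata('Example Title', 86400)  # one day after the Unix epoch
print(book.publication_timestamp)    # 86400, still an int
print(book.publication_date_string)  # 1970-01-02
print(book)  # BookMetadata(title='Example Title', publication_timestamp=86400)
```

    The property isn't a field, so it stays out of the generated __init__, __repr__, and __eq__, and the declared type of publication_timestamp remains true.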
    

    You'll note that with the @dataclass-generated __repr__, you'll see quotation marks around the values of string fields, like title. The generated __repr__ uses the __repr__ of field values, instead of calling str on fields. This is useful for reducing ambiguity, especially if any of the field values have commas in them. Unless you have a really strong reason to do otherwise, you should let @dataclass write __repr__ that way.
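
    To see the ambiguity in practice, compare the two styles on a title that contains commas. The Book class, its str_style helper, and the values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    author: str

    def str_style(self) -> str:
        # Hand-written style from the question: values formatted with str()
        return f'Book(title={self.title}, author={self.author})'


book = Book('Cats, Dogs, and Birds', 'A. Writer')
print(book.str_style())  # Book(title=Cats, Dogs, and Birds, author=A. Writer)
print(repr(book))        # Book(title='Cats, Dogs, and Birds', author='A. Writer')
```

    In the first line you can't tell where the title ends; in the second the quotes make the field boundaries clear.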