python pandas oop python-dataclasses

How to combine pandas and dataclasses to import data


I'm trying to learn about classes using dataclasses:

@dataclass
class Collect:
    url : str
    def collect(url):    
        df = pd.read_csv(url)
        return df
df = Collect("national_news_scrape.csv")
df.head()

But I get this error:

AttributeError: 'Collect' object has no attribute 'head'

Not sure why.


Solution

  • The cause

    You are getting the AttributeError because your code constructs a Collect object with Collect("national_news_scrape.csv") but never calls its collect() method, which is what returns a pandas DataFrame. As a result, df is bound to a Collect instance, not to a DataFrame.

    head() is a method of pandas DataFrame objects; your Collect class does not define it.

    The fix

    I do, however, sense some confusion! I will assume that your code is a Minimal (non-)Working Example, which would explain why you are using @dataclass here when it doesn't seem necessary.

    Danila has posted some working code, though I haven't tested it myself - please consult this answer for a correction.

    Additionally, please note the change in the signature of the collect() method: it takes self and no longer requires a url positional argument, because it uses the self.url instance attribute that is set by the __init__ that @dataclass generates.
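
A minimal sketch of the corrected class described above. It assumes pandas is installed; the return type annotation is an addition for illustration. This is one possible version and is not necessarily identical to the code in the other answer.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Collect:
    # Path or URL of the CSV file; stored on the instance by the
    # __init__ that @dataclass generates.
    url: str

    def collect(self) -> pd.DataFrame:
        # Read the CSV located at self.url and return it as a DataFrame.
        # Constructing Collect(...) alone does not read anything.
        return pd.read_csv(self.url)
```

With this class, the question's last two lines become df = Collect("national_news_scrape.csv").collect() followed by df.head(), so head() is called on the DataFrame returned by collect() rather than on the Collect instance.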