python python-3.x class-variables

How to pick up the correct class (NameError)


I am working on a project where I want to gather URLs. The idea is that I can import all the modules containing scraper classes, and each class should register itself in a shared registry.

I have currently done:

import sys
import tldextract


class Scraper:
    scrapers = {}

    def __init_subclass__(scraper_class):
        Scraper.scrapers[scraper_class.url] = scraper_class # .url -> Unresolved attribute reference 'url' for class 'Scraper' 

    @classmethod
    def for_url(cls, url):
        k = tldextract.extract(url)
        return scrapers[k.domain]() #<-- Unresolved reference 'scrapers' 


class BBCScraper(Scraper):
    url = 'bbc.co.uk'

    def scrape(s):
        print(s)
        # FIXME Scrape the correct values for BBC
        return "Yay works!"


url = 'https://www.bbc.co.uk/'
scraper = Scraper.for_url(url)
scraper.scrape("yay")

My current problem is that I cannot run the code, because the line return scrapers[k.domain]() fails:

Output >>> NameError: name 'scrapers' is not defined

I wonder how I can pick the correct class. For example, if the URL is for the BBC, it should go to the BBCScraper class, and then we call scrape, which will later return the values scraped from that specific website.


Solution

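  • Class attributes are not visible as bare names inside a method body, so refer to the dictionary through the class, as you did in __init_subclass__ (Scraper.scrapers), or through cls (cls.scrapers).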

    @classmethod
    def for_url(cls, url):
        k = tldextract.extract(url)
        return Scraper.scrapers[k.domain]()
        # or, equivalently:
        # return cls.scrapers[k.domain]()
    

    As for the second issue

    1. Please ask that in a separate question
    2. Please explain better what exactly you are trying to do
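
For reference, here is a minimal, self-contained sketch of the registry with the lookup going through cls.scrapers. It is an illustration under stated assumptions, not the asker's final code: to keep it runnable without third-party packages, the tldextract call is replaced by a simple hostname lookup that strips a leading "www.", and scrape is given an explicit self parameter plus a hypothetical text argument. Note that the lookup key has to match the registered url value exactly; with url = 'bbc.co.uk', a key such as 'bbc' would raise a KeyError.

```python
from urllib.parse import urlparse


class Scraper:
    # Registry shared by all subclasses, keyed by each subclass's `url`.
    scrapers = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once per subclass definition and records the subclass.
        Scraper.scrapers[cls.url] = cls

    @classmethod
    def for_url(cls, url):
        # Assumption: the key is the hostname without a leading "www.".
        # This stands in for the tldextract call used in the question.
        host = urlparse(url).hostname or ""
        key = host[4:] if host.startswith("www.") else host
        # The class attribute must be reached through `cls` (or `Scraper`);
        # a bare `scrapers` would raise NameError here.
        return cls.scrapers[key]()


class BBCScraper(Scraper):
    url = "bbc.co.uk"

    def scrape(self, text):
        # Placeholder body; real scraping logic would go here.
        return f"BBC scraper got: {text}"


scraper = Scraper.for_url("https://www.bbc.co.uk/")
result = scraper.scrape("yay")
print(type(scraper).__name__, result)  # BBCScraper BBC scraper got: yay
```

Using cls.scrapers rather than Scraper.scrapers makes no difference here, because subclasses inherit the same dictionary object. It would only matter if a subclass defined its own scrapers attribute.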