
Scrapy scraping all pages in domain


I'm tearing my hair out trying to get Scrapy to look at all pages in a domain. I tried the rules = [Rule(LinkExtractor(), callback='parse_item', follow=True)] way (instead of the dragon_start function in my code) and didn't get anywhere. Now I'm trying to extract all links and iterate over that list, and that's not working either. What am I failing to do? Copilot isn't helping, and I've looked at pretty much all the other SO posts...

from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from scrapy import Request

class DvSpider(CrawlSpider):
    name = "dvspider"
    start_urls = ["https://dragonvale.fandom.com/wiki/Dragons"]
    allowed_domains = ["dragonvale.fandom.com/wiki"]

    def dragon_start(self, response):
        links = response.css('a::attr(href)').extract()
        for link in links:
            yield Request(response.urljoin(link), self.parse_item)

    def parse_item(self, response):
        if response.url[-7] == '_':
            dragon = response.css('table.dragonbox')

            if dragon:
                rows = dragon.xpath('//tr')
                yield {
                    'DragonName': rows[0].css('b::text').get().strip(),
                }

I don't get any errors, and Scrapy crawls start_urls with a 200 response. But then the spider immediately says INFO: Closing spider (finished).


Solution

  • I think you are missing the start_requests entry point. As written, nothing ever routes a response to dragon_start: a CrawlSpider with no rules fetches start_urls with its default callback, finds no rules to follow, and finishes.

    For example,

    from typing import Iterable

    from scrapy import Request
    from scrapy.spiders import CrawlSpider

    class DvSpider(CrawlSpider):
        name = "dvspider"
        start_urls = ["https://dragonvale.fandom.com/wiki/Dragons"]
        # allowed_domains must hold bare domain names; an entry with a
        # path like "dragonvale.fandom.com/wiki" never matches a request's
        # hostname, so every followed link gets dropped as offsite.
        allowed_domains = ["dragonvale.fandom.com"]

        def start_requests(self) -> Iterable[Request]:
            # Entry point: send the first request to dragon_start instead
            # of relying on CrawlSpider's default rule-based crawling.
            yield Request(self.start_urls[0], self.dragon_start)

        def dragon_start(self, response):
            # Collect every link on the listing page and queue a request
            # for each one.
            links = response.css('a::attr(href)').extract()
            for link in links:
                yield Request(response.urljoin(link), self.parse_item)

        def parse_item(self, response):
            # Individual dragon pages end in "_Dragon", so the seventh
            # character from the end of the URL is an underscore.
            if response.url[-7] == '_':
                dragon = response.css('table.dragonbox')

                if dragon:
                    # Use a relative XPath (.//tr); '//tr' would search the
                    # whole document, not just the dragonbox table.
                    rows = dragon.xpath('.//tr')

                    yield {
                        'DragonName': rows[0].css('b::text').get().strip(),
                    }

    This should call the dragon_start function and start iterating over the links from there. Note the allowed_domains change as well: Scrapy matches allowed_domains entries against hostnames, so your original entry with a path (dragonvale.fandom.com/wiki) never matches, and every request dragon_start yields is filtered as an offsite request, which is another way to end up with an immediate "Closing spider (finished)".
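
    For what it's worth, the rules = [Rule(LinkExtractor(), ...)] approach you tried first should also work once allowed_domains holds a bare domain. Here is a minimal sketch reusing your _Dragon URL heuristic; the spider name is arbitrary and I haven't run it against the live wiki:

    from scrapy.linkextractors import LinkExtractor
    from scrapy.spiders import CrawlSpider, Rule

    class DvRulesSpider(CrawlSpider):
        name = "dvrulesspider"
        start_urls = ["https://dragonvale.fandom.com/wiki/Dragons"]
        allowed_domains = ["dragonvale.fandom.com"]

        # Follow every in-domain link; parse_item decides what to keep.
        rules = [Rule(LinkExtractor(), callback='parse_item', follow=True)]

        def parse_item(self, response):
            # Same "_Dragon" suffix check as above.
            if response.url[-7] == '_':
                dragon = response.css('table.dragonbox')
                if dragon:
                    rows = dragon.xpath('.//tr')
                    yield {
                        'DragonName': rows[0].css('b::text').get().strip(),
                    }

    Either version can be run without a full project via scrapy runspider dvspider.py -o dragons.json.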