ScrapydWeb: Connection refused within docker-compose


I tried to run several scrapyd services as a simple cluster on my localhost, but only the first node works. For the other two I get the following errors:

scrapydweb_1      | [2020-11-17 07:17:32,738] ERROR    in scrapydweb.utils.check_app_config: HTTPConnectionPool(host='scrapyd_node_3', port=6802): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb060b8ef50>: Failed to establish a new connection: [Errno 111] Connection refused'))
scrapydweb_1      | [2020-11-17 07:17:32,738] ERROR    in scrapydweb.utils.check_app_config: HTTPConnectionPool(host='scrapyd_node_2', port=6801): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb060a1e650>: Failed to establish a new connection: [Errno 111] Connection refused'))

I have the following docker-compose.yml file:

version: '3'
services:
  scrapyd_node_1:
    build:
      context: .
      dockerfile: ./crawlers/scrapyd/Dockerfile
    ports:
      - "6800:6800"
    volumes:
      - ./data:/var/lib/scrapyd
      - ./data/results:/app/results
    restart: unless-stopped

  scrapyd_node_2:
    build:
      context: .
      dockerfile: ./crawlers/scrapyd/Dockerfile
    ports:
      - "6801:6800"
    volumes:
      - ./data:/var/lib/scrapyd
      - ./data/results:/app/results
    restart: unless-stopped

  scrapyd_node_3:
    build:
      context: .
      dockerfile: ./crawlers/scrapyd/Dockerfile
    ports:
      - "6802:6800"
    volumes:
      - ./data:/var/lib/scrapyd
      - ./data/results:/app/results
    restart: unless-stopped


  scrapydweb:
    build:
      context: .
      dockerfile: ./crawlers/scrapydweb/Dockerfile
    environment:
      USERNAME: "test"
      PASSWORD: "test"
      SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6801,scrapyd_node_3:6802"
    links:
      - scrapyd_node_1
      - scrapyd_node_2
      - scrapyd_node_3
    ports:
      - "5000:5000"
    depends_on:
      - scrapyd_node_1
      - scrapyd_node_2
      - scrapyd_node_3

    restart: unless-stopped

What is wrong with my docker-compose file?


Solution

  • The problem is in this line:

    SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6801,scrapyd_node_3:6802"
    

    Try changing it to:

    SCRAPYD_SERVERS: "scrapyd_node_1:6800,scrapyd_node_2:6800,scrapyd_node_3:6800"
    

    Explanation:

    When you defined your docker service scrapyd_node_2, for instance, you set its ports to:

        ports:
          - "6801:6800"
    

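    This means that port 6800 in the container is mapped to port 6801 on your host machine. The host port only matters for traffic coming from outside Docker; containers on the same Compose network reach each other by service name and container port. Hence, when scrapydweb refers to the node by the hostname scrapyd_node_2, it should use the container port: scrapyd_node_2:6800.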
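
    To confirm which addresses are actually reachable from inside the scrapydweb container, a small probe like the one below may help. It is only a sketch: the helper names parse_servers and is_reachable are made up for illustration and are not part of ScrapydWeb, and it assumes Python 3 is available in the container (which it should be, since ScrapydWeb is a Python application).

```python
import socket
from typing import List, Tuple


def parse_servers(servers: str) -> List[Tuple[str, int]]:
    """Split a 'host:port,host:port' string into (host, port) pairs."""
    pairs = []
    for entry in servers.split(","):
        # rpartition splits on the last ':' so the port is always the tail
        host, _, port = entry.strip().rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, DNS failure, and timeouts
        return False


if __name__ == "__main__":
    # Service names and container ports from the corrected SCRAPYD_SERVERS value
    servers = "scrapyd_node_1:6800,scrapyd_node_2:6800,scrapyd_node_3:6800"
    for host, port in parse_servers(servers):
        status = "ok" if is_reachable(host, port) else "refused/unreachable"
        print(f"{host}:{port} -> {status}")
```

    Run from inside the scrapydweb container, this should report every node as ok on port 6800. Swapping in the original value with ports 6801 and 6802 should reproduce the connection failures for nodes 2 and 3, because nothing listens on those ports inside the containers.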