docker, docker-compose, mlflow

Unable to connect my MLflow project image to the MLflow server


My end goal is to run an experiment from an API.

The experiment comes from https://github.com/mlflow/mlflow/tree/master/examples/tensorflow/tf2, but I exported the files to my own Git repository, which I clone in the app image below.

I have two images in my Docker Compose setup. The project tree:

|_app/
| |_Dockerfile
|
|_mlflow/
| |_Dockerfile
|
|_docker-compose.yml

app/Dockerfile

FROM continuumio/anaconda3

ENV APP_HOME ./
WORKDIR ${APP_HOME}
RUN conda config --append channels conda-forge
RUN conda install --quiet --yes \
    'mlflow' \
    'psycopg2' \
    'tensorflow'
RUN pip install pylint
RUN pwd;ls \
&& git clone https://github.com/MChrys/QuickSign.git 
RUN pwd;ls \
    && cd QuickSign \
    && pwd;ls

COPY . .

#RUN conda install jupyter 
#CMD jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root --no-browser
CMD cd QuickSign && mlflow run .

mlflow/Dockerfile

FROM python:3.7.0

RUN pip install mlflow

RUN mkdir /mlflow/

CMD mlflow server \
    --backend-store-uri /mlflow \
    --host 0.0.0.0

docker-compose.yml

version: '3'
services:
  notebook:
    build:
      context: ./app
    ports:
      - "8888:8888"
    depends_on: 
      - mlflow
    environment: 
      MLFLOW_TRACKING_URI: 'http://mlflow:5000'
  mlflow:
    build:
      context: ./mlflow
    expose: 
      - "5000"
    ports:
      - "5000:5000"

When I run docker-compose up, I get:

notebook_1_74059cdc20ce |     response = requests.request(**kwargs)
notebook_1_74059cdc20ce |   File "/opt/conda/lib/python3.7/site-packages/requests/api.py", line 60, in request
notebook_1_74059cdc20ce |     return session.request(method=method, url=url, **kwargs)
notebook_1_74059cdc20ce |   File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
notebook_1_74059cdc20ce |     resp = self.send(prep, **send_kwargs)
notebook_1_74059cdc20ce |   File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
notebook_1_74059cdc20ce |     r = adapter.send(request, **kwargs)
notebook_1_74059cdc20ce |   File "/opt/conda/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
notebook_1_74059cdc20ce |     raise ConnectionError(e, request=request)
notebook_1_74059cdc20ce | requests.exceptions.ConnectionError: HTTPConnectionPool(host='mlflow', port=5000): Max retries exceeded with url: /api/2.0/mlflow/runs/create (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd5db4edc50>: Failed to establish a new connection: [Errno 111] Connection refused'))

The problem seems to be that I run a project that is not present in the server image, since I run it in the app image, but I don't know how to fix this. Eventually I have to trigger the experiment from a future Flask app.


Solution

  • The problem came from Docker for Windows: I was unable to get Docker Compose working on it, but the same setup builds and runs without problems on a virtual machine running Ubuntu.
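
Separately from the host OS, the "Connection refused" error in the traceback means nothing was accepting connections on mlflow:5000 when the request was sent. depends_on in docker-compose.yml only orders container startup; it does not wait for the server to be listening. A minimal sketch, assuming this could be a contributing factor, that polls the port before a run is started. The function name wait_for_server and its timeout values are hypothetical, not part of MLflow:

```python
import socket
import time
from urllib.parse import urlparse


def wait_for_server(uri, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to the host/port in `uri` succeeds,
    or False if `timeout` seconds pass without a successful connection.

    Hypothetical helper for illustration; it only checks that the port
    accepts connections, not that the MLflow REST API is healthy.
    """
    parsed = urlparse(uri)
    host = parsed.hostname
    if host is None:
        raise ValueError(f"no host found in URI: {uri!r}")
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Succeeds as soon as something is listening on host:port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers connection refused, timeouts, and DNS failures.
            time.sleep(interval)
    return False


# Example (assumed usage) before starting a run in the app container:
#   import os
#   if not wait_for_server(os.environ["MLFLOW_TRACKING_URI"]):
#       raise SystemExit("MLflow tracking server is not reachable")
```

If this returns False even after a generous timeout, the server container itself is probably not starting or not reachable on the Compose network, which would match the Docker for Windows behaviour described above.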