trains, clearml

Can ClearML (formerly Trains) work with a local server?


I am trying to get started with ClearML (formerly known as Trains).

I see in the documentation that I need to have a server running, either on the ClearML platform itself or on a remote machine using AWS etc.

I would really like to bypass this restriction and run experiments on my local machine, not connecting to any remote destination.

According to this, I can install the trains-server on any remote machine, so in theory I should also be able to install it on my local machine. However, that still requires Kubernetes or Docker, and I am not using either of them.

Has anyone had any luck using ClearML (or Trains, I think it's still pretty much the same API) on a local server?

  • My OS is Ubuntu 18.04.

Solution

  • Disclaimer: I'm a member of the ClearML team (formerly Trains)

    I would really like to bypass this restriction and run experiments on my local machine, not connecting to any remote destination.

    A few options:

    1. The ClearML free tier offers free hosting for your experiments. These experiments are only accessible to you, unless you specifically choose to share them with your colleagues. This is probably the easiest way to get started.
    2. Install the ClearML-Server. Basically, all you need is Docker installed and you should be fine. There are full instructions here; this is the summary:
    # Raise vm.max_map_count (needed by Elasticsearch) and persist it across reboots
    echo "vm.max_map_count=262144" > /tmp/99-trains.conf
    sudo mv /tmp/99-trains.conf /etc/sysctl.d/99-trains.conf
    sudo sysctl -w vm.max_map_count=262144
    sudo service docker restart
    
    # Install docker-compose
    sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    sudo chmod +x /usr/local/bin/docker-compose
    
    # Create the data, log, and config directories used by the server
    sudo mkdir -p /opt/trains/data/elastic_7
    sudo mkdir -p /opt/trains/data/mongo/db
    sudo mkdir -p /opt/trains/data/mongo/configdb
    sudo mkdir -p /opt/trains/data/redis
    sudo mkdir -p /opt/trains/logs
    sudo mkdir -p /opt/trains/config
    sudo mkdir -p /opt/trains/data/fileserver
    
    # Download the compose file and start the server in the background
    sudo curl https://raw.githubusercontent.com/allegroai/trains-server/master/docker-compose.yml -o /opt/trains/docker-compose.yml
    docker-compose -f /opt/trains/docker-compose.yml up -d
    
    3. ClearML also supports a full offline mode (i.e. no outside connection is made). Once your experiment completes, you can manually import the run into your server (either self-hosted or the free tier server):
    from clearml import Task
    Task.set_offline(True)
    task = Task.init(project_name='examples', task_name='offline mode experiment')
    

    When the process ends, the log will show the path to a zip file containing the output of the entire offline session:

    ClearML Task: Offline session stored in /home/user/.clearml/cache/offline/offline-2d061bb57d9e408a9420c4fe81e26ad0.zip
    

    Later you can import the session with:

    from clearml import Task
    Task.import_offline_session('/home/user/.clearml/cache/offline/offline-2d061bb57d9e408a9420c4fe81e26ad0.zip')
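If several offline runs have piled up, copying each path out of the log by hand gets tedious. Below is a minimal sketch of a helper that collects the session archives from a folder and imports them one by one. The names `find_offline_sessions` and `import_all` are hypothetical (they are not part of ClearML), and the `offline-*.zip` file name pattern is an assumption based on the log line above; only `Task.import_offline_session` is the ClearML call already shown in this answer.

```python
from pathlib import Path
from typing import List


def find_offline_sessions(cache_dir: Path) -> List[Path]:
    """Return the offline session archives in cache_dir, oldest first.

    Hypothetical helper: assumes archives are named 'offline-*.zip',
    as in the log line printed at the end of an offline run.
    """
    if not cache_dir.is_dir():
        return []
    archives = [p for p in cache_dir.glob("offline-*.zip") if p.is_file()]
    # Sort by modification time, falling back to the name for ties.
    return sorted(archives, key=lambda p: (p.stat().st_mtime, p.name))


def import_all(cache_dir: Path) -> None:
    """Import every archive found in cache_dir into the configured server.

    Requires clearml to be installed and a reachable server.
    """
    # Imported lazily so find_offline_sessions works without clearml.
    from clearml import Task

    for archive in find_offline_sessions(cache_dir):
        Task.import_offline_session(str(archive))
```

With the folder from the log line above, the call would be `import_all(Path.home() / ".clearml" / "cache" / "offline")`; adjust the path if your cache lives elsewhere.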