python-3.x, distributed-computing, distributed, distributed-system, ray

How can I use Python Ray across independent computers (each with its own username and password) over the internet, i.e. distributed computation by IP address?


I know the basic usage of Ray with one head node (the computer the user works on) and many worker nodes (other computers). With the newest Ray release, 0.8, this can be done by filling in a YAML file.

However, now suppose I have three independent machines, each with its own IP address, user name and password. I would like to connect to one of them and use it as the head node, with the other two as worker nodes. But I cannot find any instructions for this in the Ray documentation.

Does anyone know how to make this work on ray?


Solution

  • Using Ray on different machines that are already set up with IPs and user names is described here:

    https://ray.readthedocs.io/en/latest/using-ray-on-a-cluster.html

    So basically you need to run ray start on every node, with different parameters depending on whether the node should be the head node or a worker node.

    It is also possible to use the Ray autoscaler in this scenario; how to do that is described here: https://ray.readthedocs.io/en/latest/autoscaling.html#quick-start-private-cluster

    Let us know if you have more questions!
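
To make the ray start step above a bit more concrete, here is a minimal sketch that only builds the command each machine would run; it does not start Ray itself. The IP address and port are made-up placeholders, and the flag names (--head, --redis-port, --address) are what I believe the Ray 0.8-era CLI accepted. Later releases renamed some of them, so check ray start --help on your installed version before relying on this.

```python
# Sketch only: builds the "ray start" command lines for a manual cluster.
# Flag names are assumed from Ray 0.8-era usage; verify with `ray start --help`.
import shlex

HEAD_IP = "192.0.2.10"  # hypothetical address of the machine chosen as head
REDIS_PORT = 6379       # hypothetical port; workers must be able to reach it


def head_command(port=REDIS_PORT):
    """Command to run once, on the machine that acts as the head node."""
    return ["ray", "start", "--head", "--redis-port={}".format(port)]


def worker_command(head_ip=HEAD_IP, port=REDIS_PORT):
    """Command to run on every other machine so it joins the head node."""
    return ["ray", "start", "--address={}:{}".format(head_ip, port)]


def as_shell(argv):
    """Quote an argument list so it can be pasted into a shell or sent over SSH."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    print("head:  ", as_shell(head_command()))
    print("worker:", as_shell(worker_command()))
```

You would log in to each machine with its own user name and password (for example over SSH) and run the matching command there: the head command on one machine, the worker command on the other two. Your driver script then connects to the running cluster rather than starting a local one; in Ray 0.8 I believe that was done by passing the head node's address to ray.init, but treat that as an assumption and confirm it against the cluster documentation linked above.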