amazon-s3 minio dvc

Connecting DVC to MinIO to access S3


What is the proper way to connect DVC to a MinIO instance that is itself connected to some buckets on S3?

AWS-S3(My_Bucket) > Min.io(MY_Bucket aliased as S3)

Right now I am accessing my bucket with mc, for example mc cp s3/my_bucket/datasets datasets to copy data from it. But I need to set up DVC to work with MinIO as a hub between AWS S3 and DVC, so that I can run something like "DVC mc-S3 pull" and "DVC AWS-S3 pull".

How do I go about this? While googling I couldn't find anything that I could easily follow.


Solution

  • It looks like you are looking for a combination of things.

    First, as Jorge mentioned, you can set endpointurl to access MinIO the same way as you would access regular S3:

    dvc remote add -d minio-remote s3://mybucket/path
    dvc remote modify minio-remote endpointurl https://minio.example.com                          
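
    MinIO usually has its own access keys, separate from your AWS ones. A minimal sketch of storing them for this remote, assuming the keys live in two environment variables (MINIO_ACCESS_KEY and MINIO_SECRET_KEY are hypothetical names, as is the helper function). The --local flag writes to .dvc/config.local, which stays out of Git:

```shell
# Hypothetical helper: stores MinIO credentials for the minio-remote defined above.
# MINIO_ACCESS_KEY / MINIO_SECRET_KEY are assumed environment variables you set yourself.
set_minio_credentials() {
    # --local writes to .dvc/config.local, which is not tracked by Git
    dvc remote modify --local minio-remote access_key_id "$MINIO_ACCESS_KEY"
    dvc remote modify --local minio-remote secret_access_key "$MINIO_SECRET_KEY"
}
```

    Run set_minio_credentials once from the repository root after exporting both variables.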
    

    Second, it seems you can create two remotes, one for S3 and one for MinIO, and use the -r option, which is available for many data-management commands:

    dvc pull -r minio-remote
    dvc pull -r s3-remote
    dvc push -r minio-remote
    ...
    

    This way you could push/pull data to/from a specific storage.
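
    The commands above assume a remote named s3-remote already exists. A sketch of how it could be defined, assuming it points at the same bucket path as the MinIO remote and that AWS credentials come from your usual AWS configuration (the helper function name is hypothetical):

```shell
# Hypothetical helper: registers plain AWS S3 as a second, non-default remote.
add_s3_remote() {
    # No endpointurl is set here, so DVC talks to AWS S3 directly
    dvc remote add s3-remote s3://mybucket/path
    # List all configured remotes to confirm both are present
    dvc remote list
}
```

    With both remotes in place, a plain dvc pull uses minio-remote (the default set with -d), while dvc pull -r s3-remote goes to AWS directly.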

    "But I need to setup my DVC to work with min.io as a hub between AWS.S3 and DVC"

    I think there are other possible ways to organize this. It depends on what semantics you expect from DVC mc-S3 pull. Please let us know if -r is not enough and clarify the question; that would help us here.