
How to use Bazel's py_library imports argument


I'm trying to use Boto3 in a Bazel-built project but can't seem to get the correct import for the library. Because of how the Boto git repositories are laid out, all the sources live in folders named botocore and boto3 at the root of each repository. As a result, the imports all end up as boto3.boto3, with the first component corresponding to the name of the external dependency and the second being the root folder in which the sources reside. How do I use the imports attribute of the py_binary and py_library rules to import from the inner boto3 instead of the outer one?

This is what my workspace looks like:

//WORKSPACE

_BOTOCORE_BUILD_FILE = """

py_library(
    name = "botocore",
    srcs = glob([ "botocore/**/*.py" ]),
    imports = [ "botocore" ],
    visibility = [ "//visibility:public" ],
)

"""

_BOTO3_BUILD_FILE = """

py_library(
    name = "boto3",
    srcs = glob([ "boto3/**/*.py" ]),
    imports = [ "boto3" ],
    deps = [ "@botocore//:botocore" ],
    visibility = [ "//visibility:public" ],
)

"""

new_git_repository(
    name = "botocore",
    commit = "cc3da098d06392c332a60427ff434aa51ba31699",
    remote = "https://github.com/boto/botocore.git",
    build_file_content = _BOTOCORE_BUILD_FILE,
)

new_git_repository(
    name = "boto3",
    commit = "8227503d7b1322b45052a16b197ac41fedd634e9", # 1.4.4
    remote = "https://github.com/boto/boto3.git",
    build_file_content = _BOTO3_BUILD_FILE,
)

//BUILD

py_binary(
    name = "example",
    srcs = [ "example.py" ],
    deps = [
        "@boto3//:boto3",
    ],
)

//example.py

import boto3

boto3.client('')

Checking the contents of the build folder:

$ ls bazel-bin/example.runfiles/*
bazel-bin/example.runfiles/__init__.py bazel-bin/example.runfiles/MANIFEST

bazel-bin/example.runfiles/boto3:
boto3  __init__.py

bazel-bin/example.runfiles/botocore:
botocore  __init__.py

When I try to run the example script, I get AttributeError: 'module' object has no attribute 'client'. I can import boto3.boto3, but then using anything in it fails with missing dependencies such as boto3.sessions, because everything is nested under <target-name>.boto3.


Solution

  • I think you're on the right track, but you're running into a subtle problem caused by the ordering of Python's sys.path.

    If I run your example and print out sys.path in example.py, I see that the path contains in order:

    bazel-out/local-fastbuild/bin/example.runfiles
    bazel-out/local-fastbuild/bin/example.runfiles/boto3/boto3
    bazel-out/local-fastbuild/bin/example.runfiles/boto3
    

    The second line is due to the imports = ['boto3'] in your WORKSPACE file.

    I think you want the third line to be where import boto3 is resolved from, because you want Python to see bazel-out/local-fastbuild/bin/example.runfiles/boto3/boto3/__init__.py.

    However, when Python evaluates import boto3, it searches sys.path in order. It finds bazel-out/local-fastbuild/bin/example.runfiles/boto3/__init__.py via the first entry and uses that, instead of bazel-out/local-fastbuild/bin/example.runfiles/boto3/boto3/__init__.py via the third entry. The module it loads is therefore the empty package for the repository directory, which is why boto3.client does not exist.
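
The shadowing described above can be reproduced outside Bazel. The following is only a minimal sketch: it builds a temporary directory that stands in for example.runfiles (the make_package helper and the temporary layout are assumptions for illustration, not anything Bazel provides) and asks Python's standard path finder which file import boto3 would pick for the same search order.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def make_package(directory: Path) -> Path:
    """Create `directory` with an empty __init__.py and return that file."""
    directory.mkdir(parents=True, exist_ok=True)
    init_file = directory / "__init__.py"
    init_file.touch()
    return init_file


# Hypothetical stand-in for bazel-bin/example.runfiles.
runfiles = Path(tempfile.mkdtemp())
outer_init = make_package(runfiles / "boto3")            # repository directory
inner_init = make_package(runfiles / "boto3" / "boto3")  # the real package

# Same order as the sys.path entries listed in the answer.
search_path = [
    str(runfiles),
    str(runfiles / "boto3" / "boto3"),
    str(runfiles / "boto3"),
]

importlib.invalidate_caches()
spec = PathFinder.find_spec("boto3", search_path)
resolved = Path(spec.origin).resolve()

# The first matching entry wins, so the outer (empty) package is chosen.
print("import boto3 would load:", resolved.relative_to(runfiles.resolve()))
```

Because the search stops at the first entry that contains something named boto3, the later entries are never consulted, whatever they contain.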

    I think the answer here is to give your external repository (the "workspace") a name that differs from the directory it contains, so the runfiles root no longer holds a directory called boto3. For example:

    # WORKSPACE
    new_git_repository(
      name = "boto3_archive",
      commit = "8227503d7b1322b45052a16b197ac41fedd634e9", # 1.4.4
      remote = "https://github.com/boto/boto3.git",
      build_file_content = _BOTO3_BUILD_FILE,
    )
    
    # BUILD
    py_binary(
      name = "example",
      srcs = [ "example.py" ],
      deps = [
        "@boto3_archive//:boto3",
      ],
    )
    

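    When I do this in your example, I get the following error: ImportError: No module named dateutil.parser. I think that is progress, because the boto3 package itself is now found and the failure has moved on to a missing third-party dependency.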
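
To see why the rename helps, the same lookup can be simulated for the renamed layout. This is a sketch under assumptions: the temporary directory stands in for example.runfiles, the make_package helper is hypothetical, and the search order is assumed to mirror the original one with boto3_archive substituted for the repository name.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def make_package(directory: Path) -> Path:
    """Create `directory` with an empty __init__.py and return that file."""
    directory.mkdir(parents=True, exist_ok=True)
    init_file = directory / "__init__.py"
    init_file.touch()
    return init_file


# Hypothetical stand-in for example.runfiles after renaming the repository.
runfiles = Path(tempfile.mkdtemp())
make_package(runfiles / "boto3_archive")                         # repository directory
inner_init = make_package(runfiles / "boto3_archive" / "boto3")  # the real package

# Assumed search order: runfiles root, the `imports` entry, the repository root.
search_path = [
    str(runfiles),
    str(runfiles / "boto3_archive" / "boto3"),
    str(runfiles / "boto3_archive"),
]

importlib.invalidate_caches()
spec = PathFinder.find_spec("boto3", search_path)
resolved = Path(spec.origin).resolve()

# Nothing named boto3 sits in the runfiles root any more, so the lookup
# falls through to the repository root and finds the inner package.
print("import boto3 would load:", resolved.relative_to(runfiles.resolve()))
```

The same reasoning suggests the botocore repository may need a similar rename, since its runfiles directory shares its name with the package it contains.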