What is the relationship between DefaultInfo and PyInfo?


It's not clear to me what the difference is between the transitive_files of DefaultInfo's runfiles and PyInfo's transitive_sources. Are they redundant, or is there an important difference?

For example, I have a custom Starlark rule that should provide PyInfo, but because I also want to return an additional provider, I can't use the native py_library rule.

    # Source depsets propagated by each dependency.
    transitive_sources = [dep[PyInfo].transitive_sources for dep in ctx.attr.deps]

    # Providers are returned as a plain list; struct(providers = ...) is the legacy form.
    return [
        DefaultInfo(
            files = depset(sources + outs),
            runfiles = ctx.runfiles(
                files = sources + outs,
                # transitive_files takes a single depset, not a list of depsets.
                transitive_files = depset(transitive = transitive_sources),
            ),
        ),
        PyInfo(
            transitive_sources = depset(direct = sources + outs, transitive = transitive_sources),
            imports = depset(
                direct = [_path_join(ctx.workspace_name, ctx.label.package, im) for im in ctx.attr.imports],
                transitive = [dep[PyInfo].imports for dep in ctx.attr.deps],
            ),
        ),
        _EggLibraryInfo(aditional_info = "other stuff"),
    ]

I'm creating redundant depsets to satisfy these providers, which makes me think maybe I'm doing it wrong.

I have also tried another method of looping over all the default_runfiles of the deps, and using runfiles.merge for DefaultInfo. For simple cases, these methods appear equivalent, but I don't know if there are other scenarios where the approaches would diverge.
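
For reference, the second approach could be sketched roughly as follows. This is only a minimal sketch: it assumes `sources` and `outs` are lists of File objects as in the snippet above, and the helper name `_merged_runfiles` is hypothetical.

```starlark
def _merged_runfiles(ctx, sources, outs):
    # Start with the files this target contributes directly.
    runfiles = ctx.runfiles(files = sources + outs)

    # Fold in the default runfiles of every dependency.
    # runfiles.merge returns a new runfiles object, so reassign each time.
    for dep in ctx.attr.deps:
        runfiles = runfiles.merge(dep[DefaultInfo].default_runfiles)
    return runfiles
```

The result would then be passed as `runfiles = _merged_runfiles(ctx, sources, outs)` when constructing DefaultInfo.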

The PyInfo documentation could use a section on how transitive_sources fits into DefaultInfo, and why additional mechanisms outside of runfiles need to be provided. https://docs.bazel.build/versions/master/skylark/lib/PyInfo.html


Solution

  • DefaultInfo is a provider that Bazel itself understands:

    • files controls which files are built when you run bazel build on the target,
    • runfiles defines which files must be available in the runfiles tree (and therefore in the sandbox) when the target is executed.

    PyInfo is used only by the Python rules, which use it to propagate metadata (such as transitive sources and import paths) to consuming targets.

    My guess is that the duplication is necessary because the two sets of values may differ. Removing it would mean that either Bazel doesn't build or include the right files, or consuming Python rules miss information they need.
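
    To illustrate how the two providers can carry different values, here is a minimal sketch of a rule. It assumes hypothetical `srcs`, `data`, and `deps` attributes, a made-up rule name `egg_library`, and a `PyInfo` symbol that is available as a built-in (newer Bazel versions may require loading it from rules_python instead). Data files end up only in the runfiles, while `transitive_sources` carries just the Python sources; the source depset is built once and shared by both providers.

```starlark
def _egg_library_impl(ctx):
    # Attribute names (srcs, data, deps) are assumptions for this sketch.
    srcs = ctx.files.srcs
    deps = ctx.attr.deps

    # Build the source depset once and reuse it for both providers.
    transitive_sources = depset(
        direct = srcs,
        transitive = [dep[PyInfo].transitive_sources for dep in deps],
    )

    # Runfiles: Python sources plus data files that PyInfo does not carry.
    runfiles = ctx.runfiles(
        files = ctx.files.data,
        transitive_files = transitive_sources,
    )

    # Also pick up whatever each dependency needs at run time.
    for dep in deps:
        runfiles = runfiles.merge(dep[DefaultInfo].default_runfiles)

    return [
        DefaultInfo(files = depset(srcs), runfiles = runfiles),
        PyInfo(
            transitive_sources = transitive_sources,
            imports = depset(transitive = [dep[PyInfo].imports for dep in deps]),
        ),
    ]

egg_library = rule(
    implementation = _egg_library_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = [".py"]),
        "data": attr.label_list(allow_files = True),
        "deps": attr.label_list(providers = [PyInfo]),
    },
)
```

    In this sketch the only duplication left is passing the same depset object to both providers, which is cheap because depsets are shared by reference rather than copied.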