Tags: palantir-foundry, foundry-code-repositories, foundry-data-connection, foundry-python-transform

How to access the data frame without my_compute_function


How can I use a dataset without my_compute_function? From file 1 in my repository, I want to call a function that is defined in another file. In that second file, I want to use the dataset my_input_integration, ideally without going through my_compute_function. How can I combine datasets across two different repository files? I do not want to put everything in one file, because I want to use the second file as a utility file. Any help would be appreciated.

Repository File 1

from transforms.api import transform, Input, Output


@transform(
    my_output=Output("/my/output"),
    my_input=Input("/my/input"),
)
def my_compute_function(my_input, my_output):
    return calling_function(my_input, my_output)

Repository File 2

from transforms.api import transform, Input, Output


@transform(
    my_input_integration =Input("/my/input"),
)
def calling_function(my_input, my_output, my_input_integration??)
   
    return my_output.write_dataframe(
        my_input.dataframe(),
        column_descriptions=my_dictionary
    )

Solution

  • If I understand correctly what you're trying to achieve, you can't do that directly. Every input to a transform has to be declared on that transform and then passed into the utility function; you can't "inject" inputs from elsewhere.

    So the most straightforward way to achieve what you want would be to do something like this:

    file 1:

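    from transforms.api import transform, Input, Output

    # Hypothetical module path: adjust to wherever file 2 lives in your repository.
    from myproject.datasets.utils import calling_function

    @transform(
        my_output=Output("/my/output"),
        my_input=Input("/my/input"),
        my_input_integration=Input("/my/input_integration"),
    )
    def my_compute_function(my_input, my_output, my_input_integration):
        # Every dataset is declared here and handed to the utility explicitly.
        return calling_function(my_input, my_output, my_input_integration)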
    

    file 2:

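    # Plain utility function: no @transform decorator, it only receives what file 1 passes in.
    # my_dictionary (column name -> description) is assumed to be defined elsewhere in this file.
    def calling_function(my_input, my_output, my_input_integration):
        return my_output.write_dataframe(
            my_input.dataframe(),
            column_descriptions=my_dictionary
        )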
    

    If you really need to "inject" datasets automatically, and adding them as parameters to every transform would be too cumbersome, a more sophisticated approach is possible: define a custom wrapper, applied to the transform function, that makes the input datasets available automatically. I would avoid this, though, as it adds a lot of complexity and "magic" that will be hard for newcomers and casual readers to understand.
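
For illustration only, here is a minimal sketch of what such a wrapper could look like. It runs in plain Python and does not use the Foundry API: `fake_transform` is a hypothetical stand-in for a transform-style decorator, and `COMMON_INPUTS`, `with_common_inputs`, and `transform_with_common` are made-up names. The dataset values are plain path strings here, whereas a real repository would declare `Input(...)` and `Output(...)` objects. Whether the real `transform` decorator can be wrapped this way is an assumption that would need checking.

```python
import functools

# Hypothetical shared datasets that every transform should receive.
# In a real repository the values would be Input(...) objects, not strings.
COMMON_INPUTS = {"my_input_integration": "/my/input_integration"}


def with_common_inputs(transform_decorator):
    """Wrap a transform-style decorator so COMMON_INPUTS are always declared."""
    @functools.wraps(transform_decorator)
    def wrapper(**declared):
        # Explicit declarations take precedence over the shared defaults.
        return transform_decorator(**{**COMMON_INPUTS, **declared})
    return wrapper


def fake_transform(**datasets):
    """Stand-in for a transform decorator (NOT the Foundry API).

    It records the declared datasets and passes them to the compute
    function as keyword arguments when the result is called.
    """
    def decorate(compute):
        @functools.wraps(compute)
        def run():
            return compute(**datasets)
        run.declared = datasets
        return run
    return decorate


# The wrapped decorator adds my_input_integration to every declaration.
transform_with_common = with_common_inputs(fake_transform)


@transform_with_common(my_input="/my/input", my_output="/my/output")
def my_compute_function(my_input, my_output, my_input_integration):
    # my_input_integration was never declared on this transform;
    # it comes from COMMON_INPUTS via the wrapper.
    return (my_input, my_output, my_input_integration)
```

Even in this toy version, the reader of `my_compute_function` has to look elsewhere to find out where `my_input_integration` is declared, which is the kind of "magic" the explicit version avoids.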