
How do I read and write to the same dataset in a Code Repository?


How can I read from and write to the same dataset in a transform? I have two input datasets (input_ds1 and input_ds2). When I set the output to the path of one of them (e.g. dataset2 in the code below), the checks fail with a cyclical dependency error.

Below I attached an example:

    from transforms.api import transform, Input, Output

    # Fails checks: dataset2 is declared as both an input and the output.
    @transform(
        input_ds1=Input('/Other Namespace/Other/Foundry_support_test/dataset1'),
        input_ds2=Input('/Other Namespace/Other/Foundry_support_test/dataset2'),
        output=Output('/Other Namespace/Other/Foundry_support_test/dataset2'),
    )
    def compute(input_ds1, input_ds2, output):
        ...

Solution

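  • It is possible to both read and write the contents of the output dataset by using the @incremental() decorator. With it you can read the previous version of a dataset, including the output, through the output object itself. Because dataset2 is then no longer declared as an input, the cyclical dependency error goes away.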

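     from transforms.api import transform, incremental, Input, Output

     @incremental()
     @transform(
         input_ds1=Input('/Other Namespace/Other/Foundry_support_test/dataset1'),
         output=Output('/Other Namespace/Other/Foundry_support_test/dataset2'),
     )
     def compute(input_ds1, output):
         # Read what was previously written to dataset2 instead of declaring it as an input.
         # Note: 'previous' mode may also require a schema argument; see the incremental reference.
         input_ds2 = output.dataframe('previous')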
    

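    Incremental transforms are designed primarily for other use cases (processing only the data added since the last build), but reading the previous output is one of the features they provide. More details are in the incremental documentation: https://www.palantir.com/docs/foundry/transforms-python/incremental-reference/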
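
To see why the first attempt fails while the incremental version passes, it can help to model the lineage as a small directed graph. The following is a minimal sketch in plain Python, not Foundry's actual check: `has_cycle` and `build_graph` are hypothetical helpers, and the node name `compute` just stands in for the transform.

```python
from typing import Dict, List


def has_cycle(edges: Dict[str, List[str]]) -> bool:
    """Return True if the directed graph (node -> downstream nodes) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {}

    def visit(node: str) -> bool:
        state[node] = GREY
        for nxt in edges.get(node, []):
            s = state.get(nxt, WHITE)
            if s == GREY:  # reached a node already on the current path
                return True
            if s == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state.get(n, WHITE) == WHITE and visit(n) for n in list(edges))


def build_graph(inputs: List[str], output: str) -> Dict[str, List[str]]:
    """Toy lineage: each declared input feeds the transform, which feeds its output."""
    graph: Dict[str, List[str]] = {name: ["compute"] for name in inputs}
    graph.setdefault("compute", []).append(output)
    return graph


# Original attempt: dataset2 is declared as both an Input and the Output.
failing = build_graph(["dataset1", "dataset2"], "dataset2")

# Incremental version: only dataset1 is a declared Input. The previous contents
# of dataset2 are read through the output object, so no input edge is added.
passing = build_graph(["dataset1"], "dataset2")

print(has_cycle(failing))  # True
print(has_cycle(passing))  # False
```

In this toy model the cycle comes only from declaring dataset2 on both sides of the transform, which is what the @incremental() approach avoids.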