Creating a primary key data health expectation in Palantir Foundry Code Repositories

I have a dataset that is the output of a Python transform defined in a Palantir Foundry Code Repository. It has a primary key, but because the data may change over time, I want to validate that this primary key continues to hold.

How can I create a data health expectation or check to ensure the primary key holds in the future?


  • You can define data expectations in your Python transform, for example:

    from transforms.api import transform_df, Input, Output, Check
    from transforms import expectations as E

    @transform_df(
        Output("/path/to/output", checks=[
            Check(E.primary_key("thing_id"), "primary_key: thing_id"),
        ]),
        source_df=Input("/path/to/input"),  # placeholder input path
    )
    def compute(source_df):
        return source_df.select("thing_id", "thing_name").distinct()

    More information is available in the Palantir Foundry documentation on defining data expectations.
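
To make concrete what the `E.primary_key("thing_id")` check above is guarding against, here is a minimal plain-Python sketch of the two conditions a primary key is generally expected to satisfy: no null key values and no duplicate key values. This is only an illustration of the idea, not Foundry's implementation; the helper `primary_key_violations` and the sample rows are hypothetical.

```python
from collections import Counter


def primary_key_violations(rows, key_columns):
    """Hypothetical helper: return (null_rows, duplicate_keys) for key_columns."""
    # Rows where any key column is missing or None.
    null_rows = [r for r in rows if any(r.get(c) is None for c in key_columns)]
    # Count each fully populated key tuple.
    counts = Counter(
        tuple(r.get(c) for c in key_columns)
        for r in rows
        if all(r.get(c) is not None for c in key_columns)
    )
    # Key tuples that appear more than once.
    duplicate_keys = sorted(k for k, n in counts.items() if n > 1)
    return null_rows, duplicate_keys


# Made-up sample data: thing_id 2 repeats and one thing_id is null.
rows = [
    {"thing_id": 1, "thing_name": "a"},
    {"thing_id": 2, "thing_name": "b"},
    {"thing_id": 2, "thing_name": "c"},
    {"thing_id": None, "thing_name": "d"},
]
null_rows, duplicate_keys = primary_key_violations(rows, ["thing_id"])
print(null_rows)
print(duplicate_keys)
```

Note that the sample rows are all distinct as (`thing_id`, `thing_name`) pairs, yet `thing_id` alone still repeats. That is why calling `.distinct()` on several columns does not by itself guarantee a valid primary key, and why attaching a check to the output is useful.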