
Unable to add relationship in featuretools entity set


I'm new to Featuretools and am getting this error while creating an entity set:

Unable to add relationship because child column 'order_id' in 'orders' is also its index

I suspect that Featuretools expects a one-to-many relationship. Is there a way to specify a one-to-one relationship?


Solution

  • Yes, Featuretools generally expects a one-to-many relationship between tables in an EntitySet, which is why the child column cannot also be the index of its table.

    There is no way to override this when creating the relationship, but you can give the child DataFrame a different index column, which frees order_id to serve as the child column of the relationship.

    You could create a new index column in prejoin_foodorder when adding the table to the EntitySet: set make_index=True and pass an index name that does not already exist in the DataFrame. This creates a new integer column that runs from 0 to one less than the number of rows. That column is then used as the DataFrame's index, leaving order_id free to be the child column of the relationship.

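    import featuretools as ft

    es = ft.EntitySet()
    # ... add any other dataframes to the EntitySet ...
    # Assumes prejoin_foodorder is the variable holding the child DataFrame.
    # 'new_index' must not already be a column; make_index=True creates it.
    es.add_dataframe(dataframe=prejoin_foodorder,
                     dataframe_name='prejoin_foodorder',
                     index='new_index',
                     make_index=True)  # ... plus any other arguments you need
    es.add_relationship(parent_dataframe_name='orders',
                        parent_column_name='id',
                        child_dataframe_name='prejoin_foodorder',
                        child_column_name='order_id')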