python deep-learning feature-engineering featuretools

AttributeError: Cutoff time DataFrame must contain a column with either the same name as the target dataframe index or a column named "instance_id"


I'm learning how to use Featuretools by following this tutorial, and I've reached the snippet below:

from featuretools.tsfresh import CidCe
import featuretools as ft

fm, features = ft.dfs(
    entityset=es,
    target_dataframe_name='RUL data',
    agg_primitives=['last', 'max'],
    trans_primitives=[],
    chunk_size=.26,
    cutoff_time=cutoff_time_list[0],
    max_depth=3,
    verbose=True,
)

fm.to_csv('advanced_fm.csv')
fm.head()

The variable es holds the dataset:

[image: the EntitySet es]

Then, the cutoff_time_list[0] has a table which looks like this:

[image: the cutoff_time_list[0] table]

However, I get this error: AttributeError: Cutoff time DataFrame must contain a column with either the same name as the target dataframe index or a column named "instance_id". This happens even though the dataframe and the cutoff table both have an "engine_no" column. What causes this? I'm using featuretools version 1.27.0.


Solution

  • The error message here is pointing you in the right direction. It says that the cutoff time dataframe must have a column with the same name as the index column of the target dataframe (or a column named instance_id). In your example, you set the target dataframe to RUL data, but the index of that dataframe is the column named index. Since the cutoff time dataframe has no column named index, you get this error. Sharing an engine_no column is not enough, because engine_no is not the index of RUL data.

    Based on the example notebook you have linked, I think you really want to have target_dataframe_name set to the normalRUL dataframe, which has its index set as engine_no.
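    The rule stated in the error message can be sketched in plain Python. This is only an illustration of the check as the message describes it, not the featuretools source: find_instance_column is a hypothetical helper, and the cutoff table's column names other than engine_no are assumptions, since the table is only shown as an image.

```python
def find_instance_column(target_index, cutoff_columns):
    """Return the cutoff-time column that identifies instances, or None.

    Hypothetical helper mirroring the rule in the error message: the
    cutoff table needs a column named "instance_id" or a column named
    after the target dataframe's index.
    """
    if "instance_id" in cutoff_columns:
        return "instance_id"
    if target_index in cutoff_columns:
        return target_index
    return None


# Assumed column names for the cutoff table; only "engine_no" is
# confirmed by the question.
cutoff_columns = ["engine_no", "cutoff_time"]

# Target "RUL data" has its index set to "index": no matching column.
print(find_instance_column("index", cutoff_columns))

# Target "normalRUL" has its index set to "engine_no": a match.
print(find_instance_column("engine_no", cutoff_columns))
```

    To run the same comparison on your real objects, you would pass the target dataframe's index name and list(cutoff_time_list[0].columns). In featuretools 1.x the index name should be available as es['RUL data'].ww.index, though treat that accessor as an assumption and verify it against your installed version.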