When FeatureTools performs deep feature synthesis, is there a way for it to record constant values it has derived?
For example, I have a dataframe with many rows like this:
| loan_id | loan_term |
|---------|:---------:|
| a | 12 |
| ... | ... |
| z | 18 |
DeepFeatureSynthesis engineers features, including `<Feature: loan_term.COUNT(loan)>`, like so:
| loan | loan_term | loan_term.COUNT(loan) |
|---------|:---------:|:---------------------:|
| a | 12 | 2000 |
| ... | ... | ... |
| z | 18 | 800 |
I would like to be able to re-engineer features from a single entity, so that a single loan term of `12` has a `loan_term.COUNT(loan)` of `2000` without having to re-count all of the `loan_term`s in the dataframe.*
I could do this by re-combining the entity with the training data and calling `ft.calculate_feature_matrix(features, my_entity_set_with_one_new_entity_added)`, but this is inefficient and slow.
Is there a way to direct FeatureTools to record constants found during deep feature synthesis, and to use them for future feature generation?
*It's not important to me right now to include the single new loan entity in the calculation, so the count for `12` does not have to become `2001`.
Unfortunately, there is no way to do this as of Featuretools v0.3.1. You can accomplish it manually by doing the following.
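Save the values of `loan_term.COUNT(loan)` from the feature matrix computed on the training data, then look them up for new rows by joining on `loan_term`. You may have to make some tweaks based on the particulars of your dataset.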
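One way to read the manual approach is as a lookup table keyed on `loan_term`. The following is a minimal sketch using plain pandas rather than any Featuretools API. The frame `training_fm` is a made-up stand-in for the matrix returned by `ft.calculate_feature_matrix` on the training data, and the names `count_by_term` and `add_cached_count` are hypothetical.

```python
import pandas as pd

# Stand-in for the feature matrix computed once on the training data.
# The values are illustrative only.
training_fm = pd.DataFrame(
    {
        "loan_term": [12, 12, 18],
        "loan_term.COUNT(loan)": [2000, 2000, 800],
    },
    index=pd.Index(["a", "b", "z"], name="loan_id"),
)

# Keep one row per distinct loan_term: this is the "recorded constant".
count_by_term = (
    training_fm.drop_duplicates("loan_term")
    .set_index("loan_term")["loan_term.COUNT(loan)"]
)


def add_cached_count(new_loans: pd.DataFrame) -> pd.DataFrame:
    """Attach the stored count to new loans without recounting the training set."""
    out = new_loans.copy()
    # Loan terms that never appeared in training come back as NaN.
    out["loan_term.COUNT(loan)"] = out["loan_term"].map(count_by_term)
    return out


new_loan = pd.DataFrame(
    {"loan_term": [12]}, index=pd.Index(["new"], name="loan_id")
)
result = add_cached_count(new_loan)
```

The stored counts reflect the training data only, which matches the footnote in the question: the new loan is not added to the count. `count_by_term` can be saved to disk (for example with `to_csv`) and reloaded whenever new loans arrive.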