python tensorflow machine-learning feature-extraction tensorflow-estimator

How to create a tf.feature_column by multiplying two other tf.feature_columns?


In TensorFlow there is already a function for creating a feature by crossing columns, tf.feature_column.crossed_column, but it is intended for categorical data. What about numeric data?

For example, suppose two columns already exist:

age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")

I want to create third and fourth feature columns based on age and education_num, like this:

my_feature = age * education_num
my_another_feature = age * age

How can it be done?


Solution

  • You can declare a custom numeric column and compute its values in the DataFrame inside your input function:

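    import pandas as pd
    import tensorflow as tf  # TF 1.x API; under TF 2.x use tf.compat.v1.estimator.inputs

    # Existing features
    age = tf.feature_column.numeric_column("age")
    education_num = tf.feature_column.numeric_column("education_num")
    # Declare the derived columns just like any other numeric column
    my_feature = tf.feature_column.numeric_column("my_feature")
    my_another_feature = tf.feature_column.numeric_column("my_another_feature")
    
    ...
    # Add to the list of features
    feature_columns = [ ... age, education_num, my_feature, my_another_feature, ... ]
    
    ...
    def input_fn():
      df_data = pd.read_csv("input.csv")
      df_data = df_data.dropna(how="any", axis=0)
      # Compute the derived features in the dataframe itself
      df_data["my_feature"] = df_data["age"] * df_data["education_num"]
      df_data["my_another_feature"] = df_data["age"] * df_data["age"]
    
      # `labels` is not defined in this snippet; it must be a pandas Series
      # whose index matches df_data (i.e. after the dropna above).
      return tf.estimator.inputs.pandas_input_fn(x=df_data,
                                                 y=labels,
                                                 batch_size=100,
                                                 num_epochs=10,
                                                 shuffle=True)  # shuffle must be set explicitly
    
    ...
    # input_fn() returns the function that the estimator will call
    model.train(input_fn=input_fn())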
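If several product features are needed, the dataframe step can be factored into a small helper. The following is only a sketch of that idea, not part of the original answer: the function name add_product_features and its products argument are hypothetical, and the toy DataFrame stands in for the rows read from input.csv.

```python
import pandas as pd


def add_product_features(df, products):
    """Return a copy of df with one new column per entry in products.

    products maps a new column name to a (left, right) pair of existing
    column names; the new column is their element-wise product.
    """
    out = df.copy()
    for new_name, (left, right) in products.items():
        out[new_name] = out[left] * out[right]
    return out


# Toy data standing in for the rows read from input.csv (assumption).
df_data = pd.DataFrame({"age": [25, 40], "education_num": [10, 13]})
df_data = add_product_features(
    df_data,
    {
        "my_feature": ("age", "education_num"),
        "my_another_feature": ("age", "age"),
    },
)
print(df_data)
```

Each new column name still needs a matching tf.feature_column.numeric_column declaration so that the estimator picks it up.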