Tags: python, machine-learning, keras, loss-function

Implementing a custom objective function in Keras


I am trying to implement my own cost function, specifically the one below:

[image: the cost function, an average negative log partial likelihood (Cox model)]

Now I know this question has been asked several times on this site, and the answers I have read typically look something like this:

def custom_objective(y_true, y_pred):
    ....
    return L

where people always seem to use y_true and y_pred, and then say that you just have to compile the model with model.compile(loss=custom_objective) and go from there. No one really mentions where in the code y_true and y_pred actually get set to something. Is that something I have to specify in my model?
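For illustration, the full pattern those answers describe usually amounts to something like the sketch below (the mean-squared-error body is just a placeholder, not the loss I actually want):

from keras import backend as K

def custom_objective(y_true, y_pred):
    # y_true and y_pred are tensors that Keras supplies for each training batch
    return K.mean(K.square(y_pred - y_true), axis=-1)

model.compile(loss=custom_objective, optimizer='adam')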

My code

I am not sure whether I am using .predict() correctly to get the running predictions from the model while it is training:

params = {'lr': 0.0001,
 'batch_size': 30,
 'epochs': 400,
 'dropout': 0.2,
 'optimizer': 'adam',
 'losses': 'avg_partial_likelihood',
 'activation':'relu',
 'last_activation': 'linear'}

import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, BatchNormalization
from keras.regularizers import l2

def model(x_train, y_train, x_val, y_val):

    l2_reg = 0.4
    momentum = 0.99  # assumed value; `momentum` was not defined in the original snippet
    kernel_init = 'he_uniform'
    bias_init = 'he_uniform'
    layers = [20, 20, 1]

    model = Sequential()

    # layer 1
    model.add(Dense(layers[0], input_dim=x_train.shape[1],
                    kernel_regularizer=l2(l2_reg),
                    kernel_initializer=kernel_init,
                    bias_initializer=bias_init))

    model.add(BatchNormalization(axis=-1, momentum=momentum, center=True))

    model.add(Activation(params['activation']))

    model.add(Dropout(params['dropout']))

    # hidden layers 2+
    for units in layers[1:-1]:

        model.add(Dense(units, kernel_regularizer=l2(l2_reg),
                        kernel_initializer=kernel_init,
                        bias_initializer=bias_init))

        model.add(BatchNormalization(axis=-1, momentum=momentum, center=True))

        model.add(Activation(params['activation']))

        model.add(Dropout(params['dropout']))

    # Last layer
    model.add(Dense(layers[-1], activation=params['last_activation'],
                    kernel_initializer=kernel_init,
                    bias_initializer=bias_init))

    # pass the loss function object itself; a custom loss cannot be resolved from its string name
    model.compile(loss=avg_partial_likelihood,
                  optimizer=keras.optimizers.Adam(lr=params['lr']),
                  metrics=['accuracy'])

    history = model.fit(x_train, y_train,
                        validation_data=(x_val, y_val),
                        batch_size=params['batch_size'],
                        epochs=params['epochs'],
                        verbose=1)

    y_pred = model.predict(x_train, batch_size=params['batch_size'])

    history_dict = history.history

    model_output = {'model':model, 
                    'history_dict':history_dict,
                    'log_risk':y_pred}

    return model_output

Then I create the model:

model(x_train, y_train, x_val, y_val)

My objective function thus far

My thinking is that 'log_risk' would be y_true, and x_train would be used to calculate y_pred:

def avg_partial_likelihood(x_train, log_risk):

    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    cph = CoxPHFitter()

    cph.fit(x_train, duration_col='survival_fu_combine', event_col='death',
            show_progress=False)

    # obtain exp(hx)
    cph_output = pd.DataFrame(cph.summary).T

    # summing hazard ratio
    hazard_ratio_sum = cph_output.iloc[1,].sum()

    # -log(sum(exp(hxj)))
    neg_log_sum = -np.log(hazard_ratio_sum)

    # sum of positive events (death==1)
    sum_noncensored_events = (x_train.death==1).sum()

    # neg_likelihood
    neg_likelihood = -(log_risk + neg_log_sum)/sum_noncensored_events

    return neg_likelihood

Error when I try to run

  AttributeError                            Traceback (most recent call last)
<ipython-input-26-cf0236299ad5> in <module>()
----> 1 model(x_train, y_train, x_val, y_val)

<ipython-input-25-d0f9409c831a> in model(x_train, y_train, x_val, y_val)
     45     model.compile(loss=avg_partial_likelihood,
     46                   optimizer=keras.optimizers.adam(lr=params['lr']),
---> 47                   metrics=['accuracy'])
     48 
     49     history = model.fit(x_train, y_train, 

~\Anaconda3\lib\site-packages\keras\engine\training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, weighted_metrics, target_tensors, **kwargs)
    331                 with K.name_scope(self.output_names[i] + '_loss'):
    332                     output_loss = weighted_loss(y_true, y_pred,
--> 333                                                 sample_weight, mask)
    334                 if len(self.outputs) > 1:
    335                     self.metrics_tensors.append(output_loss)

~\Anaconda3\lib\site-packages\keras\engine\training_utils.py in weighted(y_true, y_pred, weights, mask)
    401         """
    402         # score_array has ndim >= 2
--> 403         score_array = fn(y_true, y_pred)
    404         if mask is not None:
    405             # Cast the mask to floatX to avoid float64 upcasting in Theano

<ipython-input-23-ed57799a1f9d> in avg_partial_likelihood(x_train, log_risk)
     27 
     28     cph.fit(x_train, duration_col='survival_fu_combine', event_col='death',
---> 29            show_progress=False)
     30 
     31     # obtain exp(hx)

~\Anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in fit(self, df, duration_col, event_col, show_progress, initial_beta, strata, step_size, weights_col)
     90         """
     91 
---> 92         df = df.copy()
     93 
     94         # Sort on time

AttributeError: 'Tensor' object has no attribute 'copy'

Solution

  • No one really mentions where in the code y_true and y_pred actually get set to something ...

    They don't mention it because you don't need to do that! At the end of each pass (i.e. forward propagation on one batch), Keras feeds y_true and y_pred with the true labels and the model's predictions for that batch, so you never have to define y_true and y_pred in your model at all.

    Just define your loss function using the backend functions (i.e. from keras import backend as K) and everything will work fine. Never use numpy (or pandas/lifelines, as in your attempt) inside a loss function, because the arguments are symbolic tensors, not arrays or DataFrames; that is exactly what the 'Tensor' object has no attribute 'copy' error is telling you. To get an idea, take a look at the built-in loss functions in Keras and see how they are implemented, and at the list of available backend functions in the Keras backend documentation.
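    For example, a rough sketch of an average negative log partial likelihood written only with backend ops might look like the following. This is an illustration of the backend-only pattern, not a drop-in replacement for your lifelines code: it assumes each batch is sorted by descending survival time (so the risk set of row i is rows 0..i), that y_true holds the event indicator (1 = death, 0 = censored), and that y_pred is the model's predicted log-risk:

from keras import backend as K

def avg_partial_likelihood(y_true, y_pred):
    # y_true: event indicator, shape (batch_size, 1)
    # y_pred: predicted log-risk h(x), shape (batch_size, 1)
    event = K.flatten(y_true)
    log_risk = K.flatten(y_pred)

    # running sum of exp(log_risk) approximates the sum over each risk set,
    # provided the batch is sorted by descending survival time
    log_cum_hazard = K.log(K.cumsum(K.exp(log_risk)))

    # partial log-likelihood contribution of each uncensored sample
    partial_ll = (log_risk - log_cum_hazard) * event

    # negative mean over the uncensored events
    return -K.sum(partial_ll) / (K.sum(event) + K.epsilon())

    Note that x_train never appears inside the loss: the covariates enter only through the model's predictions (y_pred), and y_train is what Keras feeds in as y_true, so it has to carry the event indicator.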