Tags: python, keras, loss-function, quantile, risk-analysis

(Custom) Percentile MSE Loss function


I have a Keras model that has inputs x_1,...,x_n and d-dimensional outputs f(x_1),...,f(x_n). I'm working on a regression problem with d-dimensional targets y_1,...,y_n.

I would like to minimize the following loss function: for a fixed meta-parameter a between 0 and 1, return the a-th (empirical) quantile of the squared errors |f(x_i) - y_i|².
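
In NumPy terms, the quantity I want to minimize would look roughly like this (a minimal sketch, not my Keras code; the name percentile_se is just for illustration):

import numpy as np

def percentile_se(y_true, y_pred, a):
    # squared Euclidean errors |f(x_i) - y_i|^2, one per sample
    ses = np.sum((y_true - y_pred) ** 2, axis=-1)
    # empirical a-th quantile of those squared errors, 0 <= a <= 1
    return np.quantile(ses, a)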

Here is what I have coded so far:

import keras.backend as K
import tensorflow_probability as tfp

def keras_custom_loss(y_true, y_predicted):
    SEs = K.square(y_true - y_predicted)  # per-element squared errors
    # median (50th percentile) of the squared errors
    out_custom = tfp.stats.percentile(SEs, 50.0, interpolation='midpoint')
    return out_custom

One issue is that I'd like to avoid using tensorflow_probability; I would like the entire implementation done in Keras.

However, I can't figure out how.


Solution

  • To take the mean over all elements above a given percentile, you can select them with tf.math.top_k:

    import keras.backend as K
    from keras.layers import *
    from keras.models import Model
    import numpy as np
    import tensorflow as tf
    
    def above_percentile(x, p): #assuming the input is flattened: (n,)
    
        samples = K.cast(K.shape(x)[0], K.floatx()) #batch size
        p =  (100. - p)/100.  #100% will return 0 elements, 0% will return all elements
    
        #samples to get:
        samples = K.cast(tf.math.floor(p * samples), 'int32')
            #you could use tf.math.ceil above instead; it depends on whether you
            #want to include or exclude one element. Suppose you want the top 33%,
            #but it's only possible to get exactly the top 30% or the top 40%:
            #floor will take the top 30% and ceil will take the top 40%.
            #(exact matches are included in both cases)
    
        #select the top `samples` squared errors
        values, _ = tf.math.top_k(x, samples)
    
        return values
    
    def custom_loss(p):
        def loss(y_true, y_predicted):
            ses = K.square(y_true-y_predicted)
            above = above_percentile(K.flatten(ses), p)
            return K.mean(above)
        return loss
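
    A caveat worth noting (my addition, not in the original answer): if p is large and the batch is small, floor(p * samples) can be 0, and K.mean over zero elements returns NaN. A hedged variant that always keeps at least one element:

    def above_percentile_safe(x, p):
        #same idea as above_percentile, but never selects zero elements
        n = K.cast(K.shape(x)[0], K.floatx())
        k = K.cast(tf.math.floor((100. - p) / 100. * n), 'int32')
        k = K.maximum(k, 1) #avoid top_k with k=0, which would make K.mean NaN
        values, _ = tf.math.top_k(x, k)
        return values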
    

    Test:

    dataX = np.array([2,3,1,4,7,10,8,5,6]).reshape((-1,1))
    dataY = np.ones((9,1))
    
    
    ins = Input((1,))
    outs = Lambda(lambda x: x)(ins)
    model = Model(ins, outs)
    
    model.compile(optimizer='adam', loss = custom_loss(70.))
    model.fit(dataX, dataY)
    

    Loss will be 65, which is 130/2 (the mean), where 130 = (10-1)² + (8-1)²; 10 and 8 are the two inputs selected by top_k.
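
    And if you want exactly what the question asked for (the a-th quantile of the squared errors, without tensorflow_probability), here is a sketch using tf.sort and tf.gather; the name quantile_loss is illustrative, and this uses 'lower' interpolation rather than 'midpoint':

    def quantile_loss(a): #a between 0 and 1
        def loss(y_true, y_predicted):
            ses = K.flatten(K.square(y_true - y_predicted))
            n = K.cast(K.shape(ses)[0], K.floatx())
            #index of the empirical a-th quantile ('lower' interpolation)
            idx = K.cast(tf.math.floor(a * (n - 1.)), 'int32')
            return tf.gather(tf.sort(ses), idx)
        return loss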