python · tensorflow · machine-learning · deep-learning · transformer-model

Tensorflow custom learning rate scheduler gives unexpected EagerTensor type error


Below is my custom LR scheduler, which subclasses tensorflow.keras.optimizers.schedules.LearningRateSchedule. It raises TypeError: Cannot convert -0.5 to EagerTensor of dtype int64. I'm baffled as to why an EagerTensor is relevant to a simple inverse-square-root calculation in the return statement of this custom class.

import tensorflow

class lr_schedule(tensorflow.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, dim_embed, warmup_steps):
        self.dim_embed = dim_embed
        self.warmup_steps = warmup_steps
    def __call__(self, step):
        return (self.dim_embed ** -0.5) * min((step ** -0.5), step * (self.warmup_steps ** -1.5))

Not specifically relevant to this error, but this is a custom LR scheduler that replicates the warmup schedule used in the 'Attention Is All You Need' paper.
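For reference, the rule from the paper that this is meant to reproduce (with dim_embed standing in for the paper's d_model) is:

lrate = (d_model ** -0.5) * min(step_num ** -0.5, step_num * warmup_steps ** -1.5)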


Solution

  • I ran across this just yesterday. It's a type-coercion issue: the step value passed into __call__ is an int64 tensor, so TensorFlow tries to coerce the float exponent -0.5 to int64 as well, and that conversion is what raises the error (there's a short reproduction after the code below).

    For your specific case, this should probably fix it:

    import tensorflow

    class lr_schedule(tensorflow.keras.optimizers.schedules.LearningRateSchedule):
        def __init__(self, dim_embed, warmup_steps):
            # Keep both constants as float32 so the arithmetic below stays in floating point
            self.dim_embed = tensorflow.cast(dim_embed, dtype=tensorflow.float32)
            self.warmup_steps = tensorflow.cast(warmup_steps, dtype=tensorflow.float32)
        def __call__(self, step):
            # The optimizer passes `step` as an int64 tensor; cast it before the fractional powers
            step = tensorflow.cast(step, dtype=tensorflow.float32)
            # tensorflow.minimum instead of Python's min keeps this graph-safe if __call__ is traced
            return (self.dim_embed ** -0.5) * tensorflow.minimum(step ** -0.5, step * (self.warmup_steps ** -1.5))
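
    For what it's worth, here's a minimal sketch of where the error comes from, assuming a TF 2.x setup where the schedule receives the step counter as an int64 tensor:

    import tensorflow

    # The Python float exponent -0.5 gets coerced to the tensor's dtype (int64),
    # and that conversion is what fails.
    step = tensorflow.constant(10, dtype=tensorflow.int64)
    step ** -0.5  # TypeError: Cannot convert -0.5 to EagerTensor of dtype int64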
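
    And a quick usage sketch for wiring the fixed schedule into an optimizer. The dim_embed=512 / warmup_steps=4000 values and the Adam settings are the ones from the paper, used here purely for illustration:

    schedule = lr_schedule(dim_embed=512, warmup_steps=4000)
    optimizer = tensorflow.keras.optimizers.Adam(
        learning_rate=schedule, beta_1=0.9, beta_2=0.98, epsilon=1e-9)

    # Calling the schedule directly with an int64 step confirms the cast works
    print(schedule(tensorflow.constant(1, dtype=tensorflow.int64)))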