Tags: tensorflow, deep-learning, quantization, tensorflow-lite

How does tflite fuse Relu into conv layers?


When converting a TF model to a tflite model (or, in other words, quantizing a model using "post-training quantization"), the Relu layers disappear from the graph. This is explained in the documentation: "operations that can be simply removed from the graph (tf.identity), replaced by tensors (tf.placeholder), or fused into more complex operations (tf.nn.bias_add)."
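
For reference, the conversion I mean is roughly the standard flow below (the toy model and converter options are just placeholders, not my actual setup):

    import tensorflow as tf

    # A toy model standing in for the real one (any conv + relu model will do).
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 3)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])

    # Standard post-training quantization flow.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    with open("model.tflite", "wb") as f:
        f.write(tflite_model)

    # Inspecting model.tflite (e.g. in Netron) shows a single CONV_2D op whose
    # fused_activation_function is RELU -- there is no separate Relu op.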

My question is: how can a Relu layer be fused into the preceding layer? (What is the math behind this "fusion"? Is this procedure specific to quantized models, or can it also be done in the original floating-point model?)


Solution

  • Fusing a Relu into the preceding conv does not change the math. Instead of materializing the conv output and then running a separate Relu op over it, the conv kernel clamps each output value to the activation's range (max(0, y) for Relu, clipping to [0, 6] for Relu6) right before writing it to memory. In the TFLite flatbuffer this shows up as a fused_activation_function field on the CONV_2D (or DEPTHWISE_CONV_2D, FULLY_CONNECTED, ...) op rather than as a standalone Relu op.
  • In a quantized model the clamp is folded into the output quantization bounds: the kernel already clips the requantized accumulator to [act_min, act_max], and a fused Relu simply raises act_min to the output's zero point, so the activation costs nothing extra.
  • This is not specific to quantized models. Floating-point kernels (and graph optimizers in regular TF) perform the same fusion purely as a performance optimization, since it saves an extra pass over the output tensor.
  • Hope that helps.
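
To make the fusion concrete, here is a small NumPy sketch (my own illustration of the idea, not TFLite's actual kernel code) showing that the "fused" op computes exactly the same values as conv followed by Relu, just with the clamp applied to the accumulator before it is stored. The quantized bounds at the end are likewise hypothetical numbers:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal((1, 5, 5, 3)).astype(np.float32)
    w = rng.standard_normal((3, 3, 3, 4)).astype(np.float32)
    b = rng.standard_normal(4).astype(np.float32)

    def conv2d(x, w, b):
        # Naive "valid" convolution, just to have an accumulator to clamp.
        n, h, wd, cin = x.shape
        kh, kw, _, cout = w.shape
        out = np.zeros((n, h - kh + 1, wd - kw + 1, cout), dtype=np.float32)
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = x[:, i:i + kh, j:j + kw, :]              # (n, kh, kw, cin)
                out[:, i, j, :] = np.tensordot(patch, w, axes=3) + b
        return out

    # Unfused: two passes over the output tensor (conv, then Relu).
    unfused = np.maximum(conv2d(x, w, b), 0.0)

    # "Fused": same arithmetic, but the clamp happens on the accumulator
    # before it is written out, so no separate Relu op appears in the graph.
    def conv2d_fused_relu(x, w, b, act_min=0.0, act_max=np.inf):
        return np.clip(conv2d(x, w, b), act_min, act_max)

    fused = conv2d_fused_relu(x, w, b)
    assert np.allclose(unfused, fused)

    # Quantized flavour (int8 example, hypothetical numbers): the Relu is
    # absorbed into the output clamp bounds. Without an activation you would
    # clamp to [-128, 127]; with a fused Relu the lower bound becomes the
    # output's zero point.
    zero_point = -5                       # hypothetical output zero point
    act_min_q = max(-128, zero_point)     # Relu folded into the clamp
    act_max_q = 127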