I want to optimize my model using TensorRT. However, the CTC layer in my model is not supported by TensorRT. Has anyone succeeded in optimizing a CTC layer with TensorRT?
I have managed to do this in two steps: first, use TensorRT to get the probability logits, and then use a C++ CTC decoder to decode the logits.
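For the second step, here is a minimal sketch of a greedy (best-path) CTC decoder in C++. It assumes the logits for one sequence are flattened as `[timesteps x num_classes]` in row-major order and that class 0 is the CTC blank; the function name and layout are illustrative, not taken from any particular library, and a beam-search decoder would follow the same interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Greedy CTC decoding: take the argmax at each timestep, collapse repeated
// labels, and drop blanks. Assumes blank index 0 unless told otherwise.
std::vector<int> ctc_greedy_decode(const std::vector<float>& logits,
                                   std::size_t timesteps,
                                   std::size_t num_classes,
                                   int blank = 0) {
    std::vector<int> result;
    int prev = blank;
    for (std::size_t t = 0; t < timesteps; ++t) {
        const float* row = logits.data() + t * num_classes;
        int best = static_cast<int>(std::max_element(row, row + num_classes) - row);
        if (best != blank && best != prev) {
            result.push_back(best);
        }
        prev = best;
    }
    return result;
}
```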
Since CTC decoding is well suited to the CPU, I use the GPU to produce batches of logits and enqueue each batch into a CPU CTC-decoding queue while the GPU keeps running, so the CPU and GPU work in parallel (see the sketch below). A C++ implementation of the CTC decoder can be found on GitHub or in the TensorFlow repository.
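Below is a hypothetical sketch of that GPU/CPU overlap: one thread runs inference and pushes each batch of logits into a thread-safe queue, while a CPU thread pops batches and decodes them. The `LogitBatch` type, the queue, and `run_trt_inference()` are assumptions for illustration; in a real pipeline the producer would call TensorRT's `IExecutionContext` enqueue methods and copy the output buffer back to host memory.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <vector>

struct LogitBatch {
    std::vector<float> logits;   // [batch x timesteps x num_classes], flattened
    std::size_t timesteps = 0;
    std::size_t num_classes = 0;
};

// Simple thread-safe queue handing logits from the GPU thread to CPU decoders.
class LogitQueue {
public:
    void push(LogitBatch batch) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(batch)); }
        cv_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    std::optional<LogitBatch> pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return std::nullopt;   // queue closed and drained
        LogitBatch b = std::move(q_.front());
        q_.pop();
        return b;
    }
private:
    std::queue<LogitBatch> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool closed_ = false;
};

// Placeholder for the real TensorRT inference call; returns a dummy batch here.
LogitBatch run_trt_inference(int /*batch_index*/) {
    return LogitBatch{};
}

void run_pipeline(int num_batches) {
    LogitQueue queue;

    std::thread gpu_producer([&] {
        for (int i = 0; i < num_batches; ++i)
            queue.push(run_trt_inference(i));   // GPU keeps running ahead
        queue.close();
    });

    std::thread cpu_consumer([&] {
        while (auto batch = queue.pop()) {
            // Decode each sequence in the batch on the CPU, e.g. with
            // ctc_greedy_decode() above or a beam-search CTC decoder.
        }
    });

    gpu_producer.join();
    cpu_consumer.join();
}
```

With this producer/consumer split, the CPU decoding of batch N overlaps with the GPU inference of batch N+1, which is what lets the two run in parallel.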