Tags: tensorflow, google-cloud-platform, pytorch, nvidia

Real-ESRGAN performance on NVIDIA Tesla T4


I am running a Google Cloud Platform (GCP) Compute Engine instance with 2x NVIDIA Tesla T4 GPUs, running two separate processes that upscale images 4x.

I have the following shell script for that:

python ~/Real-ESRGAN/inference_realesrgan.py -i ~/lowres/ -o ~/upscaled/  -dn 0 -t 512 -g 0

My upscaler is built on top of this repository.

The performance is good, but I have a long waiting queue of images (they are being uploaded there continuously).

Is there something that could improve the processing speed?

I've added the -t 512 (tile size = 512) parameter, because the Tesla T4 cannot upscale a whole 2048x2048 px image at once; it runs into an out-of-memory error.

Can I tweak my script somehow? I cannot stop the running upscaler to experiment with timing measurements, but in general, would lowering the tile size help? Or is there another parameter that could speed up the upscaling?

I have 2 GPUs; each one processes a different folder via its own shell script (see the sketch below). My original goal was to upscale a single image using both GPUs at once, but that does not work, even though the argument help suggests it should be possible.
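For reference, this is roughly how the two launch scripts look; the folder names and script names below are placeholders, and only the -g value differs between them:

# upscale_gpu0.sh -- processes the first folder on GPU 0 (paths are placeholders)
python ~/Real-ESRGAN/inference_realesrgan.py -i ~/lowres_gpu0/ -o ~/upscaled_gpu0/ -dn 0 -t 512 -g 0

# upscale_gpu1.sh -- processes the second folder on GPU 1
python ~/Real-ESRGAN/inference_realesrgan.py -i ~/lowres_gpu1/ -o ~/upscaled_gpu1/ -dn 0 -t 512 -g 1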

So my main question is whether increasing or decreasing the tile size would improve the speed, and my additional question is whether there is some other parameter that could speed up the processing.

Thanks for your answers in advance!


Solution

  • You might find this documentation on optimizing GPUs helpful. It states that, to optimize performance, you can use higher network bandwidth speeds on VMs that use NVIDIA A100, T4, L4, or V100 GPUs. Regarding your question about whether increasing or decreasing the tile size would improve the speed: it depends on the image size and the GPU memory. If you have enough GPU memory, you can increase the tile size to reduce the number of tiles (and the per-tile overhead) and speed up processing. Otherwise, decrease the tile size so each tile fits into GPU memory and you avoid out-of-memory errors.
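    If you can copy a handful of representative 2048x2048 images into a separate sample folder, a quick offline timing run is the safest way to find the largest tile size the T4 can handle without stopping the production queue. A minimal sketch, assuming hypothetical ~/sample_lowres/ and ~/sample_upscaled_*/ folders and an idle GPU 0; only -i, -o, -dn, -t and -g are taken from your original command:

    # Hypothetical benchmark: run the same command on a small sample folder
    # with different tile sizes and compare wall-clock times.
    for TILE in 256 512 768 1024; do
        echo "tile size: $TILE"
        time python ~/Real-ESRGAN/inference_realesrgan.py \
            -i ~/sample_lowres/ -o ~/sample_upscaled_$TILE/ -dn 0 -t $TILE -g 0
    done

    Whichever tile size finishes fastest without an out-of-memory error is the one worth switching the production scripts to.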