Tags: c#, unity-game-engine, asynchronous, rendering, render-to-texture

How to speed up code that computes frame-to-frame optical flow and saves it as separate PNG files?


I'm running the C# code below, which computes optical flow maps and saves them as PNGs during gameplay, in Unity with a VR headset that caps the frame rate at 90 FPS. Without this code, the project runs smoothly at 90 FPS. To keep the same project consistently above 80 FPS with this code running, I had to use `WaitForSeconds(0.2f)` in the coroutine, but ideally I would compute and save an optical flow map for every frame of the game, or at least with a much smaller delay of about 0.01 seconds. I'm already using `AsyncGPUReadback` and `WriteAsync`.

  • Main Question: How can I speed up this code further?
  • Side Question: Is there a way to dump the calculated optical flow maps as consecutive rows of a single CSV file, so everything is written to one file instead of a separate PNG per map? Or would that be even slower?
using System.Collections;
using UnityEngine;
using System.IO;
using UnityEngine.Rendering;

namespace OpticalFlowAlternative
{

    public class OpticalFlow : MonoBehaviour {

        protected enum Pass {
            Flow = 0,
            DownSample = 1,
            BlurH = 2,
            BlurV = 3,
            Visualize = 4
        };

        public RenderTexture Flow { get { return resultBuffer; } }

        [SerializeField] protected Material flowMaterial;
        protected RenderTexture prevFrame, flowBuffer, resultBuffer, renderTexture, rt;

        public string customOutputFolderPath = "";
        private string filepathforflow;
        private int imageCount = 0;

        int targetTextureWidth, targetTextureHeight;

        private EyeTrackingV2 eyeTracking;

        protected void Start () {
            eyeTracking = GameObject.Find("XR Rig").GetComponent<EyeTrackingV2>();
            targetTextureWidth = Screen.width / 16;
            targetTextureHeight = Screen.height / 16;
            flowMaterial.SetFloat("_Ratio", 1f * Screen.height / Screen.width);

            renderTexture = new RenderTexture(targetTextureWidth, targetTextureHeight, 0);
            rt = new RenderTexture(Screen.width, Screen.height, 0);

            StartCoroutine(StartCapture());
        }

        protected void LateUpdate()
        {
            eyeTracking.flowCount = imageCount;
        }

        protected void OnDestroy ()
        {
            if(prevFrame != null)
            {
                prevFrame.Release();
                prevFrame = null;

                flowBuffer.Release();
                flowBuffer = null;

                rt.Release();
                rt = null;

                renderTexture.Release();
                renderTexture = null;
            }
        }

        IEnumerator StartCapture()
        {
            while (true)
            {
                yield return new WaitForSeconds(0.2f);
                
                ScreenCapture.CaptureScreenshotIntoRenderTexture(rt);
                //compensating for image flip
                Graphics.Blit(rt, renderTexture, new Vector2(1, -1), new Vector2(0, 1));

                if (prevFrame == null)
                {
                    Setup(targetTextureWidth, targetTextureHeight);
                    Graphics.Blit(renderTexture, prevFrame);
                }

                flowMaterial.SetTexture("_PrevTex", prevFrame);

                //calculating motion flow frame here
                Graphics.Blit(renderTexture, flowBuffer, flowMaterial, (int)Pass.Flow);
                Graphics.Blit(renderTexture, prevFrame);
                
                AsyncGPUReadback.Request(flowBuffer, 0, TextureFormat.ARGB32, OnCompleteReadback);
            }
        }

        void OnCompleteReadback(AsyncGPUReadbackRequest request)
        {
            if (request.hasError)
                return;

            var tex = new Texture2D(targetTextureWidth, targetTextureHeight, TextureFormat.ARGB32, false);
            tex.LoadRawTextureData(request.GetData<uint>());
            tex.Apply();

            WriteTextureAsync(tex);
        }

        async void WriteTextureAsync(Texture2D tex)
        {
            imageCount++;

            filepathforflow = customOutputFolderPath + imageCount + ".png";
            var bytes = tex.EncodeToPNG();
            Destroy(tex); // avoid leaking one Texture2D per readback

            // FileMode.Create truncates any existing file (OpenOrCreate can leave
            // stale bytes behind); useAsync makes WriteAsync genuinely asynchronous.
            using (var stream = new FileStream(filepathforflow, FileMode.Create,
                FileAccess.Write, FileShare.None, 4096, useAsync: true))
            {
                await stream.WriteAsync(bytes, 0, bytes.Length);
            }
        }

        protected void Setup(int width, int height)
        {
            prevFrame = new RenderTexture(width, height, 0);
            prevFrame.format = RenderTextureFormat.ARGBFloat;
            prevFrame.wrapMode = TextureWrapMode.Repeat;
            prevFrame.Create();

            flowBuffer = new RenderTexture(width, height, 0);
            flowBuffer.format = RenderTextureFormat.ARGBFloat;
            flowBuffer.wrapMode = TextureWrapMode.Repeat;
            flowBuffer.Create();
        }
    }
}

Solution

  • The first thought here is to use CommandBuffers: with them you can perform a no-copy readback of the screen, apply your calculations, and store the results in separate buffers (textures). You can then request readbacks of parts of a texture, or of multiple textures, spread over several frames, without blocking access to the texture currently being computed. Once a readback completes, the best approach is to encode it to PNG/JPG on a separate thread, so the main thread is never blocked.

    As an alternative to async readbacks, if you are on DX11/desktop, it is also possible to create a D3D buffer configured for fast CPU readback and map it every frame, which avoids the few-frames latency inherent to async readback.

    Creating a texture from the readback buffer is another waste of performance here: since the readback already gives you raw pixel values, you can feed them to a general-purpose PNG encoder and save the file from a worker thread (texture creation is only allowed on the "main" thread).

    If the latency is acceptable but you want an exact frame-number-to-image mapping, it is also possible to encode the frame number into the target image itself, so it is always available before the PNG is saved.

    Regarding the side question: CSV could be faster than the default PNG encoding, because PNG applies zip-like (DEFLATE) compression internally, while CSV is just numbers formatted as strings; the resulting files will likely be considerably larger, though.
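
The CommandBuffer idea above could be sketched roughly as follows. This is only a sketch: the camera event, pass index, and downsampling factor are assumptions, and `flowMaterial`/`flowBuffer` reuse the names from the question.

```csharp
// Sketch: record capture + flow pass into a CommandBuffer once, so no
// per-frame ScreenCapture copy or script-side Blit calls are needed.
using UnityEngine;
using UnityEngine.Rendering;

[RequireComponent(typeof(Camera))]
public class FlowCapture : MonoBehaviour
{
    [SerializeField] Material flowMaterial;
    RenderTexture flowBuffer;
    CommandBuffer cb;

    void OnEnable()
    {
        flowBuffer = new RenderTexture(Screen.width / 16, Screen.height / 16, 0,
                                       RenderTextureFormat.ARGBFloat);
        cb = new CommandBuffer { name = "OpticalFlowReadback" };
        // Read the current camera target directly -- no intermediate copy.
        cb.Blit(BuiltinRenderTextureType.CameraTarget, flowBuffer, flowMaterial, 0);
        // Queue the readback inside the same command stream; the callback
        // fires a few frames later without stalling the GPU.
        cb.RequestAsyncReadback(flowBuffer, OnReadback);
        GetComponent<Camera>().AddCommandBuffer(CameraEvent.AfterEverything, cb);
    }

    void OnReadback(AsyncGPUReadbackRequest req)
    {
        if (req.hasError) return;
        // Hand req.GetData<byte>() to an encoder running on a worker thread.
    }

    void OnDisable()
    {
        GetComponent<Camera>().RemoveCommandBuffer(CameraEvent.AfterEverything, cb);
        flowBuffer.Release();
    }
}
```

Because the command buffer is attached to a camera event, it executes every frame with no coroutine involved.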
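The worker-thread PNG encoding mentioned above could look like this. `ImageConversion.EncodeArrayToPNG` works on raw pixel data, so no `Texture2D` (main-thread-only) is needed; whether it is safe to call from a worker thread should be verified against the Unity version in use, and the pixel format below is an assumption.

```csharp
// Sketch: encode the readback bytes to PNG off the main thread.
using System.IO;
using System.Threading.Tasks;
using UnityEngine;
using UnityEngine.Experimental.Rendering;

static class FlowWriter
{
    public static void SavePng(byte[] pixels, int width, int height, string path)
    {
        // The byte[] is a plain managed copy, so it can be captured safely.
        Task.Run(() =>
        {
            byte[] png = ImageConversion.EncodeArrayToPNG(
                pixels, GraphicsFormat.R8G8B8A8_UNorm, (uint)width, (uint)height);
            File.WriteAllBytes(path, png); // plain file IO is fine off the main thread
        });
    }
}
```

The caller would pass `request.GetData<byte>().ToArray()` from the readback callback, keeping only that copy on the main thread.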
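A single-CSV-file dump could be sketched as below. This assumes ARGBFloat pixels whose length is a multiple of four and keeps one `StreamWriter` open across frames; which channels actually hold the flow x/y values depends on the shader and is an assumption here.

```csharp
// Sketch: append each optical flow map as one CSV row to a single file,
// instead of writing a separate PNG per map.
using System.IO;
using System.Text;
using Unity.Collections;

public class CsvFlowDump : System.IDisposable
{
    readonly StreamWriter writer;

    public CsvFlowDump(string path)
    {
        writer = new StreamWriter(path, append: false);
    }

    public void AppendRow(NativeArray<float> pixels, int channels = 4)
    {
        var sb = new StringBuilder(pixels.Length * 8);
        for (int i = 0; i + 1 < pixels.Length; i += channels)
        {
            if (i > 0) sb.Append(',');
            // Assumed layout: first two channels of each pixel are flow x, y.
            sb.Append(pixels[i].ToString("G6")).Append(',')
              .Append(pixels[i + 1].ToString("G6"));
        }
        writer.WriteLine(sb);
    }

    public void Dispose() => writer.Dispose();
}
```

The readback callback would call `AppendRow(request.GetData<float>())`; since the file stays open, there is no per-frame open/close cost, but the uncompressed text will grow quickly.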