Tags: c#, unity-game-engine, garbage-collection, parameter-passing, game-development

C# - Correct way to pass parameters to avoid GC stutters?


(I found some related questions, but they were either not quite the same or a decade old.)

So, are parameters in C# passed by value or by reference?

Say I do:

using System;

class Program
{
    static void Main()
    {
        string name = "hello world";
        Console.WriteLine(name);
        testfunc(name);
        Console.WriteLine(name);
    }

    // static, so that it can be called from the static Main
    static void testfunc(string name)
    {
        name = "stackoverflow";
        Console.WriteLine(name);
    }
}

#Output
=> hello world
=> stackoverflow
=> hello world

So, in this case the parameter is passed by value and is no longer associated with the original variable (in memory). I guess this means the variable is copied to a new location in memory and then passed (which involves allocation, copying, and then passing to the function).

Then we do this:

using System;

class Program
{
    static void Main()
    {
        string name = "hello world";
        Console.WriteLine(name);
        testfunc(ref name);
        Console.WriteLine(name);
    }

    // static, so that it can be called from the static Main
    static void testfunc(ref string name)
    {
        name = "stackoverflow";
        Console.WriteLine(name);
    }
}

#Output
=> hello world
=> stackoverflow
=> stackoverflow

So, in this case I guess the original variable's reference/pointer was passed into the function, and no allocation or copying of memory/data took place.

So, I was wondering whether the first method has some overhead (it surely must for data >= 10 MB). But the second method is unsafe, as it can potentially corrupt the original data.

So, what should we do, or what kind of optimization can we put in place, to overcome the overhead of the first method (if there is any)? The same question applies to values returned from a function.

**My use case:** I am developing a game in Unity. I have to generate a world. For that, I create 3D noise maps (very large, and several of them: humidity, civilization, terrain, etc.). They take up a total of about 70-80 MB of memory. I pass them all to a function where they are combined and the final world is generated. So, if about 100 MB of data is copied in memory (as in the first method above), I don't think that will work well on machines with <= 4 GB of RAM. As for data returned from a function, let's say I have a save file of 25 MB (some games, like RimWorld, have save files that large). I call the loadSaveFile() function (it reads the file, casts it to SaveClass, stores it in a variable saveData, and returns it). This saveData variable should be about 25-28 MB in size. So, if the returned data is duplicated too, that is bad for data this large.

I know the duplicated data will eventually be destroyed by the GC, but that will stutter the frame rate on the frame where the GC destroys it. This can be mitigated by using incremental GC, but the memory inflation during the few seconds between the duplicated data being created and destroyed will still affect the overall performance of the machine.

So, the final question is: What is the correct, best, and most practical way to pass big data as parameters to a function and also receive big data as returned value?


Solution

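  • For the GC, it does not really matter what you pass as parameters. Parameters are passed on the stack (or in registers), either as values or as references, and that space is reclaimed automatically when the method returns. For a reference type such as a string, an array, or a class instance, passing "by value" copies only the reference (a pointer-sized value), never the object's data, so your first method does not duplicate anything on the heap.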
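
    As a minimal sketch of that point (the class and method names here are hypothetical, chosen only for illustration), the following shows that a method receiving an array "by value" works on the caller's array, and that reassigning the parameter affects only the local copy of the reference:

    ```csharp
    using System;

    public static class PassingDemo
    {
        // Hypothetical helper: 'map' is a copy of the caller's reference,
        // not a copy of the array it points to.
        public static void Fill(float[] map, float value)
        {
            map[0] = value;       // writes into the caller's array
            map = new float[1];   // rebinds only the local parameter
        }

        public static void Main()
        {
            var noise = new float[1_000_000]; // about 4 MB, allocated once
            Fill(noise, 42f);
            Console.WriteLine(noise[0]);      // 42
            Console.WriteLine(noise.Length);  // 1000000
        }
    }
    ```

    No second 4 MB array is created by the call; the only allocation inside Fill is the one-element array that is deliberately created to show the rebinding.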

    What you want to avoid is allocating large, short-lived objects, typically large arrays.
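
    One way to do that, sketched here under the assumption that the noise maps are flat float arrays of equal length, is to allocate the output buffer once and let the combining method write into it. WorldGen, Combine, and the averaging formula are placeholders, not part of the question's code:

    ```csharp
    using System;

    public static class WorldGen
    {
        // Hypothetical combiner: writes into a caller-owned buffer instead of
        // returning a freshly allocated array on every call.
        public static void Combine(float[] terrain, float[] humidity, float[] result)
        {
            if (terrain.Length != humidity.Length || terrain.Length != result.Length)
                throw new ArgumentException("All maps must have the same length.");

            for (int i = 0; i < result.Length; i++)
                result[i] = 0.5f * (terrain[i] + humidity[i]); // placeholder formula
        }

        public static void Main()
        {
            var terrain = new float[] { 0f, 1f };
            var humidity = new float[] { 1f, 1f };
            var result = new float[2];     // allocated once, reused across calls
            Combine(terrain, humidity, result);
            Console.WriteLine(result[1]);  // 1
        }
    }
    ```

    Calling Combine repeatedly with the same result buffer produces no new garbage, so there is nothing extra for the GC to collect later.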

    The first step should be to profile your application. Do you have GC stutters? If not, stop worrying about it. If you do, use profilers and/or benchmarks to find out why. Chances are that the few objects you create have a very minimal impact. But here are some rules of thumb:

    1. Use value types, i.e. struct, for small objects. Make them readonly, and if they are larger than 8 bytes, consider passing them with the in modifier. This passes the value by reference but prevents any changes.
    2. When working with collections, use Span<T> when you can. This allows for great flexibility while keeping performance.
    3. Ideally, objects should either live forever, i.e. until there is a good opportunity for a gen 2 GC, or only for a short while so that they can be collected efficiently in gen 0/1. In the latter case, consider using value types instead. (Generations apply to the .NET runtime's GC; Unity's Boehm collector is non-generational, so there short-lived allocations are best avoided altogether.)
    4. Avoid boxing, i.e. using object to reference a value type.
    5. You can also use structs for larger and mutable objects, but then you really need to pay attention to the exact language rules to avoid copies and poor performance. This will likely lead to code that is more difficult to read.
    6. Optimizations usually make the code more difficult to read, so reserve them for the places that actually need optimization.
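
    Rules 1 and 2 could look roughly like this. It is only a sketch: Cell, Rules, Score, and Sum are made-up names, and it assumes a C# 7.2+ compiler and a runtime that provides Span<T> (in Unity this depends on the editor version and API compatibility level):

    ```csharp
    using System;

    // Hypothetical 24-byte value type; readonly so that 'in' never forces
    // the compiler to make a defensive copy.
    public readonly struct Cell
    {
        public readonly double Height, Humidity, Temperature;

        public Cell(double height, double humidity, double temperature)
        {
            Height = height;
            Humidity = humidity;
            Temperature = temperature;
        }
    }

    public static class Rules
    {
        // Rule 1: 'in' passes the struct by reference but forbids modification.
        public static double Score(in Cell cell)
            => cell.Height + cell.Humidity + cell.Temperature;

        // Rule 2: a span is a view over existing memory; slicing allocates nothing.
        public static float Sum(ReadOnlySpan<float> values)
        {
            float total = 0f;
            foreach (float v in values) total += v;
            return total;
        }

        public static void Main()
        {
            var cell = new Cell(1.0, 2.0, 3.0);
            Console.WriteLine(Score(in cell));         // 6

            float[] map = { 1f, 2f, 3f, 4f };
            Console.WriteLine(Sum(map.AsSpan(1, 2)));  // 5 (elements 2 and 3)
        }
    }
    ```

    Sum accepts the slice without copying the two elements into a new array, which is the kind of flexibility rule 2 refers to.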

    void testfunc(ref string name)

    Please do not do this. It has no advantage, and it hints that you may be replacing the value with something else, which should be a fairly rare thing to do. "Pure" methods tend to be easier to read and understand.
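
    A pure alternative might look like the sketch below (Naming and WithGreeting are hypothetical names): the method takes the string by value, which copies only the reference, and returns a new value that the caller may or may not assign back.

    ```csharp
    using System;

    public static class Naming
    {
        // No 'ref': the method cannot rebind the caller's variable, so the
        // call site makes any change explicit through the return value.
        public static string WithGreeting(string name) => "hello " + name;

        public static void Main()
        {
            string name = "world";
            string greeted = WithGreeting(name);
            Console.WriteLine(name);     // world
            Console.WriteLine(greeted);  // hello world
        }
    }
    ```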