c# passing class instance as null not done as ref


It seems like any C# question must be a duplicate, but I could not find this one.

Coding in C#:

MyClassType mt = null;
dofunction(mt);
// mt was modified in dofunction to some non null value, but comes back still null

dofunction is something like

public void dofunction(MyClassType mt){
    mt = new MyClassType(); // assign some non-null value
}

To get the update to be seen in the caller, I have to use the ref keyword. Why? I thought class instances were always passed as ref, without need of the ref keyword. MyClassType tells dofunction that mt is a class instance, but it doesn't act like it. Setting mt=null is not a class instance, I guess, but so what?

Again, the question is "Why?".

I've edited ..........

Rephrase question, which seems different than so called duplicates: If you set mt to null before the call, it acts like you have NOT used the ref keyword. If you set mt to new MyClassType() before the call, it acts as if you have used the ref keyword. Why would C# act in this convoluted way?

Here is example code that demonstrates my premise. I am writing C# in Visual Studio 2019, ASP.Net Core 5.

Here is calling routine:

public IActionResult RefTest() {
    MrShow m1 = null; // in this case doFunction acts as if m1 does NOT have the ref keyword
    MyProject.Utils.TestIt.doFunction(m1);
    string m1contents = "result when m1 set to null: ";
    if (m1 == null) m1contents += "null / ";
    else m1contents += m1.idnmbr + " / ";
    m1contents += "result when m1 set to class instance: ";
    m1 = new MrShow(); // in this case doFunction acts as if m1 does have the ref keyword
    MyProject.Utils.TestIt.doFunction(m1);
    if (m1 == null) m1contents += "null / ";
    else m1contents += m1.idnmbr + " / ";
    ViewBag.m1contents = m1contents;
    return View();
}

and here is what is called:

static internal void doFunction(MrShow m1) {
    if (m1 != null) {
        m1.idnmbr = "doFunction changed this";
    }
    else {
        m1 = new MrShow();
        m1.idnmbr = "m1 did not change this";
    }
}

In my web app, I get result:

RefTest Results
result when m1 set to null: null / result when m1 set to class instance: doFunction changed this / 

Solution

  • This is an often-discussed and quite confusing aspect of C#. I think the chief confusion arises because we talk about "pass by reference" and "pass by value", and values are copies. Those terms lead people to think that in some cases an object's data is copied and in other cases the original data is passed.

    Reference types (classes) are always "passed by a reference" and when I say this I mean it's a contraction of "passed to a method by sending a reference to the data in as the method argument", but the crucial thing for what the method can do is essentially whether they are "passed by giving the original reference" or "passed by giving a copy of the reference"

    The default is "copy". You make a new instance of a class and its data goes somewhere in memory. As part of making it you created a variable to refer to it. Then you passed it to a method, and by default another, independent variable is created that refers to the same data in memory. As such, you can change anything you like about the data, but you could only make the earlier variable refer to an entirely different object if that earlier variable were itself passed. Because by default a new variable is created, attached and passed, making that new variable refer to something else does not affect the earlier variable. In either "copy" mode or "original" mode you can modify some property of the object.

    When the C# world says "passed by reference" (original) or "passed by value" (copy) they are talking about what happens to the variable that refers to the data that makes up the object; either the original variable is sent or a copy of it is sent. They aren't talking about the object's data. With a reference type there's only one blob of data, with any number of variables referring to it.
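
    A minimal sketch of the "one blob of data, several variables" idea, under the default (copy) mode only. MyClassType and SomeProperty are borrowed from this thread; CopyOfReferenceDemo, Reassign and Run are made-up names for illustration.

    ```csharp
    using System;

    class MyClassType
    {
        public string SomeProperty { get; set; }
    }

    static class CopyOfReferenceDemo
    {
        // 'mt' is a new variable that starts out as a copy of the caller's reference.
        static void Reassign(MyClassType mt)
        {
            // Only the local copy is pointed at the new object.
            mt = new MyClassType { SomeProperty = "replacement" };
        }

        public static string Run()
        {
            var a = new MyClassType { SomeProperty = "first" };
            var b = a;                          // second variable, same object
            b.SomeProperty = "changed via b";   // edits the one shared object
            bool same = ReferenceEquals(a, b);  // both variables refer to one instance

            Reassign(a);                        // no ref: 'a' still refers to the original object
            return $"{same} / {a.SomeProperty}";
        }

        static void Main() => Console.WriteLine(Run()); // prints "True / changed via b"
    }
    ```

    The edit made through b is visible through a, while the reassignment inside Reassign is not, because only the method's own copy of the reference was redirected.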

    I tend to explain it as taking your dog for a walk; there is one dog, just as there is one object in memory. When you call a method it's like letting your friend come along to walk the dog too, and when they say "hey, can I lead him for a while?" you choose whether to give that person the original lead you're holding (ref keyword) or to attach a brand new lead to the same dog (so it has two leads) and give the new lead to the other person (no keyword). The dog isn't cloned; there is only ever one dog. The lead is a reference to the dog; you hold the lead, not the dog. You steer the dog via the lead. Leads are always attached to the dog, not to another lead. There isn't a chain.

    If that person takes their lead and attaches it to a whole new dog they found roaming around in the park, your lead is still attached to your dog. Their actions don't change which dog your lead is connected to. If you handed over your original lead and they attached it to a new dog, your dog is lost and when control comes back to you, you find out you have a poodle, not an Alsatian.

    If there was no newing involved it wouldn't matter whether your friend used their new lead or your original lead to walk the dog to the grooming parlor and have it shaved; in either case they have modified some aspect of your dog and you see it as shaved when you get it back
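
    The dog walk can be sketched in code roughly like this. Dog, Breed, Shaved and the method names are all invented for the analogy; nothing here comes from the question's project.

    ```csharp
    using System;

    class Dog
    {
        public string Breed { get; set; }
        public bool Shaved { get; set; }
    }

    static class DogWalk
    {
        // No keyword: the friend holds a second lead attached to the same dog.
        static void SwapDogOnNewLead(Dog lead)
        {
            lead = new Dog { Breed = "Poodle" }; // their lead moves; yours does not
        }

        // ref: the friend holds your original lead.
        static void SwapDogOnYourLead(ref Dog lead)
        {
            lead = new Dog { Breed = "Poodle" }; // your lead now holds a different dog
        }

        // Either kind of lead reaches the same dog, so shaving always sticks.
        static void Shave(Dog lead)
        {
            lead.Shaved = true;
        }

        public static string Run()
        {
            Dog mine = new Dog { Breed = "Alsatian" };

            SwapDogOnNewLead(mine);
            string afterNewLead = mine.Breed;   // still the Alsatian

            Shave(mine);
            bool shaved = mine.Shaved;          // the one dog was modified

            SwapDogOnYourLead(ref mine);
            string afterYourLead = mine.Breed;  // now a Poodle

            return $"{afterNewLead} / {shaved} / {afterYourLead}";
        }

        static void Main() => Console.WriteLine(Run()); // prints "Alsatian / True / Poodle"
    }
    ```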

    ref thus purely dictates whether a method can replace a passed-in object with a new one and the calling method will see the change.

    Try not to use it; if your method is intended to make new objects it should return them rather than surprising the caller with "hey, I swapped your object for a new one"

    Your dofunction should be like:

    public MyClassType dofunction(){
      return new MyClassType() { SomeProperty = "xxxxx"};
    }
    

    We don't code like "here, have this null thing and set it to an instance of something for me" - we code like "give me a new thing and I will update my null thing if I want to"

    MyClassType mt = dofunction();
    

    If you get to a situation where "I have to use ref because I want to return two things" you can still avoid using it - make another class to hold the two things you want to return and return an instance of that. There are even built in ways to quickly return a pair or more of things without having to make classes just for them:

    public (MyClassType X, string Y) dofunction(){
      return (new MyClassType() { SomeProperty = "xxxxx"}, "hello this is Y we love c#");
    }
    

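    You don't have to write a class for this yourself; the compiler uses a ValueTuple (strictly speaking a struct) behind the scenes, and X and Y are just compile-time names for its Item1 and Item2 fields.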

    var result = dofunction();
    MyClassType mt = result.X;
    string secondThing = result.Y;
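
    As a small variation, assuming C# 7 or later, the tuple can also be deconstructed straight into two locals. TupleDemo and Run are made-up names, and the second string is a placeholder value.

    ```csharp
    using System;

    class MyClassType
    {
        public string SomeProperty { get; set; }
    }

    static class TupleDemo
    {
        public static (MyClassType X, string Y) dofunction()
        {
            return (new MyClassType { SomeProperty = "xxxxx" }, "second value");
        }

        public static string Run()
        {
            // Deconstruct the returned pair into two separate variables.
            var (mt, secondThing) = dofunction();

            // The element names are aliases for the tuple's Item1/Item2 fields.
            var result = dofunction();
            bool sameField = result.Y == result.Item2;

            return $"{mt.SomeProperty} / {secondThing} / {sameField}";
        }

        static void Main() => Console.WriteLine(Run()); // prints "xxxxx / second value / True"
    }
    ```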
    

    Edit: ok, so you've posted the experiment that led you to conclude that "there is a difference in behavior depending on if the argument passed is null or not"

    First, I want to point out that the method itself contains logic that makes it behave differently depending on whether the argument is null or not.

    This means you're writing code that behaves differently, and observing a result and going "oh! C# is behaving like ref sometimes and not others!"

    No, it isn't; C# is being consistent. You're being inconsistent

    And you're shaving the dog. I'll illustrate with pictures. I renamed the argument to the function to m1df to help tell things apart:

        //your method
        static internal void doFunction(MrShow m1df) {
            if (m1df != null) {
                m1df.idnmbr = "doFunction changed this";
            }
            else {
                m1df = new MrShow();
                m1df.idnmbr = "m1 did not change this";
            }
        }
    
        //your code
        MrShow m1 = null; // in this case doFunction acts as if m1 does NOT have the ref keyword
        MyProject.Utils.TestIt.doFunction(m1);
    

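    (Image not preserved.) In words: m1 holds null, so m1df starts out as a copy of that null. The else branch attaches a new MrShow to m1df only; m1 is never touched, so it is still null when the method returns.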

    Let's go again, this time with a non null argument:

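    (Image not preserved.) In words: m1 refers to a MrShow instance, and m1df is a second reference to that same instance. Setting m1df.idnmbr edits the one shared object, so m1 sees the change after the method returns.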

    None of this is anything to do with ref; you don't need ref to make edits to a passed object (setting idnmbr to a string) survive after the method is over. You need ref to make wholesale replacements of the entire object (use of new keyword to instantiate a new instance) survive

    => You can always shave the dog, because the method always reaches the instance through a reference (to the single pile of data that makes up the instance). If passing caused a copy of the entire dog to be created, you could never shave the original dog. The object is never copied; the reference to it is copied unless ref is specified.
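
    For comparison, here is a rough rerun of the experiment with a ref variant added. MrShow and idnmbr are from the question; doFunctionRef, Describe and Run are hypothetical helpers added for this sketch, and the MrShow class is reduced to a single property.

    ```csharp
    using System;

    class MrShow
    {
        public string idnmbr { get; set; }
    }

    static class TestIt
    {
        // The question's method: m1df is a copy of the caller's reference.
        internal static void doFunction(MrShow m1df)
        {
            if (m1df != null)
            {
                m1df.idnmbr = "doFunction changed this";
            }
            else
            {
                m1df = new MrShow();                       // replaces only the local copy
                m1df.idnmbr = "m1 did not change this";
            }
        }

        // Hypothetical variant: m1df is the caller's own variable.
        internal static void doFunctionRef(ref MrShow m1df)
        {
            if (m1df == null)
            {
                m1df = new MrShow();                       // the caller sees this replacement
            }
            m1df.idnmbr = "created inside doFunctionRef";
        }

        static string Describe(MrShow m) => m == null ? "null" : m.idnmbr;

        public static string Run()
        {
            MrShow a = null;
            doFunction(a);            // replacement is lost

            MrShow b = null;
            doFunctionRef(ref b);     // replacement survives

            MrShow c = new MrShow();
            doFunction(c);            // edit to the existing object survives without ref

            return $"{Describe(a)} / {Describe(b)} / {Describe(c)}";
        }

        static void Main() => Console.WriteLine(Run());
        // prints "null / created inside doFunctionRef / doFunction changed this"
    }
    ```

    This is only to show where ref makes a difference; returning the new instance from the method, as suggested earlier, is still the less surprising design.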