Ref Return and Ref Locals in C# 7

C# has supported passing arguments by value or by reference since the first language version. Returning a value, however, was possible by value only. C# 7 changes this by introducing two new features: ref returns and ref locals. With these features it is possible to return by reference. A ‘ref return’ returns an alias to an existing variable, and a ‘ref local’ stores this alias in a local variable.

The main goal of this language extension is to allow developers to pass around references to value types instead of copies of the values. This is important when working with large data structures implemented as value types. The new feature can be used with reference types too, but a reference type is already returned as a pointer to the object, so returning a reference to that pointer gives no performance advantage.

Return by value

The following source code shows an example of a return by value. The function returns the second element of the array. Because it is a return by value, a copy of the element is returned. A modification of the returned element is applied to the copy and not to the original element.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
  new Person() {mName = "John Doe", mAge = 31 },
  new Person() {mName = "Jane Doe", mAge = 27 },
  };

  Person person = GetSecond(persons);
  person.mAge = 41;

  // output:
  // 'John Doe (31)'
  // 'Jane Doe (27)'
  foreach (Person p in persons)
  {
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

struct Person
{
  public string mName;
  public int mAge;
}

static Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) throw new ArgumentException();

  return persons[1];
}

If you want to find and modify the original element, you have to implement the method in a different way, for example by returning the index of the found element and using this index to modify the original array. With the new ‘ref return’ feature the needed behavior becomes very easy to implement: you just change the return from a value to a reference.
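The index-based workaround mentioned above could look like the following minimal sketch. The helper name GetSecondIndex is hypothetical and not part of the original example; the Person struct is repeated so that the snippet is self-contained.

```csharp
using System;

struct Person
{
  public string mName;
  public int mAge;
}

static class IndexDemo
{
  // Hypothetical helper: returns the position of the element instead of a copy.
  public static int GetSecondIndex(Person[] persons)
  {
    if (persons.Length < 2) throw new ArgumentException();

    return 1;
  }

  static void Main()
  {
    Person[] persons = new Person[]
    {
      new Person() { mName = "John Doe", mAge = 31 },
      new Person() { mName = "Jane Doe", mAge = 27 },
    };

    // Accessing the array element through the index modifies the original.
    int index = GetSecondIndex(persons);
    persons[index].mAge = 41;

    // output:
    // John Doe (31)
    // Jane Doe (41)
    foreach (Person p in persons)
    {
      Console.WriteLine(p.mName + " (" + p.mAge + ")");
    }
  }
}
```

This works, but the caller has to carry both the array and the index around, which is exactly the bookkeeping a ref return removes.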

Return by reference

The following source code shows the same example, adapted. This time the found element is returned by reference. The reference is stored in the ref local variable. Changes made to this variable are made to the referenced element, so the original array element is changed.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  ref Person person = ref GetSecond(persons);
  person.mAge = 41;

  foreach (Person p in persons)
  {
    // output:
    // 'John Doe (31)'
    // 'Jane Doe (41)'
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) throw new ArgumentException();

  return ref persons[1];
}

Use a List instead of an Array

In the previous example I used an array of structs. What do you think will happen if we change to another container type, for example a list?

static void Main(string[] args)
{
  List<Person> persons = new List<Person>
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  ref Person person = ref GetSecond(persons);
  person.mAge = 41;

  foreach (Person p in persons)
  {
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

static ref Person GetSecond(List<Person> persons)
{
  if (persons.Count < 2) throw new ArgumentException();

  // error CS8156: An expression cannot be used in this context
  // because it may not be passed or returned by reference
  return ref persons[1];
}

The code no longer compiles. The ‘return ref persons[1]’ statement results in a compiler error. The reason is that the array indexer and the list indexer are implemented differently. Accessing an array element yields a reference to the element itself, so we can use this reference as the return value. The list indexer instead returns a copy of the value. As it is allowed neither to return the indexer expression itself nor the returned temporary value by reference, the compiler reports the error shown in the comment. Within the following article you can find further information about this issue.
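One way to still modify a struct stored in a List<Person> is to copy the element, change the copy, and write it back through the indexer. The following is only a minimal sketch of that workaround and is not taken from the original example; the helper name SetAgeOfSecond is hypothetical.

```csharp
using System;
using System.Collections.Generic;

struct Person
{
  public string mName;
  public int mAge;
}

static class ListDemo
{
  // Hypothetical helper: copy, modify, write back.
  public static void SetAgeOfSecond(List<Person> persons, int age)
  {
    if (persons.Count < 2) throw new ArgumentException();

    Person person = persons[1]; // the list indexer returns a copy
    person.mAge = age;          // modify the copy
    persons[1] = person;        // store the copy through the indexer setter
  }

  static void Main()
  {
    List<Person> persons = new List<Person>
    {
      new Person() { mName = "John Doe", mAge = 31 },
      new Person() { mName = "Jane Doe", mAge = 27 },
    };

    SetAgeOfSecond(persons, 41);

    // output:
    // John Doe (31)
    // Jane Doe (41)
    foreach (Person p in persons)
    {
      Console.WriteLine(p.mName + " (" + p.mAge + ")");
    }
  }
}
```

The element is copied twice here, so this variant gives up the performance benefit that ref returns offer for arrays.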

Ref local

In the example application we stored the returned reference in a ‘ref local’ variable. A ref local variable is an alias for the original variable, here the array element. It is initialized with the ref return value. In C# 7.0 the reference itself is fixed after this initialization (C# 7.3 later added reassignment with ‘= ref’). Therefore, a plain assignment to a ref local does not change the reference; it changes the content of the referenced variable.

The following source code shows an example.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  ref Person person = ref persons[1];
  person.mAge = 41;
  person = persons[0];
  person.mAge = 51;

  foreach (Person p in persons)
  {
    // output:
    // 'John Doe (31)'
    // 'John Doe (51)'
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

First, the ref local is initialized with a reference to the second array element. Then the age is changed to 41. The next assignment is the interesting one. ‘persons[0]’ refers to the first array element, but an assignment to our ref local variable does not change the reference. The reference is set during initialization and stays constant. Instead, the assignment changes the value of the referenced variable. Therefore the second array element, which is referenced by the ref local variable, is overwritten with the values stored in the first element: ‘John Doe’ and ‘31’. Next, we set the age to ‘51’. So the output is as shown in the comment in the example application. The first element is not changed at all, and the second element now has the name copied from the first element and the age set by the last assignment.

Ref vs. Pointer

The previous example, and this topic in general, may raise the question of what the difference between a reference and a pointer is. We should take a minute to think about this question, as it is important for a deep understanding of the ref local and ref return mechanism.

Briefly summarized: references and pointers both refer to an object instance, but a reference is constant after initialization whereas a pointer can be changed.

This single difference is an important one, as it results in different characteristics and possibilities for the two concepts. Here are some important ones. As references are constant and must be initialized, they cannot be null. Pointers, in contrast, can be reassigned and, as a consequence, can also be set to null. You can have pointers to pointers and so create extra levels of indirection, whereas references offer only one level of indirection. As pointers can be reassigned, various arithmetic operations can be performed on them, which is called ‘pointer arithmetic’. It is easier to work with references, as they cannot be null and you don’t have to think about indirection. But references are not safer by nature: in general, pointers as well as references can refer to invalid objects or memory locations.

Please look at the following functions and think about the parameter kind: is it a reference or a pointer or a value copy?

Function | Parameter kind
void Foo(MyStruct x) | Copy of the value passed by the caller
void Foo(MyClass x) | Pointer to the original object instance, which lives at caller level
void Foo(ref MyStruct x) | Reference to the original object instance, which lives at caller level
void Foo(ref MyClass x) | Reference to the pointer to the original object instance, which lives at caller level

C# developers often say that the last case is a “pointer to a pointer”, but in fact it is a reference to a pointer. As you know after reading the comparison of the two concepts above, that is a small but important difference.
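The difference between ‘void Foo(MyClass x)’ and ‘void Foo(ref MyClass x)’ can be made visible with a small experiment. The sketch below is an illustration only: the class MyClass with its single field and the method names ReplaceByValue and ReplaceByRef are made up for this purpose.

```csharp
using System;

class MyClass
{
  public int Value;
}

static class ParameterDemo
{
  // x is a copy of the caller's pointer:
  // reassigning it is invisible to the caller.
  public static void ReplaceByValue(MyClass x)
  {
    x = new MyClass { Value = 2 };
  }

  // x is a reference to the caller's variable:
  // reassigning it replaces the caller's pointer.
  public static void ReplaceByRef(ref MyClass x)
  {
    x = new MyClass { Value = 2 };
  }

  static void Main()
  {
    MyClass a = new MyClass { Value = 1 };

    ReplaceByValue(a);
    Console.WriteLine(a.Value); // 1

    ReplaceByRef(ref a);
    Console.WriteLine(a.Value); // 2
  }
}
```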

Use ref return without ref local

You can define a method with a ref return and assign the result to a variable which is not a ref local. In this case the value referenced by the method result is copied into the local variable. Your local variable is then a copy of the original array element, and changes to it do not affect the array element.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  Person person = GetSecond(persons);
  person.mAge = 41;

  foreach (Person p in persons)
  {
    // output:
    // 'John Doe (31)'
    // 'Jane Doe (27)'
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) throw new ArgumentException();

  return ref persons[1];
}

Ref return of method local variable

A ref return creates a reference to an existing variable. If the lifetime of the variable were shorter than the lifetime of the reference, the reference would refer to an invalid object or memory location, which would result in critical runtime errors. Therefore, the referenced variable must live in a higher scope than, or in the same scope as, the ref local variable. You cannot create a method-local object instance and return a reference to it; the compiler rejects such code. The following source code shows an example with the compiler error messages as comments.

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2)
  {
    // error: 
    // An expression cannot be used in this context 
    // because it may not be passed or returned by reference
    return ref new Person();

    // error:
    // Cannot return local 'person' by reference because it is not a ref local
    Person person = new Person();
    return ref person;
  }

  return ref persons[1];
}

Return null

As we already learned, a reference cannot be null. Therefore, a method with a ref return cannot return null. In the examples so far, we have thrown an exception if the parameter is invalid. But exceptions should be thrown in exceptional cases only. In my opinion it is not an exceptional case to pass an array with fewer than two elements to the ‘GetSecond’ method. So, I don’t want to throw an exception; as no element is found, I want to return an invalid element instead. For reference types I want to return null and for value types a default value. But as we have seen, it is possible neither to create a local default value and return it by reference nor to return null directly. It is possible, however, to return a reference to a variable that lives in a higher scope. We can use this possibility and define a static default value that represents an invalid element.

The following source code shows an example with an array of value types.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
    new Person() {mName = "John Doe", mAge = 31 }
  };

  ref Person person = ref GetSecond(persons);
  person.mAge = 41;
}

static Person gDefaultPerson = new Person();

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) return ref gDefaultPerson;

  return ref persons[1];
}

We can adapt the example for an array of reference types. For this variant, Person must be declared as a class instead of a struct.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
        new Person() {mName = "John Doe", mAge = 31 }
  };

  ref Person person = ref GetSecond(persons);

  if (person != null) person.mAge = 41;
}

static Person gDefaultPerson = null;

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) return ref gDefaultPerson;

  return ref persons[1];
}

But there is one very critical design fault in this implementation: the returned default value can be changed. A later method call, or a method call by another client, may then return a default value with changed content, which may lead to unexpected behavior in the application. But what if we use an immutable object for the default value? Then its content can no longer be changed through the returned reference. Note, however, that a caller can still assign a completely new value through a writable reference, so immutability alone does not fully protect the default instance. With C# 7.2 it is possible to use the readonly modifier for structs and for ref returns. This closes the gap and makes it more comfortable to create and use immutable structs.

‘In’ modifier and ‘readonly’ modifier for structs and ref returns

The code examples in this article were created with C# 7.0. C# 7.2 adds two features which allow you to write more performant code: the ‘in’ modifier for method parameters and the ‘readonly’ modifier for ref returns and for structs.

Method parameters are often used purely as input for the method, so they are not changed within it. If you use a struct as a method parameter, it is passed by value: a copy of the struct instance is created and passed to the method. This language design allows the method to use the parameter as a method-local value without any side effect on the original struct instance outside the method. But of course, this comes at a performance cost, as copying a large struct may be expensive.

But do we need a copy at all if we just read the parameter values? Of course not! In this case it would be fine to pass a reference to the original struct, as long as it is guaranteed that the reference is used for reading only. This is exactly the idea of the ‘in’ modifier. As with the ‘out’ and ‘ref’ modifiers, the parameter is passed by reference. In addition, an ‘in’ parameter is read-only, so you cannot assign a new value to it. This is comparable to the “pass by const reference” principle in C++.
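As an illustration of the ‘in’ modifier, here is a minimal sketch that requires C# 7.2 or later. The method name Describe is hypothetical, and the Person struct from the earlier examples is repeated so that the snippet is self-contained.

```csharp
using System;

struct Person
{
  public string mName;
  public int mAge;
}

static class InDemo
{
  // 'in' passes the struct by reference and makes the parameter read-only.
  public static string Describe(in Person person)
  {
    // person.mAge = 41; // would not compile: an 'in' parameter is read-only

    return person.mName + " (" + person.mAge + ")";
  }

  static void Main()
  {
    Person person = new Person() { mName = "Jane Doe", mAge = 27 };

    // The 'in' keyword at the call site is optional.
    Console.WriteLine(Describe(in person)); // Jane Doe (27)
  }
}
```

The method only reads fields of the parameter, so nothing in it forces the compiler to copy the struct.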

In theory the ‘in’ modifier is a nice and easy way to improve the performance of method calls with struct parameters. Unfortunately, it isn’t that easy. Depending on the implementation of the struct, the compiler must create a copy of the parameter even if you use the ‘in’ modifier. This is called a ‘defensive copy’. It is made whenever the compiler cannot guarantee that the parameter will not be changed inside the method. The compiler can of course prevent direct assignments. But if you call a method of the struct, the compiler may not know whether this member method changes the internal state of the struct. In such situations the defensive copy is created.

To prevent the creation of a defensive copy you can implement an immutable struct. To do so, you use the ‘readonly’ modifier in the struct declaration. A readonly struct cannot be changed; even its member methods cannot change the internal state. If you pass such a readonly struct instance as an ‘in’ parameter to a method, the compiler knows that the value stays constant and does not have to create a defensive copy.

The ‘readonly’ modifier is also available for the ref return value of a method, written as ‘ref readonly’. As the reference itself is constant by definition, the readonly modifier here means that the referenced variable cannot be modified through the returned reference.
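A readonly struct passed as an ‘in’ parameter could look like the following minimal sketch (C# 7.2 or later). The Point type and the LengthOf method are hypothetical names chosen for the illustration; they do not appear in the earlier examples.

```csharp
using System;

// 'readonly' guarantees that no member can change the state of the struct.
readonly struct Point
{
  public Point(double x, double y)
  {
    X = x;
    Y = y;
  }

  public double X { get; }
  public double Y { get; }

  public double Length()
  {
    return Math.Sqrt(X * X + Y * Y);
  }
}

static class ReadonlyStructDemo
{
  public static double LengthOf(in Point point)
  {
    // Point is a readonly struct, so the compiler does not need
    // a defensive copy before calling the member method.
    return point.Length();
  }

  static void Main()
  {
    Point point = new Point(3, 4);

    Console.WriteLine(LengthOf(in point)); // 5
  }
}
```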
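Applied to the ‘GetSecond’ example, a ‘ref readonly’ return might be sketched as follows (C# 7.2 or later). This is an assumption about how the pieces could be combined and is not code from the original example: Person is redefined here as a readonly struct with hypothetical Name and Age properties.

```csharp
using System;

readonly struct Person
{
  public Person(string name, int age)
  {
    Name = name;
    Age = age;
  }

  public string Name { get; }
  public int Age { get; }
}

static class RefReadonlyDemo
{
  // Shared default instance that represents "no element found".
  static readonly Person gDefaultPerson = new Person("", 0);

  public static ref readonly Person GetSecond(Person[] persons)
  {
    if (persons.Length < 2) return ref gDefaultPerson;

    return ref persons[1];
  }

  static void Main()
  {
    Person[] persons = new Person[]
    {
      new Person("John Doe", 31)
    };

    ref readonly Person person = ref GetSecond(persons);

    // person = new Person("Jane Doe", 27); // would not compile: read-only reference

    Console.WriteLine(person.Age); // 0
  }
}
```

Because the caller receives a read-only reference to an immutable struct, the shared default instance cannot be modified or overwritten by a client.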

Summary

Ref returns and ref locals help to write more performant code, as there is no need to move copies of values between methods. These enhancements are designed for performance-critical algorithms where minimizing memory allocations is a major factor. The ‘in’ modifier and the ‘readonly’ modifier for structs were introduced for the same reason. Passing readonly structs as ‘in’ parameters to methods may increase application performance.
