Auto Type Deduction in Range-Based For Loops

Range-based for loops offer a convenient way to iterate over the elements of a container. Combined with auto type deduction, the source code becomes clean and easy to read and write.

for (auto element : container) 
{ 
    // ...
}

The Auto Type Deduction is available in different variants:

  • auto
  • const auto
  • auto&
  • const auto&
  • auto&&
  • const auto&&
  • decltype(auto)

Of course, these variants behave differently, and you should choose the right one for your needs. Below I give a short overview of the variants and explain their behavior and typical use cases.

auto

This creates a copy of the container element. Use this variant if you want to read and modify the element content, for example to pass it to a function, but leave the original container element as it is.
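
As a minimal sketch of this behavior (the helper name sum_of_doubled_copies is made up for illustration), the loop below changes each copy but leaves the container untouched:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: doubles a copy of every element and sums the copies.
// The vector is taken by non-const reference on purpose, to show that it is
// the copy made by "auto", not constness, that protects the original elements.
int sum_of_doubled_copies(std::vector<int>& values)
{
    int sum = 0;
    for (auto element : values)   // element is a copy of the container element
    {
        element *= 2;             // modifies the copy only
        sum += element;
    }
    return sum;
}
```

Calling the function with a vector holding 1, 2 and 3 returns 12, and the vector still holds 1, 2 and 3 afterwards.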

const auto

Like the first variant, this creates a copy of the element, but this time the copy is constant and cannot be changed. In most cases “const auto” isn’t a good choice: if you want to work with an immutable value, you can use “const auto&” and avoid the copy. There are only a few use cases for this variant. For example, it can be useful in multithreading scenarios. Let’s say you want to use the element several times within the loop body while another thread may change the container element in parallel. With “const auto” you create a copy once and use this unchanging copy throughout the loop body. With “const auto&” later reads would observe the updated element instead. (In both cases, access to the shared element itself still has to be synchronized.) So, there are scenarios where “const auto” and “const auto&” produce different results. Therefore, we need “const auto”, even if it is used very rarely.

auto&

This creates a reference to the original container element. Use it if you want to modify the container content.
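
A short sketch of in-place modification, assuming a hypothetical helper named add_to_all:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: adds delta to every element of the container in place.
void add_to_all(std::vector<int>& values, int delta)
{
    for (auto& element : values)  // element refers to the original element
    {
        element += delta;         // modifies the container content
    }
}
```

With “auto” instead of “auto&” the same loop would compile as well, but it would only change copies and leave the container as it was.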

const auto&

This creates a constant reference to the original container element. So that’s the perfect choice if you need read-only access to the elements.
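
For illustration, a read-only loop could look like the following sketch (total_length is a hypothetical helper, not part of any library):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: sums the lengths of all strings without copying them.
std::size_t total_length(const std::vector<std::string>& words)
{
    std::size_t total = 0;
    for (const auto& word : words)  // no copy, read-only access
    {
        total += word.size();
        // word += "x";             // would not compile: word refers to a const string
    }
    return total;
}
```

Avoiding the copy matters most for element types that are expensive to copy, such as the strings used here.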

auto&&

Like “auto&”, this variant with the double “&” is used if you want to modify the original container elements. There are some special cases where the normal “auto&” variant cannot be used. For example, a loop over “std::vector<bool>” yields a temporary proxy object, which cannot bind to an lvalue reference (auto&). In such cases “auto&&” can be used. It is a forwarding reference: if it is initialized with an lvalue, it becomes an lvalue reference, and if it is initialized with an rvalue, it becomes an rvalue reference. As a result, “auto&&” is a good candidate for generic code and is therefore most often used in templates.

Of course, as the forwarding reference “auto&&” covers the use cases of the standard reference “auto&”, we may ask ourselves whether we should always use the variant with the double “&”. I would not recommend this. The syntax with the double “&” is more confusing. Developers are familiar with the standard reference syntax and expect the forwarding reference syntax in special cases only. So, I recommend using “auto&” outside of templates and “auto&&” within templates.
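
The following sketch shows the generic use case. The function template reset_all is a made-up example; it works for ordinary containers as well as for “std::vector<bool>”, whereas the same loop written with “auto&” would not compile for “std::vector<bool>”.

```cpp
#include <cassert>
#include <vector>

// Hypothetical generic helper: sets every element to its value-initialized
// state (0 for int, false for bool).
template <typename Container>
void reset_all(Container& container)
{
    // auto&& binds to int& for std::vector<int> and to the temporary
    // proxy object yielded by std::vector<bool>.
    for (auto&& element : container)
    {
        element = typename Container::value_type{};
    }
}
```

Assigning through the proxy object changes the bit stored in the vector, so the loop modifies the original elements in both cases.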

const auto&&

This creates a read-only rvalue reference. Unlike “auto&&”, it is not a forwarding reference: because of the “const” qualifier, this variant binds to rvalues only. Read-only access also works for containers which yield a temporary proxy object, because a temporary can bind to “const auto&”. So, in contrast to write access, we don’t need the double “&” variant for containers like “std::vector<bool>” if we only read the elements. There are only a few theoretical use cases for “const auto&&”, and therefore you will normally not use this variant in your applications.
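
As a small sketch of the read-only case (count_set is a hypothetical helper), “const auto&” is sufficient even when the loop yields proxy objects:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: counts the elements that are true.
// The vector is taken by non-const reference on purpose, so that the loop
// yields temporary proxy objects instead of plain bool values.
int count_set(std::vector<bool>& flags)
{
    int count = 0;
    for (const auto& flag : flags)  // a const reference can bind to the temporary proxy
    {
        if (flag) ++count;
    }
    return count;
}
```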

decltype(auto)

This variant should not be used in range-based for loops. “decltype(auto)” is primarily useful for deducing the return type of forwarding functions. Using it to declare a local variable is an antipattern. Therefore, I don’t want to go into detail about “decltype” and just mention it for completeness. Even if the compiler allows you to write “decltype(auto)”, you should not use it in range-based for loops.

Summary

  • Use “auto” when you want to work with a copy of the elements
  • Use “auto&” when you want to modify elements
  • Use “auto&&” when you want to modify elements in generic code
  • Use “const auto&” when you want read-only access to elements
  • Use “const auto” in multithreading scenarios when you need a stable read-only copy of elements that may change concurrently

C# Protected Internal vs Private Protected

C# offers the compound access modifier “protected internal”. With C# 7.2 a new compound access modifier was added: “private protected”. Unfortunately, these modifiers are hard to understand as their names don’t reflect their meaning. In this article I want to explain the two modifiers and their technical background.

Within the CLR you will find the single access modifiers “Family” and “Assembly”. “Family” means that this object or a derived object has access. In C# and many other programming languages this “Family” modifier is implemented with the “protected” keyword. The CLR “Assembly” modifier means that a member is accessible by everyone within the defining assembly. The corresponding implementation in C# is the “internal” keyword. So far, with the single modifiers, it is quite easy. But what if we combine those two modifiers?

Within the CLR it stays simple. It offers two compound access modifiers: “Family and Assembly” and “Family or Assembly”. It combines the single modifiers by “And” to create an intersection or by “Or” to create a union.

“Family and Assembly” will allow access from objects of this assembly in case they are derived objects.

“Family or Assembly” will allow access from every object within the defining assembly and additionally from any derived object outside of the assembly.

In C# things become difficult because the language designers chose an awkward syntax. At first, support for the “Family or Assembly” CLR access modifier was added to C#. But the C# syntax was not “protected or internal”; it was “protected internal” without the “or”. I think that’s a good choice as it keeps things simple. The “or” keyword would be distracting and unnecessary.

But with C# 7.2 the “Family and Assembly” CLR access modifier was to become part of C#. Now the language designers had the problem that the “protected internal” modifier, without an “or” keyword, was already part of the language. So how should they name the new modifier? A syntax like “protected and internal” would be easy to understand on its own, but developers tend to read the existing “protected internal” implicitly as “protected AND internal”. As both modifiers would have nearly the same syntax, there would be a risk that developers use the wrong one by mistake.

So, the language designers decided to use the syntax “private protected”. This is meant to express that we have the well-known protected relationship between base and derived object, and that the additional “private” keyword limits this relationship to derived objects of the same assembly. In my opinion that wasn’t a good decision. Of course, the compound keywords are now different and cannot be mixed up by mistake, but they are awkward and confusing as they don’t reflect their meaning.

But of course, we should not criticize this decision too much because it was a decision between bad alternatives only. There was no way to add the “Family and Assembly” feature cleanly. The crucial mistake had already been made when the “Family or Assembly” feature was added to C#. The language designers chose a syntax without considering the possibility that the “Family and Assembly” feature would be added in the future.


Ref Return and Ref Locals in C# 7

The C# language has supported passing arguments by value or by reference since the first language version. But returning a value was possible by value only. This changed in C# 7 with the introduction of two new features: ref returns and ref locals. With these new features it is possible to return by reference. A ‘ref return’ returns an alias to an existing variable, and a ‘ref local’ can store this alias in a local variable.

The main goal of this language extension is to allow developers to pass around references to value types instead of copies of the values. This is important when working with large data structures implemented as value types. Of course, the new features can be used with reference types too, but reference types are already returned as pointers, and you gain no advantage by returning a reference to such a pointer.

Return by value

The following source code shows an example of a return by value. The function returns the second element of the array. As it is a return by value, a copy of the element is returned. A modification of the returned element is made to the copy and not to the original object instance.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
  new Person() {mName = "John Doe", mAge = 31 },
  new Person() {mName = "Jane Doe", mAge = 27 },
  };

  Person person = GetSecond(persons);
  person.mAge = 41;

  // output:
  // 'John Doe (31)'
  // 'Jane Doe (27)'
  foreach (Person p in persons)
  {
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

struct Person
{
  public string mName;
  public int mAge;
}

static Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) throw new ArgumentException();

  return persons[1];
}

If you want to find and use the original element, you have to implement the method in a different way, for example return the found index and use this index to modify the original list. With the new ‘ref return’ feature the needed behavior becomes very easy to implement: you just change the return from a value to a reference.

Return by reference

The following source code shows the same example, adapted. This time the found list item is returned by reference. The reference is stored in the ref local variable. Changes made to this variable are made to the referenced object instance. So the original list item will be changed.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  ref Person person = ref GetSecond(persons);
  person.mAge = 41;

  foreach (Person p in persons)
  {
    // output:
    // 'John Doe (31)'
    // 'Jane Doe (41)'
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) throw new ArgumentException();

  return ref persons[1];
}

Use a List instead of an Array

Within the previous example I used an array of struct objects. What do you think will happen if we change to another container type, for example to a list?

static void Main(string[] args)
{
  List<Person> persons = new List<Person>
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  ref Person person = ref GetSecond(persons);
  person.mAge = 41;

  foreach (Person p in persons)
  {
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

static ref Person GetSecond(List<Person> persons)
{
  if (persons.Count < 2) throw new ArgumentException();

  // error CS8156: An expression cannot be used in this context
  // because it may not be passed or returned by reference
  return ref persons[1];
}

The code no longer compiles. The ‘return ref persons[1]’ statement results in a compiler error. That’s because the array indexer and the list indexer are implemented differently. The array indexer returns a reference to the array element, so we can use this reference as the return value. The list indexer instead returns a copy of the value. As neither the indexer expression itself nor the returned temporary value may be returned by reference, the compiler shows a corresponding error message. Within the following article you can find further information about this issue.

Ref local

In the example application we have stored the returned reference in a ‘ref local’ variable. A ref local variable is an alias to the original object instance. It is initialized by the ref return value. In C# 7.0 the reference itself is constant after this initialization (C# 7.3 later added reassignment with ‘= ref’). Therefore, a normal assignment to a ref local will not change the reference, but it will change the content of the referenced object.

The following source code shows a corresponding example.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  ref Person person = ref persons[1];
  person.mAge = 41;
  person = persons[0];
  person.mAge = 51;

  foreach (Person p in persons)
  {
    // output:
    // 'John Doe (31)'
    // 'John Doe (51)'
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

At first the ref local is initialized with a reference to the second list item. Then the age is changed to 41. Now we have another assignment which is very interesting. ‘persons[0]’ returns a reference to the first list item. But an assignment to our ref local variable will not change the reference. The reference is set during initialization and stays constant. The assignment changes the value of the referenced object instead. Therefore, the second list item, which is referenced by the ref local variable, is overwritten with the values stored in the first list item, which are ‘John Doe’ and ‘31’. Next we set the age to ‘51’. So, the output is as shown in the comment within the example application. The first list item is not changed at all, and the second list item now has the name stored in the first item and the age set by the last assignment.

Ref vs. Pointer

The previous example and this topic in general may raise the question about the difference between a reference and a pointer. So, we should take a minute and think about this question as it is important for a deep understanding of the ref local and ref return mechanism.

Briefly summarized: references and pointers both refer to an object instance. But references are constant after initialization, whereas pointers can be changed.

This single difference is an important one as it results in different characteristics and possibilities of the two concepts. Below I want to mention some important ones. As references are constant, they cannot be null. Pointers instead can be reassigned, and as a consequence they can also be set to null. You can have pointers to pointers and create extra levels of indirection, whereas references only offer one level of indirection. As pointers can be reassigned, various arithmetic operations can be performed on them, which is called ‘pointer arithmetic’. It is easier to work with references as they cannot be null and you don’t have to think about indirection. But it is not safer to work with references, because pointers as well as references can refer to invalid objects or memory locations.
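
The distinction between assigning through a reference and reassigning a pointer can be sketched in C++, where both concepts exist side by side. This is a comparison in another language, not C# code; the Person type and the function name are made up for illustration.

```cpp
#include <cassert>

struct Person { int age; };  // hypothetical type for this sketch

// Assigning to a reference writes through to the referenced object,
// whereas assigning to a pointer makes the pointer refer to another object.
void reference_vs_pointer(Person& first, Person& second)
{
    Person& ref = second;  // bound once, cannot be re-bound afterwards
    ref = first;           // copies the content of first into second
    ref.age = 51;          // modifies second

    Person* ptr = &second; // points to second
    ptr = &first;          // reassigned: now points to first
    ptr->age = 99;         // modifies first
}
```

Called with a first person of age 31 and a second person of age 27, the function leaves the first person with age 99 and the second with age 51.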

Please look at the following functions and think about the parameter kind: is it a reference or a pointer or a value copy?

Function | Parameter kind
void Foo(MyStruct x) | Copy of the value passed by the caller
void Foo(MyClass x) | Pointer to the original object instance which is available on caller level
void Foo(ref MyStruct x) | Reference to the original object instance which is available on caller level
void Foo(ref MyClass x) | Reference to the pointer to the original object instance which is available on caller level

C# developers often call the last case a “pointer to a pointer”, but in fact it is a reference to a pointer. As the comparison of the two concepts above shows, that’s a small but important difference.

Use ref return without ref local

You can define a method with a ref return and assign the result to a variable which is not a ref local. In this case the content referenced by the method result is copied to the local variable. So your local variable is a copy of the original list item, and changes will therefore not affect the list item.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
      new Person() {mName = "John Doe", mAge = 31 },
      new Person() {mName = "Jane Doe", mAge = 27 },
  };

  Person person = GetSecond(persons);
  person.mAge = 41;

  foreach (Person p in persons)
  {
    // output:
    // 'John Doe (31)'
    // 'Jane Doe (27)'
    Console.WriteLine(p.mName + " (" + p.mAge + ")");
  }
}

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) throw new ArgumentException();

  return ref persons[1];
}

Ref return of method local variable

A ref return creates a reference to an existing object instance. If the lifetime of the object instance is shorter than the lifetime of the reference, the reference will refer to an invalid object or memory location. This would result in critical runtime errors. Therefore, the referenced object instance must live in a higher scope than, or in the same scope as, the ref local variable. You cannot create a method-local object instance and return a reference to it. The following source code shows a corresponding example with the compiler error messages as comments.

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2)
  {
    // error: 
    // An expression cannot be used in this context 
    // because it may not be passed or returned by reference
    return ref new Person();

    // error:
    // Cannot return local 'person' by reference because it is not a ref local
    Person person = new Person();
    return ref person;
  }

  return ref persons[1];
}

Return null

As we already learned, a reference cannot be null. Therefore, a method with a ref return cannot return null. In the examples so far, we have thrown an exception if the parameter is invalid. But exceptions should be thrown in exceptional cases only. In my opinion it is not an exceptional case if we pass a list with fewer than two elements to the ‘GetSecond’ method. So, I don’t want to throw an exception; as no list item is found, I want to return an invalid element instead. For reference types I want to return null and for value types I want to return a default value. But as we have seen, it is neither possible to create a local default value and return it by reference, nor is it possible to return null. It is possible, however, to return a reference to an object instance if the object lives in a higher scope. We can use this possibility and define a default value for an invalid list item.

The following source code shows a corresponding example with a list of value types.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
    new Person() {mName = "John Doe", mAge = 31 }
  };

  ref Person person = ref GetSecond(persons);
  person.mAge = 41;
}

static Person gDefaultPerson = new Person();

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) return ref gDefaultPerson;

  return ref persons[1];
}

And we can adapt the example for a list of reference types.

static void Main(string[] args)
{
  Person[] persons = new Person[]
  {
        new Person() {mName = "John Doe", mAge = 31 }
  };

  ref Person person = ref GetSecond(persons);

  if (person != null) person.mAge = 41;
}

static Person gDefaultPerson = null;

static ref Person GetSecond(Person[] persons)
{
  if (persons.Length < 2) return ref gDefaultPerson;

  return ref persons[1];
}

But there is one very critical design fault in this implementation: the returned default value can be changed. So, the next method call, or a method call by another client, may return a default value with changed content. This may lead to unexpected behavior in the application. But what if we use an immutable object for the default value? This solves the issue and makes the implementation concept usable. So, you must implement an immutable object and return a reference to this constant object instance. With C# 7.2 it is possible to use the readonly modifier for structs and ref returns, which makes it even more comfortable to create and use immutable structs.

‘In’ modifier and ‘Readonly’ modifier for struct and ref return

The code examples of this article were created with C# 7.0. With C# 7.2 you can use two additional features which allow you to write more performant code. These features are the ‘in’ modifier for method parameters and the ‘readonly’ modifier for ref returns and for structs.

Method parameters are often used as input for the method. So, they will not be changed within the method. If you use a struct as a method parameter, it is passed by value. In this case the runtime creates a copy of the struct instance and passes the copy to the method. This language design concept allows you to use the method parameter as a method-local value without any side effect on the original struct instance outside of the method scope. But of course, this comes with the disadvantage of a performance loss, as it may be expensive to create the copy of the struct.

But do we need a copy at all if we just read the parameter values? Of course not! In this case it would be fine to pass a reference to the original struct. But it must be guaranteed that it is used to read values only. This is exactly the idea of the ‘in’ modifier. As with the ‘out’ and ‘ref’ modifiers, the parameter is passed by reference. In addition, an ‘in’ parameter becomes read-only. So, you cannot assign a new value to the parameter. This is comparable to the “pass by const reference” principle in C++.

In theory the ‘in’ modifier is a nice and easy way to improve the performance of method calls with struct parameters. But unfortunately, it isn’t that easy. Depending on the implementation of the struct the compiler must create a copy of the parameter even if you use the ‘in’ modifier. This procedure is called ‘defensive copy’. It is used in case the compiler cannot guarantee that the parameter will not be changed inside the method. Of course, the compiler can prevent direct assignments. But if you call a struct method, the compiler may not know if the member method changes the internal state of the struct. In such situations the defensive copy is created.

To prevent the creation of a defensive copy you can implement an immutable struct. In this case you must use the ‘readonly’ modifier for the struct declaration. A readonly struct cannot be changed. Even member methods cannot change the internal state. If you pass such a readonly struct instance as an in parameter to a method, the compiler knows that the value stays constant and does not have to create a defensive copy.

The ‘readonly’ modifier is moreover available for the ref return value of a method. Of course, the reference itself is constant by definition, so this readonly modifier means that the referenced object instance is constant.

Summary

Ref returns and ref locals help to write more performant code as there’s no need to move copies of values between methods. These enhancements are designed for performance-critical algorithms where minimizing copies and memory allocations is a major factor. For the same reason the ‘in’ modifier and the ‘readonly’ modifier for structs were introduced. Passing readonly structs as in parameters to methods may improve application performance.


Pattern Matching in C# 7

Patterns are used to test whether a value matches a specific expectation and, if it matches, to extract information from the value. You already write such pattern matching by hand with if and switch statements: you test values and, if they match the expectation, you extract and use the value’s information.

With C# 7 we got an extension to the syntax of is expressions and case clauses. This syntax extension allows us to combine the two steps: testing a value and extracting its information.

Introduction

Let’s start with a basic example to see what we are talking about. The following source code shows how to test whether a value is of specific type and then use the value for a console output. The code shows the old and new syntax so you can compare these two implementations. As you can see the new syntax combines the value testing and information extraction in one short statement.

static void Main(string[] args)
{
  WriteValueCS7("abc");
  WriteValueCS6(15);
  WriteValueCS7(18.4);
}

static void WriteValueCS7(dynamic x)
{
  //C# 7
  if (x is int i) Console.WriteLine("integer: " + i);
  else if (x is string s) Console.WriteLine("string: " + s);
  else Console.WriteLine("not supported type");
}

static void WriteValueCS6(dynamic x)
{
  //C# 6
  if (x is int)
  {
    var i = (int)x;
    Console.WriteLine("integer: " + i);
  }
  else if (x is string)
  {
    var s = x as string;
    Console.WriteLine("string: " + s);
  }
  else
  {
    Console.WriteLine("not supported type");
  }
}

The example shows pattern matching used in an is expression to do a type check. The new pattern matching syntax is also supported in case clauses, and it allows three different types of patterns: the type pattern, the const pattern and the var pattern. We will see these different possibilities in the next paragraphs.

Type Pattern

We have already seen the type pattern in the previous example. It is used to check whether a value is of a specific type. If the type matches, a new variable of this type is created and can be used to extract the value information. If the value is null, the type check always returns false. The following source code shows a corresponding example.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;

  WriteValue(a);  // output: 'string: abc'
  WriteValue(b);  // output: 'not supported type'
  WriteValue(c);  // output: 'integer: 15'
}

static void WriteValue(dynamic x)
{
  if (x is int i) Console.WriteLine("integer: " + i);
  else if (x is string s) Console.WriteLine("string: " + s);
  else Console.WriteLine("not supported type");
}

Const Pattern

The pattern matching can be used to check whether the value matches a constant. Within this pattern you cannot create a new variable with the value information as the value already matches a constant and can be used as it is.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;
  int d = 17;

  WriteValue(a);  // output: 'const: abc'
  WriteValue(b);  // output: 'const: null'
  WriteValue(c);  // output: 'const: 15'
  WriteValue(d);  // output: 'unknown'
}

static void WriteValue(dynamic x)
{
  if (x is 15) Console.WriteLine("const: 15");
  else if (x is "abc") Console.WriteLine("const: abc");
  else if (x is null) Console.WriteLine("const: null");
  else Console.WriteLine("unknown");
}

Var Pattern

The var pattern is a special case of the type pattern with one major distinction: the pattern matches any value, even if the value is null. Below is the example previously used for the type pattern, extended with the var pattern.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;

  WriteValue(a);  // output: 'string: abc'
  WriteValue(b);  // output: 'not supported type'
  WriteValue(c);  // output: 'integer: 15'
}

static void WriteValue(dynamic x)
{
  if (x is int i) Console.WriteLine("integer: " + i);
  else if (x is string s) Console.WriteLine("string: " + s);
  else if (x is var v) Console.WriteLine("not supported type");
}

If we look at this example, we may ask two critical questions: Why do we have to specify a temporary variable for the var pattern if we don’t use it? And why do we use the var pattern at all if it is the same as the empty (default) else statement?

The first question is easy to answer. If we use the var pattern and don’t need the target variable, we can use the discard “_”, which was also introduced with C# 7.

The second question is more difficult. As described, the var pattern always matches. So, it represents a default case, which is the empty else in an if-else statement. Therefore, if we just want to write the default else case, we should not use the var pattern at all. But the var pattern proves to be practical when we want to distinguish between different groups of default cases. The following code shows a corresponding example. It uses more than one var pattern to handle the default case in more detail. As mentioned above, the last var pattern is unnecessary and you can write an empty else. I used the var pattern anyway to show you how to use the discard character.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;
  double d = 17.5;
  Guid e = Guid.NewGuid();

  WriteValue(a);  // output: 'string: abc'
  WriteValue(b);  // output: ''null' is not supported'
  WriteValue(c);  // output: 'integer: 15'
  WriteValue(d);  // output: 'not supported primitive type'
  WriteValue(e);  // output: 'not supported type'
}

static void WriteValue(dynamic x)
{
  if (x is int i) Console.WriteLine("integer: " + i);
  else if (x is string s) Console.WriteLine("string: " + s);
  else if ((x is var v) && (v == null)) Console.WriteLine("'null' is not supported");
  else if ((x is var o) && (o.GetType().IsPrimitive)) Console.WriteLine("not supported primitive type");
  else if (x is var _) Console.WriteLine("not supported type");
}

Switch-case

At the beginning of the article I mentioned that pattern matching can be used in if statements and switch statements. Now we know the three types of patterns and have used them in if statements. Next we will see how to use the patterns in switch statements.

The switch statement was already a form of pattern matching, but it supported the const pattern only and was limited to numeric types and the string type. With C# 7 those restrictions have been removed. Now the switch statement supports pattern matching, and therefore all three patterns can be used. Furthermore, a variable of any type may be used in a switch statement.

The new possibilities have a side effect which made it necessary to change the behavior of the switch statement. So far, the switch statement supported the const pattern only, and therefore the case clauses were unique. With the new pattern matching the case clauses can overlap and may not be unique anymore. Therefore, the order of the case clauses matters. For example, the compiler emits an error if a previous clause matches a base type and a later clause matches a derived type. Because of the possibly overlapping case clauses, each case must end with a break or return. This prevents code execution from “falling through” from one case clause to the next.

The following example shows the type pattern used in a switch statement.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;

  WriteValue(a);  // output: 'string: abc'
  WriteValue(b);  // output: 'not supported type'
  WriteValue(c);  // output: 'integer: 15'
}

static void WriteValue(dynamic x)
{
  switch (x)
  {
    case int i: Console.WriteLine("integer: " + i); break;
    case string s: Console.WriteLine("string: " + s); break;
    default: Console.WriteLine("not supported type"); break;
  }
}

Switch-case with predicates

Another feature related to pattern matching is the ability to use predicates within the switch-case-statement. Within a case-clause a when-clause can be used to do more specific checks.

The following source code shows the use case we have already seen in the var pattern example. But this time we use the switch statement with when clauses instead of the if statement.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;
  double d = 17.5;
  Guid e = Guid.NewGuid();

  WriteValue(a);  // output: 'string: abc'
  WriteValue(b);  // output: ''null' is not supported'
  WriteValue(c);  // output: 'integer: 15'
  WriteValue(d);  // output: 'not supported primitive type'
  WriteValue(e);  // output: 'not supported type'
}

static void WriteValue(dynamic x)
{
  switch (x)
  {
    case int i: Console.WriteLine("integer: " + i); break;
    case string s: Console.WriteLine("string: " + s); break;
    case var v when v == null: Console.WriteLine("'null' is not supported"); break;
    case var o when o.GetType().IsPrimitive: Console.WriteLine("not supported primitive type"); break;
    default: Console.WriteLine("not supported type"); break;
  }
}

Scope of pattern variables

A variable introduced by a type pattern or var pattern in an if-statement is lifted to the outer scope. This leads to strange compiler behavior. On the one hand, it is not meaningful to use the variable outside the if-statement because it may not be initialized. On the other hand, the compiler behaves differently for an if-statement and an else-if-statement. But maybe this strange behavior will be fixed in a future compiler version. The following source code shows a corresponding example with the compiler errors as comments.

static void Main(string[] args)
{
  string a = "abc";
  string b = null;
  int c = 15;

  WriteValue(a);  // output: 'string: abc'
  WriteValue(b);  // output: 'not supported type'
  WriteValue(c);  // output: 'integer: 15'
}

static void WriteValue(dynamic x)
{
  if (x is int i) Console.WriteLine("integer: " + i);
  else if (x is string s) Console.WriteLine("string: " + s);
  else Console.WriteLine("not supported type");

  Console.WriteLine(i); // error: Use of unassigned local variable 'i'
  i = 15; // ok      

  // s = "abc";  // error: The name 's' does not exist in the current context
  string s = "abc"; // error: 's' cannot be declared in this scope because that name is used in a local or parameter
}

Pattern variables created inside a case-clause are only valid within the case-clause. They are not lifted outside the switch-case scope. In my opinion this leads to a clean separation of concerns and it would be nice to have the same behavior in if-statements.

Summary

Pattern matching is a powerful concept. The pattern matching possibilities introduced with C# 7 offer nice ways to write complex if-statements and switch-statements in a clean way. The patterns introduced so far are just some basic ones, and for C# 8 it is planned to add more advanced ones like the recursive pattern, the positional pattern and the property pattern. So, this programming concept is not just syntactic sugar; it will become an important concept in C# and brings more and more functional programming techniques to the language.

Posted in .NET, C#

C++17: initializers in if-statement and switch-statement

With C++17 it is possible to initialize a variable inside an if-statement and a switch-statement. We already know and use this concept in the for-statement. To be honest: I don’t like this feature. In this article I want to introduce this feature and explain my doubts. In the following I will write about the if-statement only, because everything also applies to the switch-statement and so it is sufficient to show one of the two.
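
For completeness, here is a minimal sketch of the same C++17 initializer syntax in a switch-statement. The names ReadStatus and DescribeStatus and the returned values are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Hypothetical helper, only for illustration.
int ReadStatus() { return 2; }

std::string DescribeStatus()
{
	// C++17: 'status' is initialized inside the switch-statement
	// and is visible only within it.
	switch (int status = ReadStatus(); status)
	{
		case 0:  return "idle";
		case 2:  return "running (" + std::to_string(status) + ")";
		default: return "unknown";
	}
}
```

With the stub above, DescribeStatus() yields the text for status 2. The variable status cannot be used after the closing brace of the switch-statement.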

The new syntax with the initializer inside the if-statement comes with a big improvement: the scope of the variable is limited to the if-statement, including its else-branches. An important software design concept is to use the smallest possible scope, and the new syntax helps to implement this design concept.

But you must pay dearly for this advantage. As the initialization moves into the if-statement, initialization and comparison get mixed up. This violates two other software design concepts, namely “separation of concerns” and “keep it simple”. Depending on the complexity of the initialization and the comparison you may create a very complex if-statement. This may result in hard-to-read and error-prone code. Only if you have a very simple initialization and a very simple comparison may the combination of both stay simple as well. In all other cases I recommend avoiding the new feature and clearly separating the initialization and the comparison in order to increase the code readability.
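
To illustrate the “very simple initialization and very simple comparison” case, here is a small sketch with a map lookup. The function name CountOrZero and the map content are assumptions made up for this example; only the if-statement syntax itself is the C++17 feature discussed here.

```cpp
#include <map>
#include <string>

// Hypothetical lookup helper, only for illustration.
int CountOrZero(const std::map<std::string, int>& counts, const std::string& key)
{
	// Simple initialization, simple comparison:
	// 'it' exists only inside the if-statement.
	if (auto it = counts.find(key); it != counts.end())
	{
		return it->second;
	}

	// 'it' is no longer visible here; using it would be a compiler error.
	return 0;
}
```

In such a case the iterator is needed for nothing but the check and the access, so limiting its scope to the if-statement costs hardly any readability.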

Let’s have a look at a simple example. The following source code shows an if-statement with an included variable initialization and the same if-statement with a separation of the initialization and the comparison. Furthermore, just for fun, I removed the line breaks for the second example to compare it with the new syntax.

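// init inside if
if (int count = CalcCount(); count > 100)
{
	std::cout << "count: " << count << std::endl;
}

// init outside if
int count = CalcCount();
if (count > 100)
{
	std::cout << "count: " << count << std::endl;
}

// init outside if, without line break
int count = CalcCount(); if (count > 100)
{
	std::cout << "count: " << count << std::endl;
}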

If we compare the first and the second implementation, that is the new and the classical syntax, we could say that the differences are small. In my opinion both variants are easy to read. Even the third one, with classical syntax but without line breaks, may be easy to read, even if it is unusual. But you can see that the new syntax isn’t that different from the classical one without the line break: just the if-statement moved to the front. Of course, this little change increased the readability a lot.

So, we must look at a more complex example. Let’s see how things look if we increase the complexity of the initialization, but leave the comparison as simple as before.

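// init inside if
if (int count = IsInitialized() ? CalcCount() : (CalcExpectedCount() + CalcOldCount()) / 2; count > 100)
{
	std::cout << "count: " << count << std::endl;
}

// init outside if
int count = IsInitialized() ? CalcCount() : (CalcExpectedCount() + CalcOldCount()) / 2;
if (count > 100)
{
	std::cout << "count: " << count << std::endl;
}

// init outside if, standard and fallback initialization separated
// (reconstructed sketch, the original listing is incomplete here)
int count = 0;
if (IsInitialized())
	count = CalcCount();
else
	count = (CalcExpectedCount() + CalcOldCount()) / 2;

if (count > 100)
{
	std::cout << "count: " << count << std::endl;
}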

First you can see the new syntax. In my opinion this if-statement is very hard to read. You have to stop at this line of code, read it several times and look closely to understand the meaning of the code.

The second implementation separates the initialization and the comparison. I think this will make it a little bit easier to read the code.

The third example clearly separates the different concerns. We have a standard initialization, an initialization for fallback cases and a comparison. This source code is easy to read. You never have to stop at a line of code and read it again to understand it. The complex initialization and comparison are split into simple parts.
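
One possible way to keep the small scope without mixing the concerns is to move the complex initialization into a function of its own, so that the if-statement stays short. This is only a sketch under assumptions: DetermineCount and Report are hypothetical names, and the bodies of IsInitialized, CalcCount, CalcExpectedCount and CalcOldCount are placeholders with made-up return values.

```cpp
#include <string>

// Placeholder implementations, assumed only to make the sketch self-contained.
bool IsInitialized()     { return false; }
int  CalcCount()         { return 50; }
int  CalcExpectedCount() { return 180; }
int  CalcOldCount()      { return 120; }

// Hypothetical helper which hides the standard/fallback decision.
int DetermineCount()
{
	if (IsInitialized())
	{
		return CalcCount();
	}

	return (CalcExpectedCount() + CalcOldCount()) / 2;
}

std::string Report()
{
	// The if-statement is short again and 'count' stays scoped to it.
	if (int count = DetermineCount(); count > 100)
	{
		return "count: " + std::to_string(count);
	}

	return "count too small";
}
```

Whether this is better than the classical syntax is a matter of taste. It simply turns a complex initialization into a simple one, which is the case where the new syntax works well.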

Summary

Initializers in if-statements and switch-statements allow a clear assignment of the variable to the scope of the statement. But mixing the two concerns of initialization and comparison will often result in complex code. Therefore, in my opinion, the new syntax should be used with caution. If the initialization as well as the comparison is short and simple the resulting combination of both may be simple too and in this case the new syntax should be used.

Posted in C++

Name hiding in inheritance

The C++ name hiding rules for variables are well known by software developers. In contrast, name hiding in inheritance sometimes leads to issues although it follows the same rules. Such issues are therefore not a result of the rules but an effect due to different expectations in these different programming scenarios.

Name hiding rules for variables

Let’s start with the well-known rules for variables. The following example shows a typical scenario.

#include <iostream>

double x;
	
int main()
{
	int x;

	x = 5.5;	// conversion from double to int

	std::cout << x << std::endl;	// prints 5

	::x = 8.8;	// changes the global x

	std::cout << x << std::endl;	// prints 5
	
	return 0;
}

In this example you see a double variable in the global scope and an integer variable in the local scope. As they have the same name, the local variable hides the global one, even though they have different types. If we want to access the double variable, we must qualify the name with the global scope operator “::”.

Name hiding rules in inheritance

Next, let us try the same name hiding within an inheritance scenario.

#include <iostream>

class Base
{
public:
	void Calc(double x) { std::cout << "Base calc was called" << std::endl; }
};

class Derived : public Base
{
public:
	void Calc(int x) { std::cout << "Derived calc was called" << std::endl; }
};

int main()
{
	Derived d;

	d.Calc(3.5);	// derived is called, conversion of double to int
	d.Calc(8);		// derived is called

	return 0;
}

In this example we have a function with a double parameter in the base class and a function with an integer parameter in the derived class. The same name hiding rules as in the previous example apply. The function in the scope of the derived class hides the function in the scope of the base class, even though the function parameters are different.

Some developers are surprised by this behavior as they expect that the public functions of the base class become functions of the derived class too, so that the function call is resolved with respect to the types of the function parameters. That’s a valid and comprehensible expectation because from a software architectural point of view public inheritance represents an “is-a” relationship.

From a technical point of view the different kinds of inheritance just result in different visibilities of interfaces. Therefore, name hiding should be seen with respect to this technical point of view. So, the behavior of the example application is correct.
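
The hidden function is still a member of the derived object; it is just not found by the unqualified name lookup, which stops at the first scope that contains the name. A qualified call reaches it. The following sketch shows this. As an assumption for easier checking, the functions return a text instead of printing it, and CallUnqualified and CallQualified are hypothetical helper names.

```cpp
#include <string>

class Base
{
public:
	std::string Calc(double) { return "Base calc"; }
};

class Derived : public Base
{
public:
	std::string Calc(int) { return "Derived calc"; }	// hides Base::Calc
};

// Unqualified call: only Derived::Calc(int) is found, 3.5 is converted to int.
std::string CallUnqualified() { Derived d; return d.Calc(3.5); }

// Qualified call: the lookup starts in Base, so Base::Calc(double) is used.
std::string CallQualified() { Derived d; return d.Base::Calc(3.5); }
```

So the base class function is not removed. The name lookup finds the name in the derived class first and never looks into the base class.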

Make hidden names visible

Of course, there are many use cases where you want to keep the hidden names visible. For example, in public inheritance scenarios you normally want the base class interface to be visible, and hiding it should be a rare case. C++ supports this: with a “using” declaration the hidden names become visible again. The following example shows the corresponding modification of the derived class. This time the base class function is called if we pass a double parameter.

#include <iostream>

class Base
{
public:
	void Calc(double x) { std::cout << "Base calc was called" << std::endl; }
};

class Derived : public Base
{
public:
	using Base::Calc;	// make base class function name visible in derived class

	void Calc(int x) { std::cout << "Derived calc was called" << std::endl; }
};

int main()
{
	Derived d;

	d.Calc(3.5);	// base is called
	d.Calc(8);		// derived is called

	return 0;
}

Make specific hidden name visible

Within the previous example we have seen the “using” declaration to make hidden names visible. As we already learned, names are independent of data types. Therefore, if the base class implements several functions with the same name, all of these overloads become visible. The following source code shows a corresponding example. Furthermore, I have changed to private inheritance to show that the concept is independent of the kind of inheritance.

#include <iostream>
#include <string>

class Base
{
public:
	void Calc(double x) { std::cout << "Base calc double was called" << std::endl; }

	void Calc(std::string x) { std::cout << "Base calc string was called" << std::endl; }
};

class Derived : private Base
{
public:
	using Base::Calc;	// make base class function name visible in derived class

	void Calc(int x) { std::cout << "Derived calc was called" << std::endl; }
};

int main()
{
	Derived d;

	d.Calc(3.5);	// base is called
	d.Calc(8);		// derived is called

	d.Calc("abc");	// base string function is called

	return 0;
}

From a technical point of view the example looks fine. But from a software architectural point of view you may argue that it is bad design to make the private interface public in the derived class. And you are totally right. Sometimes such design decisions are made anyway, for various reasons. In this case you want to keep the design fault as small as possible and make only one or a few of the available functions public. You can do this by using a forwarding function instead of the “using” declaration: the derived class declares its own function which simply calls the base class function. The following source code shows the adapted example.

#include <iostream>
#include <string>

class Base
{
public:
	void Calc(double x) { std::cout << "Base calc double was called" << std::endl; }

	void Calc(std::string x) { std::cout << "Base calc string was called" << std::endl; }
};

class Derived : private Base
{
public:
	void Calc(double x) { Base::Calc(x); }	// forwards to the base class function

	void Calc(int x) { std::cout << "Derived calc was called" << std::endl; }
};

int main()
{
	Derived d;

	d.Calc(3.5);	// base is called
	d.Calc(8);		// derived is called

	d.Calc("abc");	// compiler error

	return 0;
}
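
As a possible alternative to the forwarding function, the “using” declaration can be kept and the unwanted overload can be removed again by declaring it as deleted in the derived class. This is a different technique than the one shown above and only a sketch: the functions return a text instead of printing it, and CallWithDouble and CallWithInt are hypothetical helper names.

```cpp
#include <string>

class Base
{
public:
	std::string Calc(double) { return "Base calc double"; }
	std::string Calc(std::string) { return "Base calc string"; }
};

class Derived : private Base
{
public:
	using Base::Calc;	// make all Base::Calc overloads visible ...
	std::string Calc(std::string) = delete;	// ... but remove the string overload again

	std::string Calc(int) { return "Derived calc"; }
};

std::string CallWithDouble() { Derived d; return d.Calc(3.5); }	// Base::Calc(double)
std::string CallWithInt()    { Derived d; return d.Calc(8); }	// Derived::Calc(int)

// Derived d; d.Calc("abc");	// compiler error: the selected overload is deleted
```

This variant fits if most of the base class overloads should stay visible. If only one or two of many overloads should be public, the forwarding function is the shorter solution.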

Summary

Names in derived classes hide names of base classes. This behavior is correct from a technical point of view. But in the case of public inheritance it contradicts our expectations from a software architectural point of view. Fortunately, we can easily make the hidden names visible again with the “using” declaration or with forwarding functions.

Posted in C++

Expression Bodied Members in C# 7

The concept of Expression Bodied Members (EBM) was introduced with C# 6 and, as it became popular, many enhancements were added with C# 7. In this article I want to give you the full picture of this feature, so I explain both the C# 6 and the C# 7 EBM features.

With C# 6 the following EBM were introduced:

  • Expression bodied Methods
  • Expression bodied Properties (read-only)
  • Expression bodied Indexer (read-only)
  • Expression bodied Operators Overloading

With C# 7 the following EBM were added:

  • Expression bodied Property Getter
  • Expression bodied Property Setter
  • Expression bodied Indexer Getter and Setter
  • Expression bodied Constructor
  • Expression bodied Destructor (Finalizer)

Syntax

Member methods as well as property getters and property setters are sometimes implemented with a single instruction. In such cases the syntax overhead, like braces or getter and setter syntax, is larger than the syntax for the real functionality. EBM reduce this syntax overhead and therefore bring the focus back on the real functionality. This increases the readability of the source code. EBM use a syntax which is already known from lambda expressions: the “=>” sign. In contrast to lambda expressions, you are limited to a single expression. I think that’s a really good design decision because the EBM syntax only makes sense in such special cases. If your class members, like constructors, property getters or methods, contain several instructions, the block syntax used so far is more suitable. The instructions which belong together are written into one block, and therefore you can easily see that they belong together. But if you have only one instruction, there is nothing which must be grouped. So, in this case it makes sense to leave out the block syntax and use a more lightweight style.

Examples

The following paragraphs show examples for each EBM type. As the examples are quite easy and self-explanatory, you will not find further descriptions or explanations. But at the end of the article you will find a summary and my personal thoughts about the EBM feature.

Each example contains the same implementation twice: one time in EBM syntax and one time in standard block syntax. This allows an easy comparison of both implementation styles. But of course, if you want to compile the source code you must comment out one of the two implementations.

Expression bodied Methods

static void Main(string[] args)
{
  MyClass myClass = new MyClass();

  int result = myClass.Sum(3, 5);

  Console.WriteLine(result);
}

class MyClass
{
  // C# 6
  public int Sum(int a, int b) => a + b;

  // C# 5
  public int Sum(int a, int b)
  {
    return a + b;
  }
}

Expression bodied Properties

static void Main(string[] args)
{
  MyClass myClass = new MyClass();

  Console.WriteLine(myClass.Value);
}

class MyClass
{
  // C# 6
  public int Value => mValue;

  // C# 5
  public int Value
  {
    get { return mValue; }
  }

  private int mValue = 12;
}

Expression bodied Property Getter and Setter

static void Main(string[] args)
{
  MyClass myClass = new MyClass();

  myClass.Value = 42;
  Console.WriteLine(myClass.Value);
}

class MyClass
{
  // C# 7
  public int Value
  {
    get => mValue;
    set => mValue = value;
  }

  // C# 6
  public int Value
  {
    get { return mValue; }
    set { mValue = value; }
  }

  private int mValue = 12;
}

Expression bodied Indexer

static void Main(string[] args)
{
  MyClass myClass = new MyClass();

  Console.WriteLine(myClass[2]);
}

class MyClass
{
  // C# 6
  public int this[int index] => mValues[index];

  // C# 5
  public int this[int index]
  {
    get { return mValues[index]; }
  }

  private int[] mValues = new int[] { 11, 12, 13, 14 };
}

Expression bodied Operators Overloading

static void Main(string[] args)
{
  MyClass myClass = new MyClass();

  myClass.Value = 15;
  myClass++;

  Console.WriteLine(myClass.Value);
}

class MyClass
{
  // C# 6
  public static MyClass operator ++(MyClass myClass) => new MyClass() { Value = myClass.Value + 1 };

  // C# 5
  public static MyClass operator ++(MyClass myClass)
  {
    return new MyClass() { Value = myClass.Value + 1 };
  }

  public int Value { get; set; }
}

Expression bodied Constructor and Destructor (Finalizer)

static void Main(string[] args)
{
  MyClass myClass = new MyClass();
}

class MyClass
{
  // C# 7
  public MyClass() => Init();
  ~MyClass() => CleanUp();

  // C# 6
  public MyClass() { Init(); }
  ~MyClass() { CleanUp(); }

  private void Init() { }
  private void CleanUp() { }
}

Summary and Assessment

As you can see in the examples, the source code becomes more readable as unnecessary syntax overhead is removed. From my point of view EBM is a quite nice feature. But of course, you should not overuse it. EBM should only be used if you have a single, simple instruction. Furthermore, I don’t like to use EBM in a ctor or finalizer in cases where you want to manage a single resource only. This feels a little bit inappropriate, and most often you have more than one resource within a class. If you just want to call a single method in the ctor or finalizer, EBM is still fine.

Disadvantage of EBM

I think there is one minor disadvantage of EBM. The used ‘=>’ sign looks nearly like the ‘=’ sign. If you mix up these two signs you may write source code with another behavior than expected. The following example shows such an issue.

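// Note: this listing is truncated in the source. The loop bound, the loop body and
// the member names Value1 and Value2 are assumptions reconstructed from the text below.
static void Main(string[] args)
{
  MyClass x = new MyClass();

  for (int i = 0; i < 1000; i++)
  {
    var a = x.Value1;  // same instance on every access
    var b = x.Value2;  // new instance on every access
  }
}

class MyClass
{
  public MyLargeClass Value1 { get; set; } = new MyLargeClass();  // initializer
  public MyLargeClass Value2 => new MyLargeClass();               // expression bodied getter
}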

class MyLargeClass
{
  // ...
}

Both implementations look nearly the same but behave differently. One is an initializer, the other one is a getter. So, in one case the value has a getter only and in the other case a getter and setter. Furthermore, one getter will always return the same object instance and the other one will always create a new object instance. Within the shown example this may result in huge performance differences depending on the kind of the returned object. Of course, this is a rare issue and it will not result in runtime errors. So, this theoretical disadvantage should not stop you from using EBM.

Posted in .NET, C#