Linq vs Loop: Join

As in the previous article of this series, I want to compare Linq with a classical loop. This time we look at data objects which have to be joined to create a result. Again we use two kinds of input: validated clean data, and raw data which may include null references.

Use case one: clean data

Let’s start with the first use case. We have a simple data class for a person and a data class for an address. The data classes are linked together by the AddressIdentifier property.

Out of a list of persons and addresses we want to find a specific person by name. The result shall contain the person name and address. To keep it simple we just look for the first person with the name. If the list does not contain the data we are looking for, we shall return a default person and address.

The data classes are defined as follows:

public class Person
{        
    public string Name { get; set; }
    public uint Age { get; set; }
    public uint AddressIdentifier { get; set; }

    public static readonly Person Default = new Person()
    {
        Name = "new person",
        Age = 0,
        AddressIdentifier = 0
    };        
}

public class Address
{
    public uint AddressIdentifier { get; set; }

    public string City { get; set; }

    public static readonly Address Default = new Address()
    {
        AddressIdentifier = 0,
        City = "new city"            
    };
}

 

Our demo console application creates a list of data and calls the data query method, first with an existing person and then with a nonexistent one.

List<Person> persons = new List<Person>();

persons.Add(new Person() { Name = "John Doe", Age = 35, AddressIdentifier = 1 });
persons.Add(new Person() { Name = "Jane Doe", Age = 41, AddressIdentifier = 1 });

List<Address> addresses = new List<Address>();
addresses.Add(new Address() { AddressIdentifier = 1, City = "Chicago" });

//---------

string information;

//search existing person
information = GetPersonInformation(persons, addresses, "Jane Doe");
Console.WriteLine(information);

//search not existing person
information = GetPersonInformation(persons, addresses, "???");
Console.WriteLine(information);

Console.ReadKey();

The data query method shall be implemented twice: by using a loop and by using Linq. We start with the classical loop:

static private string GetPersonInformation(List<Person> persons, List<Address> addresses, string name)
{
    Person actualPerson = Person.Default;
    Address actualAddress = Address.Default;

    foreach (Person person in persons)
    {
        if (string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase))
        {
            actualPerson = person;
            break;
        }
    }

    foreach (Address address in addresses)
    {
        if (actualPerson.AddressIdentifier == address.AddressIdentifier)
        {
            actualAddress = address;
            break;
        }
    }

    return actualPerson.Name + ", " + actualAddress.City;
}

And we implement the same function by using Linq.

static private string GetPersonInformation(List<Person> persons, List<Address> addresses, string name)
{
    var result = from person in persons
                    join address in addresses
                    on person.AddressIdentifier equals address.AddressIdentifier
                    where string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase)
                    select new
                    {
                        Name = person.Name,
                        City = address.City
                    };

    var element = result
        .DefaultIfEmpty(new { Name = Person.Default.Name, City = Address.Default.City })
        .First();

    return element.Name + ", " + element.City;
}

 

Code Review for use case one

The query using the loops is easy to understand and contains clean code. You may think about extracting both loops into separate methods, one to find a person and one to find an address. This refactoring gives you three very short and simple methods, but at the price of a slightly more complex structure. Therefore, in my opinion, a single method with both loops is fine too.

The Linq query is easy to understand too. You have to know some details about Linq, though. For example, the need for the DefaultIfEmpty statement may not be clear at first glance. Therefore it is helpful to add some comments to the query to explain why certain statements are needed.

I don’t favor either of the two implementations. From my point of view they are equivalent.

Use case two: dirty data

The second use case adds an important requirement: the query must be robust. The data may, for example, contain null values. As in the first use case, the method shall return default data if the person we are looking for is not found. Null values or an uninitialized list shall not throw an error. In these cases the default data shall be returned as well.

In our test console application we create some dirty data. And we add additional tests to call the function with the data or even with null parameters.

List<Person> persons = new List<Person>();

persons.Add(new Person() { Name = "John Doe", Age = 35, AddressIdentifier = 1 });
persons.Add(null);
persons.Add(new Person() { Name = null, Age = 38, AddressIdentifier = 2 });
persons.Add(new Person() { Name = "Jane Doe", Age = 41, AddressIdentifier = 3 });
persons.Add(new Person() { Name = "Jane Foe", Age = 41, AddressIdentifier = 4 });

List<Address> addresses = new List<Address>();
addresses.Add(new Address() { AddressIdentifier = 1, City = "Chicago" });
addresses.Add(new Address() { AddressIdentifier = 2, City = null });
addresses.Add(null);
addresses.Add(new Address() { AddressIdentifier = 3, City = "Chicago" });            

//---------

string information;

//search existing person
information = GetPersonInformation(persons, addresses, "Jane Doe");
Console.WriteLine(information);

information = GetPersonInformation(persons, addresses, "Jane Foe");
Console.WriteLine(information);

//search not existing person
information = GetPersonInformation(persons, addresses, "???");
Console.WriteLine(information);

//search in a list which is not yet initialized
information = GetPersonInformation(null, addresses, "???");
Console.WriteLine(information);

information = GetPersonInformation(persons, null, "???");
Console.WriteLine(information);

information = GetPersonInformation(null, null, "???");
Console.WriteLine(information);

Console.ReadKey();  

The query using the loop must be adapted to handle all these special cases. The following source code shows a corresponding implementation. The lists and the list contents are checked for null values.

static private string GetPersonInformation(List<Person> persons, List<Address> addresses, string name)
{
    Person actualPerson = Person.Default;
    Address actualAddress = Address.Default;

    if (persons != null)
    {
        foreach (Person person in persons)
        {
            if (person == null)
            {
                continue;
            }

            if (string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase))
            {
                actualPerson = person;
                break;
            }
        }
    }

    if (addresses != null)
    {
        foreach (Address address in addresses)
        {
            if (address == null)
            {
                continue;
            }

            if (actualPerson.AddressIdentifier == address.AddressIdentifier)
            {
                actualAddress = address;
                break;
            }
        }
    }

    return actualPerson.Name + ", " + actualAddress.City;
}

 

The implementation of the Linq query must be adapted too. A check of the whole list, as well as of each single element, is added.

static private string GetPersonInformation(List<Person> persons, List<Address> addresses, string name)
{
    if(persons == null)
    {
        persons = new List<Person>();
    }

    if(addresses == null)
    {
        addresses = new List<Address>();
    }

    var result = from person in persons.Where(p => p != null)
                    join address in addresses.Where(a => a != null) 
                    on person.AddressIdentifier equals address.AddressIdentifier                         
                    where string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase)
                    select new
                    {
                        Name = person.Name,
                        City = address.City
                    };

    var element = result
        .DefaultIfEmpty(new { Name = Person.Default.Name, City = Address.Default.City })
        .First();

    return element.Name + ", " + element.City;
}      

Code Review for use case two

The method containing the loops gets more complex with all the if-statements. Therefore you should extract the two loops into separate methods, one looking for a person and one looking for an address. With this little refactoring the loop implementation becomes very easy to understand.

The Linq implementation has not changed much. Before the query is executed, some data checks are done. But there is one detail: the additional small queries within the in-parts. These queries are needed to remove null objects. There are several ways to refactor this implementation. You may extract the nested queries or the data checks. Or, if you want to leave the complex query as it is, you should add comments to explain it. Note also that the join only returns persons which have a matching address. For “Jane Foe”, whose address identifier 4 is not in the address list, the loop returns the person name with the default city, whereas the Linq query returns the default person and the default city.

Without refactoring I don’t like either of these two implementations, as both are somewhat complex. I would prefer two separate query methods, one for the person and one for the address, plus an additional managing method which calls these two query methods and joins the results. The single query methods as well as the join can be implemented with simple Linq statements.

Posted in .NET, C#, LINQ

RValue Reference Declarator: &&

The rvalue reference is a nice C++ feature for writing efficient source code. In this article I want to explain what is meant by an rvalue and how you can use the reference declarator. Furthermore you will learn how to use this feature to implement a move constructor and functions which move their parameters.

 

LValue and RValue

In the earliest days of C an lvalue was defined as an expression that may appear on the left or on the right hand side of an assignment, whereas an rvalue is an expression that can only appear on the right hand side of an assignment. For example, within the assignment “int a = 21;” the expression “a” is an lvalue and “21” is an rvalue. Of course the lvalue “a” may also be placed on the right hand side of an assignment. For example, in the assignment “int b = a” both expressions are lvalues.

In C++ this definition is still useful as a first intuitive approach. But we can add another point of view: Lvalues are named objects that persist beyond a single expression and rvalues are unnamed temporaries that evaporate at the end of the expression.

The following example will show some lvalues and rvalues.

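#include <string>
#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[])
{
  // initialized so that reading x and y below is well defined
  int x = 0;
  int y = 0;
  
  // lvalues
  std::string a;
  std::string* b = &a;
  ++x;

  // rvalues
  123;
  x + y;
  std::string("rvalue");
  x++;

  return 0;
}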

 

The first two lvalue statements are very intuitive. They create the named objects “a” and “b”, which are lvalues. The first three rvalue examples are easy to understand too. “123” and the result of “x + y” are unnamed objects which are no longer accessible after the end of the expression. The same is true for the created string “rvalue”: a string object is created, but it is not named and not accessible after the expression. But you may be astonished when looking at the increment operators. Why is “++x” an lvalue and “x++” an rvalue?

The expression “++x” is an lvalue because it modifies the persistent object and then names it. “x++”, in contrast, creates a copy of the persistent object, increases the object’s value and returns the copy. Therefore the expression “x++” returns an unnamed, non-persistent object: an rvalue.

This little example shows a very important aspect of the difference between lvalues and rvalues. It is not about what an expression does, it is about what an expression names: something persistent, or something non-persistent which exists only temporarily within the expression. And as persistent objects have an address, you can also say: if you can take the address of an expression it is an lvalue, and if you cannot it is an rvalue. For example “&++x” is valid whereas “&x++” is not.

In summary you can use the following definition: “An lvalue is an expression that refers to a memory location and allows us to take the address of that memory location via the & operator. An rvalue is an expression that is not an lvalue.”
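The address test described above can also be checked by the compiler. The following is a minimal sketch, not part of the original example; the function names are made up for illustration. It relies on the fact that decltype applied to a parenthesized expression yields a reference type (T&) exactly when the expression is an lvalue.

```cpp
#include <type_traits>

// decltype((expr)) is "int&" if expr is an lvalue of type int
// and plain "int" if expr is a temporary value (prvalue).
// The operand of decltype is not evaluated, so x is not modified here.
bool PreIncrementIsLvalue()
{
    int x = 0;
    return std::is_lvalue_reference<decltype((++x))>::value;
}

bool PostIncrementIsLvalue()
{
    int x = 0;
    return std::is_lvalue_reference<decltype((x++))>::value;
}

int IncrementThroughAddress()
{
    int x = 0;

    int* p = &++x;    // valid: ++x names the persistent object x
    *p += 10;         // therefore this changes x itself

    // int* q = &x++; // does not compile: x++ is an unnamed temporary copy

    return x;
}
```

With these hypothetical helpers, PreIncrementIsLvalue() returns true, PostIncrementIsLvalue() returns false, and IncrementThroughAddress() returns 11.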

 

RValue reference

A reference to an rvalue is declared with the && declarator, which looks like a doubled address-of operator. Similar to standard references, or to be more precise lvalue references, you can create rvalue references. They may be used to pass rvalues to functions. Later on we will see how rvalue references allow us to write a move constructor.

As an rvalue is a temporary object which is only valid in a small scope, for example within a single expression, an rvalue reference is a reference to an argument that is about to be destroyed.
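To see how a function can tell both kinds of arguments apart, here is a small sketch with two overloads. The function names are hypothetical and only serve as illustration. The compiler selects the && overload whenever the argument is an rvalue and the const& overload for named objects.

```cpp
#include <string>

// Selected for lvalues: the argument is a named, persistent object.
std::string Describe(const std::string&)
{
    return "lvalue reference";
}

// Selected for rvalues: the argument is a temporary about to be destroyed.
std::string Describe(std::string&&)
{
    return "rvalue reference";
}

std::string DescribeNamed()
{
    std::string a("named");
    return Describe(a);                        // a is an lvalue
}

std::string DescribeTemporary()
{
    return Describe(std::string("temporary")); // unnamed temporary
}

std::string DescribeSum()
{
    std::string a("x");
    std::string b("y");
    return Describe(a + b);                    // a + b creates a temporary
}
```

DescribeNamed() reports the lvalue overload, while DescribeTemporary() and DescribeSum() report the rvalue overload.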

 

Constructor performance

Before we start to think about optimization of constructors by using rvalue references we want to have a look at the main issue of standard copy constructors.

To understand the issue, think about the following: Let’s say you get a folder with a sheet of paper. You shall create a second folder and copy the sheet of paper. What will you do if you get the additional information that the first folder will never be used again and will be thrown away? With this information you don’t have to photocopy the sheet of paper. You just have to move the original one to the new folder.

Exactly the same issue can be found in source code. Looking at the example above we can now say: the most unnecessary copies are those where the source is about to be destroyed.

With this idea in mind, look at the following line of code, where s1 to s3 are strings.

string x = s1 + " " + s2 + " " + s3;

Executing this line of code creates several temporary strings and therefore performs several copy operations. Of course this expression executes within microseconds and may not need any optimization, but it shows a general concept which also applies to large and complex objects. Coming back to the idea from above, we know that we don’t have to create a copy if the source is about to be destroyed.

What does this mean for the string concatenation example? The expression starts by concatenating s1 with a blank (s1 + " "). Here a new temporary string is necessary, because s1 is an lvalue naming a persistent object, so its content has to be copied. In the next step we add s2 to this temporary string. Without move semantics a second temporary string is created, the content of the first one is copied into it and s2 is appended. Afterwards the first temporary string is thrown away. And that’s the issue: we photocopy the sheet of paper and throw the original away. Because the first temporary string, the result of s1 + " ", is an rvalue referring to a temporary object, we can move its content instead of copying it. This is the key concept of move semantics.
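The effect on such a concatenation chain can be made visible with a small experiment. The following sketch does not use std::string directly. It uses a hypothetical wrapper class Text which counts how often its content is copied, and two operator+ overloads which are assumptions made for this illustration: one for an lvalue on the left hand side (copy needed) and one for an rvalue on the left hand side (content can be reused).

```cpp
#include <string>
#include <utility>

// Hypothetical string wrapper which counts its copy operations.
struct Text
{
    std::string value;
    static int copies;

    explicit Text(std::string v) : value(std::move(v)) {}
    Text(const Text& other) : value(other.value) { ++copies; }
    Text(Text&& other) noexcept : value(std::move(other.value)) {}
};

int Text::copies = 0;

// Left operand is an lvalue: it must stay intact, so it is copied first.
Text operator+(const Text& left, const std::string& right)
{
    Text result(left);          // the counted copy
    result.value += right;
    return result;
}

// Left operand is an rvalue: it is about to be destroyed anyway,
// so we append to it directly and move it into the result.
Text operator+(Text&& left, const std::string& right)
{
    left.value += right;
    return std::move(left);
}

// Same shape as the expression in the text: s1 + " " + s2 + " " + s3
Text Join(const Text& s1, const std::string& s2, const std::string& s3)
{
    return s1 + " " + s2 + " " + s3;
}
```

Only the first step, s1 + " ", has to copy, because s1 is an lvalue. All following steps work on temporaries and reuse their content, so Text::copies is 1 after one call of Join. The standard library follows the same idea: since C++11 the operator+ for std::string has overloads taking rvalue references.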

Before we go on and look at move constructors, someone may think: as we can create rvalue references, we have access to the temporary objects. What if we use these references to access the objects later on?

This is a good question. C++ is a language which gives the developer maximum flexibility. Therefore the language itself will not forbid such wrong implementations. Let’s go back to the initial example: your boss told you that the original folder with the sheet of paper is no longer needed and that he will throw it away after you have created the new folder. What if he changes his mind and wants to use the original folder later on? This will not work, as he now has an empty folder. The same happens if you access rvalues after the end of their scope. They may contain invalid content, or they may hold pointers to memory locations which are already used by other objects. Therefore using rvalues after the end of their scope is an implementation error and may result in critical failures.

 

Move constructors

Standard copy constructors may help to reduce the problem of unnecessary copies, but they cannot eliminate it. Move constructors which use rvalue references can help you improve the performance of your applications by eliminating unnecessary memory allocations and copy operations. In general, being able to detect modifiable rvalues allows you to optimize resource handling. If the objects referred to by modifiable rvalues own any resources, you can steal their resources instead of copying them, since they’re going to evaporate anyway.

The following example shows a typical move constructor. The parameter is an rvalue reference to the class. Inside the move constructor you move the resources from the source object to the new object, and you should reset the data pointer of the source object to prevent its destructor from releasing the resources a second time. As the new object has taken over the resources, the new object is responsible for releasing them.

class MyClass
{
public:
  MyClass(MyClass&& source) : mData(nullptr)
  {
    // move data
    mData = source.mData;

    // release source data
    // so the destructor does not free the memory multiple times
    source.mData = nullptr;
  }
private:
  std::vector<int>* mData;
};
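The class above shows only the move constructor. To make the ownership transfer visible, the following sketch adds the missing parts. Everything beyond the move constructor is an assumption made for illustration: the class name Buffer, the allocating constructor, the destructor and the two helper methods are not part of the original example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical owner of a heap allocated vector.
class Buffer
{
public:
    explicit Buffer(std::size_t count)
        : mData(new std::vector<int>(count))
    {
    }

    // Move constructor: take over the pointer and reset the source,
    // so only one object releases the memory.
    Buffer(Buffer&& source) noexcept
        : mData(source.mData)
    {
        source.mData = nullptr;
    }

    // Copying is disabled to keep the ownership rules simple.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    ~Buffer()
    {
        delete mData;   // deleting a null pointer is a no-op
    }

    bool HasData() const { return mData != nullptr; }

    std::size_t Size() const { return mData ? mData->size() : 0; }

private:
    std::vector<int>* mData;
};
```

After “Buffer b(std::move(a));” the object a no longer owns any data and b holds the original vector. When both objects are destroyed, the memory is released exactly once, by b.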

 

To understand the behavior of the move constructor we want to look at a second example. We will now implement the example from above with the folder containing a sheet of paper. So we implement a folder class containing a vector with strings. In case of the move constructor we want to move the resources from one object to the other. Furthermore I have implemented a second constructor which gets the vector as input parameter. This initialization constructor will also use an rvalue reference to the source data.

#include <iostream>
#include <string>
#include <utility>
#include <vector>

class Folder
{
public:
  Folder() {}

  Folder(Folder&& source)
    : mData(std::move(source.mData))
  {    
  }

  Folder(std::vector<std::string>&& data)
    : mData(std::move(data))
  {    
  }

  void ShowSize()
  {
    std::cout << mData.size() << std::endl;
  };

private:
  std::vector<std::string> mData;
};


int _tmain(int argc, _TCHAR* argv[])
{    
  std::vector<std::string> data;
  data.push_back("abc");

  Folder original(std::move(data));
  Folder copy(std::move(original));

  std::cout << data.size() << std::endl;
  original.ShowSize();
  copy.ShowSize();
  
  return 0;
}

 

If we execute the application, the output shows us the size of the different vectors. The initial vector and the one inside the first folder are empty, and only the new folder contains the resources. That’s because the move constructor of the vector moves the resources and leaves the source vector empty.

Within the initializer lists of the constructors and in the constructor calls you will find a new function not explained so far: std::move. So next we look at this function.

 

std::move

The function std::move enables you to create an rvalue reference to an existing object. Alternatively, you can use the static_cast keyword to cast an lvalue to an rvalue reference: static_cast<T&&>(mySourceObject), where T is the type of mySourceObject.

But why did we have to use this function? Let us start with the constructor call from the example above: Folder copy(std::move(original));

The original object we want to copy is an lvalue. If we pass this object directly as the parameter, overload resolution looks for a copy constructor, not for the move constructor. (For our Folder class the call would not even compile, because declaring a move constructor suppresses the implicitly generated copy constructor.) By using the move function we get an rvalue reference to the object and can pass it to the move constructor. Within the constructor we initialize the vector, and here we have to follow the same principle. The parameter has a name, so inside the constructor it is an lvalue, and passing it directly would create a copy of the vector. But if we convert it to an rvalue reference, the move constructor of the vector is called.
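The selection between copy and move constructor can be observed with a small test class. This is a sketch under stated assumptions: the class Probe and all function names are hypothetical and only record which constructor was used.

```cpp
#include <string>
#include <utility>

// Hypothetical class which remembers how it was constructed.
struct Probe
{
    std::string origin;

    Probe() : origin("default") {}
    Probe(const Probe&) : origin("copy") {}
    Probe(Probe&&) noexcept : origin("move") {}
};

std::string FromLvalue()
{
    Probe source;
    Probe target(source);                       // lvalue: copy constructor
    return target.origin;
}

std::string FromStdMove()
{
    Probe source;
    Probe target(std::move(source));            // rvalue: move constructor
    return target.origin;
}

std::string FromStaticCast()
{
    Probe source;
    Probe target(static_cast<Probe&&>(source)); // same effect as std::move
    return target.origin;
}

// A named rvalue reference parameter is itself an lvalue.
std::string PassOnWithoutMove(Probe&& parameter)
{
    Probe target(parameter);                    // copy constructor
    return target.origin;
}

std::string PassOnWithMove(Probe&& parameter)
{
    Probe target(std::move(parameter));         // move constructor
    return target.origin;
}
```

FromLvalue() and PassOnWithoutMove(Probe()) report "copy", the other three functions report "move". The second pair shows why std::move is needed again inside the Folder constructors: without it the vector would be copied although the caller passed an rvalue.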

 

Summary

Understanding the concept of rvalues and rvalue references will allow you to create and use move constructors. These move constructors can help you improve the performance of your applications by eliminating the need for unnecessary memory allocations and copy operations.

Posted in C++

Linq vs Loop: Nested Loop

As in the previous article of this series, I want to compare Linq with a classical loop. This time we look at the use case of handling data which contains nested data. Again we use two kinds of input: validated clean data, and raw data which may include null references.

Use case one: clean data
Let’s start with the first use case. We have a simple data class for a person and a data class for a person group. The person group contains a list of persons, so we can create a data structure with nested data. Out of a list of person groups we want to find a specific person by name. To keep it simple we just look for the first person with the name. If the list does not contain the data we are looking for, we shall return a default person object.

The data classes are defined as follows:

public class Person
{
    public string Name { get; set; }
    public uint Age { get; set; }

    public static readonly Person Default = new Person()
    {
        Name = "new person",
        Age = 0
    };
}

public class PersonGroup
{
    public string GroupName { get; set; }
    public List<Person> Persons { get; set; }
}

 

Our demo console application creates a list of data and calls the data query method, first with an existing person and then with a nonexistent one.

List<Person> persons;
List<PersonGroup> groups = new List<PersonGroup>();

persons = new List<Person>();
persons.Add(new Person() { Name = "John Doe", Age = 35 });
persons.Add(new Person() { Name = "John Foe", Age = 47 });
groups.Add(new PersonGroup() { GroupName = "male", Persons = persons });

persons = new List<Person>();
persons.Add(new Person() { Name = "Jane Doe", Age = 41 });
groups.Add(new PersonGroup() { GroupName = "female", Persons = persons });

//---------

Person person;

//search existing person
person = FindPerson(groups, "Jane Doe");
Console.WriteLine("Name: " + person.Name);

//search not existing person
person = FindPerson(groups, "???");
Console.WriteLine("Name: " + person.Name);

Console.ReadKey();

The data query method shall be implemented twice: by using a loop and by using Linq. We start with the classical loop:

static private Person FindPerson(List<PersonGroup> groups, string name)
{
    foreach (PersonGroup group in groups)
    {
        foreach (Person person in group.Persons)
        {
            if (string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase))
            {
                return person;
            }
        }
    }

    return Person.Default;
}

And we implement the same function by using Linq.

static private Person FindPerson(List<PersonGroup> groups, string name)
{
    var result = from personGroup in groups
                    from person in personGroup.Persons
                    where string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase)
                    select person;

    return result
        .DefaultIfEmpty<Person>(Person.Default)
        .First<Person>();
}

Code Review for use case one

The loop method is implemented with a simple nested loop containing the data comparison. The source code is clean and easy to understand. The same can be said about the Linq implementation. The only difficult part is the “DefaultIfEmpty” statement. If the developer adds a little comment explaining why the “DefaultIfEmpty” call is done, the Linq query is easy to understand too, and I don’t prefer either of the two implementations.

Use case two: dirty data

The second use case adds an important requirement: the query must be robust. The data may, for example, contain null values. As in the first use case, the method shall return a default person if the one we are looking for is not found. Null values or an uninitialized list shall not throw an error. In these cases the default person shall be returned as well.

In our test console application we create some dirty data. And we add additional tests to call the function with the data or even with null parameters.

List<Person> persons;
List<PersonGroup> groups = new List<PersonGroup>();

persons = new List<Person>();
persons.Add(new Person() { Name = "John Doe", Age = 35 });
persons.Add(new Person() { Name = "John Foe", Age = 47 });
groups.Add(new PersonGroup() { GroupName = "male", Persons = persons });

groups.Add(null);
groups.Add(new PersonGroup() { GroupName = "female", Persons = null });

persons = new List<Person>();
persons.Add(null);
persons.Add(new Person() { Name = null, Age = 41 });
persons.Add(new Person() { Name = "Jane Doe", Age = 41 });
groups.Add(new PersonGroup() { GroupName = "female", Persons = persons });

//---------
Person person;

//search existing person
person = FindPerson(groups, "Jane Doe");
Console.WriteLine("Name: " + person.Name);

//search not existing person
person = FindPerson(groups, "???");
Console.WriteLine("Name: " + person.Name);

//search in a list which is not yet initialized
person = FindPerson(null, "???");
Console.WriteLine("Name: " + person.Name);

Console.ReadKey();

The query using the loop must be adapted to handle all these special cases. The following source code shows a corresponding implementation. The list and the list content are checked for null values.

static private Person FindPerson(List<PersonGroup> groups, string name)
{
    if (groups == null)
    {
        return Person.Default;
    }

    foreach (PersonGroup group in groups)
    {
        if ((group == null) ||
            (group.Persons == null))
        {
            continue;
        }

        foreach (Person person in group.Persons)
        {
            if (person == null)
            {
                continue;
            }

            if (string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase))
            {
                return person;
            }
        }
    }

    return Person.Default;
}

The implementation of the Linq query must be adapted too. A check of the whole list, as well as of each single element, is added.

static private Person FindPerson(List<PersonGroup> groups, string name)
{
    if (groups == null)
    {
        return Person.Default;
    }

    var result = from personGroup in groups
                    where personGroup != null
                    where personGroup.Persons != null
                    from person in personGroup.Persons
                    where person != null
                    where string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase)
                    select person;

    return result
        .DefaultIfEmpty<Person>(Person.Default)
        .First<Person>();
}

Code Review for use case two

The nested loop gets more complex as the different special cases must be handled. This adds the need for additional if-statements. In such a case you may think about extracting the inner loop and moving it to its own method. With such an additional method the code is very easy to understand, even with the additional if-statements.

The Linq implementation uses a query containing an inner query. To handle all the special cases with uninitialized data, some additional where-statements were added. This expands the query a little bit, but it stays understandable.

In this case I like the Linq implementation a little bit more than the nested loop. But on the other hand the Linq query has a major disadvantage: finding an error is difficult and time consuming. You may try this by removing one or several of the where-statements which look for invalid data.

Posted in .NET, C#, Clean Code, LINQ

Design patterns: Composite

The composite design pattern is used to compose objects into a tree structure. Each element can hold a list of sub elements. Optionally it may have a link to its parent object within the tree.

Within the following example I want to create a generic base class to implement the behavior of a tree node. This base class can be used for any object to create a tree structure.

At first we define the interface for the tree node.

public interface ITreeNode<T>
{
    T AddChild(ITreeNode<T> child);
    void RemoveChild(ITreeNode<T> child);

    List<ITreeNode<T>> Children { get; }
}

Next we implement the tree node object.

public class TreeNode<T> : ITreeNode<T>
{
    private List<ITreeNode<T>> _children = new List<ITreeNode<T>>();

    public List<ITreeNode<T>> Children
    {
        get
        {
            return _children;
        }
    }

    public T AddChild(ITreeNode<T> child)
    {
        _children.Add(child);

        return (T)child;
    }

    public void RemoveChild(ITreeNode<T> child)
    {
        _children.Remove(child);
    }
}

Now let’s say you want to create a tree of visual elements. To keep it simple our visual element is a shape which has a name property only. By using the generic tree node you can simply create a shape tree node.

public class Shape : TreeNode<Shape>
{
    public Shape(string name)
    {
        Name = name;
    }

    public string Name { get; set; }
}

Finally we create a console application to test our nice shape object. Within this application we create a complex shape containing a tree structure of sub shapes. And we create a recursively executed method to show our visual tree.

static void Main(string[] args)
{
    Shape root = new Shape("Drawing");

    root.AddChild(new Shape("Circle"));
    Shape rectangle = root.AddChild(new Shape("Rectangle"));
    root.AddChild(new Shape("Line"));

    rectangle.AddChild(new Shape("Dotted Line"));
    rectangle.AddChild(new Shape("Triangle"));

    Display(root);

    Console.ReadKey();
}

private static void Display(Shape rootShape)
{
    Display(rootShape, 1);
}

private static void Display(Shape shape, int level)
{
    if (level > 1)
    {
        Console.Write(new string(' ', (level - 2) * 2));
        Console.Write("> ");
    }
    Console.WriteLine(shape.Name);

    foreach (Shape child in shape.Children)
    {
        Display(child, level + 1);
    }
}
Posted in .NET, C#, Design Pattern

Crash course: Asynchronous Programming with async and await

With the .NET Framework 4.5 the new keywords “async” and “await” were introduced. These keywords allow asynchronous programming which is nearly as easy as synchronous programming. With this article I want to give you a crash course on this topic.

Let’s start with a little example. The following source code shows a WPF application with a button click event. Within this event an asynchronous method call is executed.

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
    }

    private async void Button_Click(object sender, RoutedEventArgs e)
    {
        DoSomething(1);

        await DoSomethingAsync(2);

        DoSomething(3);
    }

    private void DoSomething(int x)
    {
    }

    private async Task DoSomethingAsync(int x)
    {
        var task = Task.Run(() => {});

        await task;
    }
}

What happens within the button click method? At first the method “DoSomething(1)” is executed. As the execution is synchronous, the UI is blocked during this time. Next the method “DoSomethingAsync(2)” is executed. With the “await” keyword we now return control to the caller, in this case the UI thread. The method “DoSomethingAsync(2)” is executed asynchronously. As soon as this method has finished, the remaining part of the calling method is executed, and so “DoSomething(3)” is executed.

This little example shows the nice implementation concept of async+await. You can implement your method in the normal synchronous style but by adding the new keywords the execution flow changes to asynchronous.

To create an asynchronous function you have to add the “async” keyword and change the return value to a task. In case your method returns void the return type is Task without a type parameter, otherwise it is Task<T> where T is the type of your return value. The caller can use the returned Task to check its status. Sometimes you may also see void as return value. But this is used for fire-and-forget event handlers only, like the button click event in the above example.

The async and await keywords will not start a task on their own. Therefore your method is executed asynchronously only if the called method creates a task and awaits its end. You can see this in the example in method “DoSomethingAsync(2)”. A task is created and started, and with the await keyword we wait for the end of the task.

As mentioned above, the big advantage of this implementation pattern is the high similarity of synchronous and asynchronous code. This increases the readability of the otherwise difficult asynchronous method calls. Furthermore it allows you to change your existing applications from synchronous to asynchronous behavior without the need for extensive refactoring. Of course, the async await pattern offers some other nice features, like an easy aggregated error handling, but these are beyond the scope of this crash course. Instead, I want to finish this article with an example of how to change existing synchronous code to asynchronous code.

The following source code shows an example of a simple user interface with a button and a text box. On button click a long running calculation is done and the result is shown within a text box.

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
    }

    private void Button_Calculate_Click(object sender, RoutedEventArgs e)
    {
        Calculator calculator = new Calculator();
        int result;

        result = calculator.Calculate(5, 7);

        TextBox_Result.Text = result.ToString();
    }
}

public class Calculator
{
    public int Calculate(int a, int b)
    {
        Thread.Sleep(TimeSpan.FromSeconds(3));

        return a + b;
    }
}

During the execution of the calculation the user interface is blocked. Therefore we want to change the implementation to the asynchronous pattern. This can be done with some minor modifications. The long running part is executed within a task. The async and await keywords are added to the calculation method and the return value is changed to Task<int>. Furthermore the async and await keywords are added to the event method too. The following source code shows the modified application.

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
    }

    private async void Button_Calculate_Click(object sender, RoutedEventArgs e)
    {
        Calculator calculator = new Calculator();
        int result;

        result = await calculator.CalculateAsync(5, 7);

        TextBox_Result.Text = result.ToString();
    }
}

public class Calculator
{
    public async Task<int> CalculateAsync(int a, int b)
    {
        var task = Task.Run(() =>
        {
            Thread.Sleep(TimeSpan.FromSeconds(3));

            return a + b;
        });

        return await task;
    }
}

Now the calculation is executed asynchronously and the user interface stays responsive. All the task management magic is done by the compiler. The source code modifications were marginal and the resulting source code is nearly identical to the original source code.

Furthermore this example shows another nice feature of the async await pattern. Normally, if you execute code in a separate task, this task cannot update user interface elements directly. But in this scenario it works. This nice behavior is possible because the code after the await statement is executed in the original context, here the UI thread. This default behavior is mainly needed in user interface code and can be disabled in all other cases with the task method “ConfigureAwait(false)”.


Clean Tuples

Depending on the preferences of a developer you may sometimes find a lot of tuples within the source code. Whether this is good or bad practice is another discussion which is not part of this article. Therefore I don’t want to start a discussion whether you shall use classes, structs or tuples. I want to have a look at tuples from the “clean code” point of view.

In my opinion, tuples have one major disadvantage: The code to access tuples is not easy to read. While you are writing your code, tuples are very easy to use. But when you have to read, understand and maintain your code after a while, or source code which was written by other developers, tuples can become a mess. The following source code shows a simple example. We define a tuple to store information about a person, we create a person tuple in some function and we access the elements of the tuple.

#include <iostream>
#include <string>
#include <tuple>
#include <tchar.h>

typedef std::tuple<std::string, std::string, int> Person;

Person CreatePerson()
{
  return std::make_tuple("John", "Doe", 35);
}

int _tmain(int argc, _TCHAR* argv[])
{
  Person person = CreatePerson();

  int age = std::get<2>(person);

  std::cout << age << std::endl;

  return 0;
}

This simple example is easy to read. You will understand the source code immediately and you are able to maintain the code. But I think that’s only possible because the whole source code fits on one screen. What will happen if you use such tuples in a bigger context? The type definition will be placed in one file, the function to create the tuple in another file and the function which accesses the tuple elements in a third file. In such a case you will lose the big picture and you need additional time to collect all the information. You can no longer just read the code, you have to spend time exploring the code to find all needed information.

Have a look at the function to create the tuple. Within this function you have to know that the first parameter is the first name and the second parameter is the last name of the person. Therefore such source code is error-prone. And on the other hand, if you want to use the tuple, you access the elements by an index. In case of the age parameter there is no issue as the used type is unique within the tuple. But if you want to access the first or last name of the person, you will find the same issue as before. Accessing by an index may lead to errors.

As a result of these issues we will think about a possible solution. How can we create clean source code? We found two issues: creating the tuple and accessing the elements in a clear way. I think a very easy possibility is to replace the meaningless index number with a self-describing enumerator. By using an index enumerator you will be able to set and get the elements of a tuple by using their name. The following source code shows the adapted example.
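The index problem can be made concrete with a small sketch. The helper functions GetLastNameMistaken, GetLastName and CheckHelpers are hypothetical and only serve as illustration. Because both name elements have the type std::string, the compiler accepts either index and the mistake stays unnoticed until runtime.

```cpp
#include <cassert>
#include <string>
#include <tuple>

typedef std::tuple<std::string, std::string, int> Person;

// Intended order of the elements: first name, last name, age.
Person CreatePerson()
{
  return std::make_tuple("John", "Doe", 35);
}

// Hypothetical helper containing a mistake: index 0 is the first name.
// The compiler cannot detect this, because both elements are std::string.
std::string GetLastNameMistaken(const Person& person)
{
  return std::get<0>(person);
}

// Hypothetical helper using the correct index for the last name.
std::string GetLastName(const Person& person)
{
  return std::get<1>(person);
}

// Both helpers compile, but only one of them returns the last name.
void CheckHelpers()
{
  Person person = CreatePerson();

  assert(GetLastNameMistaken(person) == "John");
  assert(GetLastName(person) == "Doe");
}
```

Swapping the two string literals inside CreatePerson would compile just as silently, which is the same issue on the creation side.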

#include <iostream>
#include <string>
#include <tuple>
#include <tchar.h>

typedef std::tuple<std::string, std::string, int> Person;

enum PersonIndex
{
  FirstName,
  LastName,
  Age
};

Person CreatePerson()
{
  Person person = std::make_tuple("", "", 0);

  std::get<FirstName>(person) = "John";
  std::get<LastName>(person) = "Doe";
  std::get<Age>(person) = 35;

  return person;
}

int _tmain(int argc, _TCHAR* argv[])
{  
  Person person = CreatePerson();

  int age = std::get<Age>(person);

  std::cout << age << std::endl;

  return 0;
}

This little modification results in a lot of advantages. Your type definition is followed by the corresponding index enumerator which explains the tuple. So you don’t have to explain the tuple with a comment. The creation of the tuple can now be done with an explicit assignment of the parameter name and its value. And the elements are also accessed by using the parameter name. The resulting source code is now very easy to understand. You can read the code and will understand it immediately without the need to look for more information in other files. For example, it is no longer necessary to look for the type definition and an explanation of the elements. By using the index enumerator you can create “clean” tuples.
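A plain enum places the names FirstName, LastName and Age in the enclosing scope, where they may clash with other identifiers. The following is a minimal sketch of a variant with a scoped enumeration, assuming a C++11 compiler. The helper template Get and the function CheckPerson are hypothetical and not part of the standard library. Get only converts the enumerator to the index expected by std::get, because a scoped enumerator does not convert implicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

typedef std::tuple<std::string, std::string, int> Person;

// Scoped enumeration: the names do not leak into the enclosing scope.
enum class PersonIndex : std::size_t
{
  FirstName,
  LastName,
  Age
};

// Hypothetical helper: converts the enumerator to the index used by std::get
// and returns a reference to the selected element.
template <PersonIndex Index>
typename std::tuple_element<static_cast<std::size_t>(Index), Person>::type&
Get(Person& person)
{
  return std::get<static_cast<std::size_t>(Index)>(person);
}

Person CreatePerson()
{
  Person person = std::make_tuple("", "", 0);

  Get<PersonIndex::FirstName>(person) = "John";
  Get<PersonIndex::LastName>(person) = "Doe";
  Get<PersonIndex::Age>(person) = 35;

  return person;
}

// Usage: read the elements back by name.
void CheckPerson()
{
  Person person = CreatePerson();

  assert(Get<PersonIndex::LastName>(person) == "Doe");
  assert(Get<PersonIndex::Age>(person) == 35);
}
```

The price for the cleaner scope is the extra helper. Whether that is worth it depends on how many tuple types and enumerator names share one scope.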


Design patterns: Singleton

By using the Singleton design pattern you ensure that a class has only one instance. Access to this instance is provided by the class itself, so it maintains its own unique instance.

The .NET framework offers a very simple and thread-safe way to create a Singleton. The following source code shows an example of how to create a singleton instance for logging purposes.

At first we define an interface.

public interface ILogger
{
    void Log(string message);
}

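Second we can implement the Singleton object. The class instance is stored in a static read-only field and provided by a property. The private constructor prevents other classes from creating further instances. The .NET runtime ensures that the static field is initialized only once and in a thread-safe way, no later than the first access to the class.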

public class Logger : ILogger
{
    private static readonly Logger _instance = new Logger();

    private Logger()
    {
    }
           
    public static Logger Instance
    {
        get
        {
            return _instance;
        }
    }            

    public void Log(string message)
    {
        throw new NotImplementedException();
    }        
}

And the last code example shows how to use the Singleton within a demo application.

static void Main(string[] args)
{
    ILogger logger;

    logger = Logger.Instance;

    logger.Log("foo");
}