ctor types in C++

In C++ you will find several ways to initialize an object instance. For example, think about a class “MyClass” which can be constructed with a parameter. The object initialization can be done in several ways:

  • MyClass x{y};
  • MyClass x(y);
  • MyClass x = y;
  • MyClass x = {y};

But which one should be used? Do they all call the same constructor (ctor) or do these initializations lead to different results? Within the next two articles I want to examine these questions. The first article will show the different types of constructors and the second article will show the ways to initialize object instances.

Example object

For this article we want to create a simple class. An important task of a class is resource management. So, the example class will contain a dynamically allocated memory resource which is created by the different ctor types and released by the destructor (dtor).

Default ctor

Let’s start with the default ctor and the dtor. The default ctor does not have any parameters and is used to initialize the class’s internal members.

#include "stdafx.h"
#include <iostream>

class MyClass
{
public:
    MyClass();      // default ctor
    ~MyClass();     // dtor

private:
    int mSize;
    int* mElements;
};

MyClass::MyClass()
    : mSize(0)
    , mElements(nullptr)
{
    std::cout << "default ctor" << std::endl;
}

MyClass::~MyClass()
{
    if (mElements)
    {
        delete[] mElements;
        mElements = nullptr;
    }
}

int main()
{
    MyClass test1;      // default ctor

    return 0;
}

Parameterized ctor

If we want to initialize the internal members with variable values, we can use a parameterized ctor. For example, we can add a parameterized ctor to set the initial size of the data container.

#include "stdafx.h"
#include <iostream>

class MyClass
{
public:
    MyClass();      // default ctor
    ~MyClass();     // dtor

    MyClass(const int size);        // parameterized ctor

private:
    int mSize;
    int* mElements;
};

MyClass::MyClass(const int size)
    : mSize(size)
    , mElements(mSize ? new int[mSize]() : nullptr)
{
    std::cout << "parameterized ctor" << std::endl;
}

Copy ctor and copy assignment operator

Another often-needed functionality is to create a copy of an existing object. This can be done by using a copy constructor. Furthermore, a copy assignment operator should be provided, as a developer may use both ways to copy an object: copy it during creation or copy it by an assignment.

#include "stdafx.h"
#include <iostream>
#include <algorithm>

class MyClass
{
public:
    MyClass();      // default ctor
    ~MyClass();     // dtor

    MyClass(const int size);        // parameterized ctor

    MyClass(const MyClass& obj);        // copy ctor
    MyClass& operator=(const MyClass& obj);     // copy assignment operator

private:
    int mSize;
    int* mElements;
};

MyClass::MyClass(const MyClass& obj)
    : mSize(obj.mSize)
    , mElements(mSize ? new int[mSize] : nullptr)
{
    std::cout << "copy ctor" << std::endl;

    // create deep copy
    std::copy(obj.mElements, obj.mElements + mSize, stdext::make_checked_array_iterator(mElements, mSize));
}

MyClass& MyClass::operator=(const MyClass& obj)
{
    std::cout << "copy assignment operator" << std::endl;

    // self-assignment detection
    if (this == &obj)
    {
        return *this;
    }

    // release resources
    if (mElements)
    {
        delete[] mElements;
        mElements = nullptr;
    }

    // create deep copy
    mSize = obj.mSize;
    mElements = mSize ? new int[mSize]() : nullptr;

    std::copy(obj.mElements, obj.mElements + mSize, stdext::make_checked_array_iterator(mElements, mSize));

    return *this;
}

As you can see, things become more difficult now. If we copy an object we must pay attention to several things. First, we should decide whether we want to create a deep or a shallow copy. Next, we must think about the resource management. This can be seen in the implementation of the copy assignment operator: it contains a self-assignment check, and prior to creating the new resources it releases the existing ones.

Move ctor and move assignment operator

The move ctor and move assignment operator should create a copy too. But in contrast to the copy operations we get an rvalue reference as parameter and know that the source object is only a temporary object and will no longer be used. This allows a more efficient resource management: as the source object is no longer used, we can steal its resources instead of creating new ones.

#include "stdafx.h"
#include <iostream>
#include <algorithm>

class MyClass
{
public:
    MyClass();      // default ctor
    ~MyClass();     // dtor

    MyClass(const int size);        // parameterized ctor

    MyClass(const MyClass& obj);        // copy ctor
    MyClass& operator=(const MyClass& obj);     // copy assignment operator

    MyClass(MyClass&& obj);     // move ctor
    MyClass& operator=(MyClass&& obj);      // move assignment operator

private:
    int mSize;
    int* mElements;
};

MyClass::MyClass(MyClass&& obj)
{
    std::cout << "move ctor" << std::endl;

    // steal content of other object
    mSize = obj.mSize;
    mElements = obj.mElements;

    // release content of other object
    obj.mSize = 0;
    obj.mElements = nullptr;
}

MyClass& MyClass::operator=(MyClass&& obj)
{
    std::cout << "move assignment operator" << std::endl;

    // self-assignment detection
    if (this == &obj)
    {
        return *this;
    }

    // release resources
    if (mElements)
    {
        delete[] mElements;
        mElements = nullptr;
    }

    // steal content of other object
    mSize = obj.mSize;
    mElements = obj.mElements;

    // release content of other object
    obj.mSize = 0;
    obj.mElements = nullptr;

    return *this;
}

Again, we add a self-assignment detection and release the old resources. Furthermore, we have to reset the resources of the source object after we stole them. This prevents the source object’s dtor from releasing these resources.

Copy & swap idiom

The copy ctor and the copy assignment operator as well as the move ctor and the move assignment operator contain some duplicate source code. There is a common implementation technique which addresses this issue: the copy & swap idiom. This technique comes with the advantage of removing the duplicate code, but it has some disadvantages too. Within this article I don’t want to explain the copy & swap idiom in detail because it is a complex topic of its own, but you should keep in mind that this idiom exists.

Initializer-list

Another important ctor type, especially for container-like objects, is a ctor with an initializer list. This allows passing a list of values which is used to initialize the container class.

#include "stdafx.h"
#include <iostream>
#include <algorithm>
#include <initializer_list>

class MyClass
{
public:
    MyClass();      // default ctor
    ~MyClass();     // dtor

    MyClass(const int size);        // parameterized ctor

    MyClass(const MyClass& obj);        // copy ctor
    MyClass& operator=(const MyClass& obj);     // copy assignment operator

    MyClass(MyClass&& obj);     // move ctor
    MyClass& operator=(MyClass&& obj);      // move assignment operator

    MyClass(const std::initializer_list<int>& list);    // initializer_list ctor

private:
    int mSize;
    int* mElements;
};

MyClass::MyClass(const std::initializer_list<int>& list)
    : mSize(static_cast<int>(list.size()))
    , mElements(mSize ? new int[mSize]() : nullptr)
{
    std::cout << "initializer_list ctor" << std::endl;

    if (list.size())
    {
        std::copy(list.begin(), list.end(), stdext::make_checked_array_iterator(mElements, mSize));
    }
}

Use the different ctor types

The following source code shows an example console application which creates object instances. Depending on the given parameters, the corresponding ctor is called.

int main()
{
    MyClass test1;      // default ctor

    MyClass test2(7);       // parameterized ctor

    MyClass test3(test1);       // copy ctor
    test3 = test1;              // copy assignment operator

    MyClass test4(std::move(test1));    // move ctor
    test4 = std::move(test2);           // move assignment operator

    MyClass test5(7.1);     // parameterized ctor, warning: conversion from double to int

    MyClass test6{ 1, 2, 3, 4, 5 };     // initializer_list ctor

    return 0;
}

Implicit conversion ctor vs. explicit ctor

So far, we have implemented a couple of ctors for MyClass. With these ctors in mind, do you think the following line of code will construct an object instance or will it produce an error? If an object instance is created, which ctor is used?

MyClass test = 7;

This line of code will create a MyClass instance by calling the parameterized ctor. The parameterized ctor is also called a “conversion” ctor because it allows implicit type conversion. In our case the MyClass instance is created based on an int value, so you can say the int value is implicitly converted to a MyClass by calling the corresponding parameterized ctor.

But maybe you don’t want to support such an implicit conversion. That may have several reasons. In my opinion such an implicit conversion looks a little bit strange, and there are a lot of developers who don’t know the technical background of this line of code. Most will know that an object is created, but some don’t know which ctor is used or whether it is a combination of ctor and assignment. This uncertainty, and side effects on code changes, may result in errors too. Therefore, you may want to prevent implicit conversion for some kinds of classes. In this case you have the possibility to declare the parameterized ctor as explicit. If you do so, the ctor can no longer be used for implicit conversion. The following source code shows a corresponding example.

#include "stdafx.h"
#include <iostream>

class MyClass1
{
public:
    MyClass1() { std::cout << "default ctor" << std::endl; }
    MyClass1(const int size) { std::cout << "parameterized ctor" << std::endl; }
    MyClass1(const MyClass1& obj) { std::cout << "copy ctor" << std::endl; }
};

class MyClass2
{
public:
    explicit MyClass2() { std::cout << "default ctor" << std::endl; }
    explicit MyClass2(const int size) { std::cout << "parameterized ctor" << std::endl; }
    explicit MyClass2(const MyClass2& obj) { std::cout << "copy ctor" << std::endl; }
};

int main()
{
    MyClass1 test1 = 7;         // OK; calls parameterized ctor
    MyClass1 test2 = 7.1;       // OK; calls parameterized ctor, warning: conversion from double to int

    // MyClass2 test3 = 7;      // ERROR; implicit conversion from int to MyClass2 is not allowed
    // MyClass2 test4 = 7.1;    // ERROR; implicit conversion from double to MyClass2 is not allowed
    MyClass2 test6 = MyClass2(7);       // OK; calls parameterized ctor
    MyClass2 test7 = MyClass2(7.1);     // OK; calls parameterized ctor, warning: conversion from double to int

    return 0;
}

MyClass2 offers explicit ctors only. Therefore, the object instantiation cannot be done by using implicit conversion. Instead, the parameterized ctor must be used explicitly. This may result in cleaner source code and may prevent errors.

Summary and outlook

Within this article we have seen the different types of constructors and got an introduction to how to use them, for example to manage the resources needed by the object instance. Within the next article we will see the different ways to create an object and which ctor is used in which situation.


Discard of return values and out parameters in C# 7

C# 7 allows discarding return values and out parameters. The underscore character is used as a wildcard for these unneeded values.

The following source code shows an example of a function call where the caller wants to ignore the return value. Of course, in this case an assignment is not needed at all, but it is possible by using the wildcard character.

static void Main(string[] args)
{
  DoSomething();
  _ = DoSomething();
}

private static int DoSomething()
{
  return 1;
}

Let’s stop for a moment and think about the example. This simple example may raise some questions:

  • Which of the two syntax styles should be preferred?
  • Do we have any disadvantages if we use the short syntax?
  • Is it fine to ignore a return value at all or should we always assign and analyze return values?

If you read source code which contains the short syntax “DoSomething()” for a function call with an ignored return value, you will miss a very important fact: the function has a return value! Ignoring the return value may lead to errors. So, this code is a source of errors, and the error is hidden, as the reader of the code must explicitly look at the interface to see whether the function has a return value.

The syntax with the discard character explicitly contains the information that the function has a return value which is ignored. Of course, there is still an open question: why do we ignore the return value? And as usual we should explain the “why” with a comment in the source code.

So, the second syntax is a little bit longer, but it provides important information. The first syntax instead hides information and is a source of errors. It would be nice to get a compiler warning in such cases.

Discard ternary operator result

Let’s look at a second example where the compiler actually prevents ignoring the return value: the ternary operator. If you try to use the ternary operator without assigning the result, you will get a compiler error.

static void Main(string[] args)
{
  bool x = true;

  int y = (x == true) ? DoSomething() : DoSomethingOther(); //OK

  (x == true) ? DoSomething() : DoSomethingOther(); // compiler error

  _ = (x == true) ? DoSomething() : DoSomethingOther();   // OK
}

private static int DoSomething()
{
  return 1;
}

private static int DoSomethingOther()
{
  return 2;
}

Of course, the new discard wildcard can be used in this case too. But again, if we read such source code, we should ask why the return value is ignored.

Discard out parameter

Besides the return value, a function may have out parameters. As out parameters are nothing else than additional return values, we can use the wildcard character for these parameters too.

static void Main(string[] args)
{
  DoSomething(out int _);
}

private static void DoSomething(out int x)
{
  x = 1;
}

Discard tuple elements

If the return value of a function is a tuple, we can discard the whole return value or parts of the tuple.

static void Main(string[] args)
{
  _ = DoSomething();
  (int result, _) = DoSomething();
}

private static (int result, int errorCode) DoSomething()
{
  return (5, 0);
}

Summary

It is possible to ignore return values or out parameters by using the discard character. But in my opinion, this is bad coding style. Ignoring these values means either the interface is strange, as it contains unnecessary elements, or the user of the interface mistakenly ignores important elements. Therefore, the discard character should be used in exceptional cases only, and a clarifying comment is mandatory which explains why the interface element can be ignored.


Tuples in C# 7 (ValueTuple)

Within this article I want to give a short overview of the new tuple features in C# 7. So far, we already had a Tuple class in C#. But with the new one, some technical details have changed and some syntactic sugar was added. This new tuple syntax makes the source code more readable and perfectly integrates the technical advantages of tuples into the language. But of course, from a software design point of view, tuples still have some common disadvantages too. Therefore, within this article I want to introduce the new tuple and its syntax, show the technical background, mention the pros and cons, and give hints when to use and when to avoid tuples.

Tuple syntax

Let’s start with a base example. We want to implement a function which may fail for several reasons. As we expect such failures, the incomplete execution is a normal case which we want to evaluate in order to take appropriate actions, like repeating the method call. So, we don’t want to throw exceptions but return an error code instead. Besides this error code, the function returns a result. So, the function has two outputs: the function result and the error code as additional execution information. A common design pattern is to use an out parameter for the additional information.

static void Main(string[] args)
{
  int errorCode;
  var x = DoSomething(out errorCode);

  if (errorCode == 0)
  {
    double y = x * 5;
  }
}

private static double DoSomething(out int errorCode)
{
  errorCode = 0;
  return 4.2;
}


This common design pattern has one major disadvantage: it creates a complex data flow. The standard design of a function has a straightforward data flow: you have one or more input parameters which are passed on method call, and you have one function result which will be assigned to a result variable. With output parameters you create a data flow which is way more complex. Wouldn’t it be nice to return all outputs as the function result and go back to the straightforward data flow? This can be done with tuples. The following source code shows the adapted example.

static void Main(string[] args)
{
  var x = DoSomething();

  if (x.Item1 == 0)
  {
    double y = x.Item2 * 5;
  }
}

private static (double, int) DoSomething()
{
  return (4.2, 0);
}

Within this source code the new tuple syntax is used, so we don’t have to explicitly define a tuple instance. On function declaration we can define several return values and the compiler will create a corresponding tuple. If we call the function, we can assign the result to a variable and access the tuple elements by the fields “Item1” to “ItemX”.

This is a nice first step in the right direction. But it is bothersome to access the tuple members by the generic names “Item1” to “ItemX”. Of course, the new tuple syntax takes care of this aspect and allows naming the return values.

static void Main(string[] args)
{
  var x = DoSomething();

  if (x.errorCode == 0)
  {
    double y = x.result * 5;
  }
}

private static (double result, int errorCode) DoSomething()
{
  return (4.2, 0);
}

These named fields are a huge improvement for the code quality and greatly increase the readability. But there is still one aspect which bothers me. On function declaration I don’t have to implement a tuple but just write down a list of return values. But on function call I must rethink and use a tuple which holds the return values as fields. Wouldn’t it be nice to use a value list on function call too? Of course it would, and fortunately this is supported by the new syntax.

static void Main(string[] args)
{
  (double result, int errorCode) = DoSomething();

  if (errorCode == 0)
  {
    double y = result * 5;
  }
}

private static (double result, int errorCode) DoSomething()
{
  return (4.2, 0);
}

This final version of the function implementation shows the big strength of the new syntax. We can implement a function with several return values and a straightforward data flow. Furthermore, the readability of the source code is highly increased, as the technical details behind this concept are hidden and the source code is focused on functionality only.

Deconstruction of Tuples

As we have seen, on function call we can assign the several function return values directly to corresponding variables. This is possible because the returned tuple is deconstructed and the tuple fields get assigned to the variables. Within a previous article you can find an introduction to the deconstruction feature, which was also introduced with C# 7.

As mentioned in the linked article, this deconstruction comes with a big issue: you can easily mix up return values if they have the same type. Within this example the compiler will show an error if you switch the variables, because they have different types. But with same types you may run into this issue. Later on, we will think about the suitable use cases for tuples and evaluate this issue within the context of these use cases.

Discard of Tuple parameters

C# 7 allows discarding return values and out parameters. The underscore character is used as a wildcard for these unneeded values.

The following source code shows how to use the discard character to ignore one of the return values.

static void Main(string[] args)
{
  (double result, _) = DoSomething();

  double y = result * 5;
}

private static (double result, int errorCode) DoSomething()
{
  return (4.2, 0);
}

As mentioned within the linked article, you should use this feature in rare cases only. Ignoring a function result indicates that there is either a software design issue within the function interface or something wrong in the code which uses the interface.

When to use a tuple

As we have seen so far, tuples come with some pros and cons. So, we should not use them thoughtlessly. There are many situations where tuples are not the best choice. In my opinion there are only a few use cases where tuples should be used.

Sometimes you want to return more than one value from a method. This is a common use case, but with a good interface design it should be a rare one. Functions should do one thing only. Therefore, they have one result only. But sometimes, in addition to the result – what the function has done – it may be necessary to return execution information – how the method has been executed. This is, for example, error codes, performance measurement data or statistical data. This execution information contains very technical data about internal operations. With such an interface the component becomes a gray box, as it exposes internal details and expects the user to react correctly to this internal information.

As information hiding is one of the basic object-oriented concepts, such interfaces should be avoided. You want to have easy-to-use interfaces, and your component should be a black box. In my opinion this is mandatory for high-level interfaces.

On low-level interfaces you may have a slightly different situation and explicitly want to get detailed information about the component you use. You want to use it as a gray box or even as a white box. Furthermore, for performance reasons you may break some object-oriented rules or even mix the object-oriented paradigm with functional and procedural paradigms. In such low-level interfaces it is very common to have methods with several return values. But even in these cases I recommend having one result only – what the function has done – and one or more pieces of internal information about how it was done.

The options so far to implement several return values were less than optimal:

  • Out parameters: use is clunky and creates a method interface with a complicated data flow as in and out streams are mixed up
  • Custom-built transport type for every method: a lot of code overhead for a type which is used for one method only as a temporary object to group a few values
  • Anonymous types: high performance overhead and no static type checking
  • System.Tuple: best choice so far but with the need to allocate an object and with the disadvantage of code which needs comments to be easily readable (you must explain the tuple parameters)

With C# 7.0 we are now able to use the new tuple with the nice implementation syntax inspired by functional programming. It will make out parameters obsolete. This new tuple isn’t syntactic sugar only (compared with the existing Tuple class) but also an improvement from the technical side. We will analyze these technical details later.

You are now able to bundle the different return values into one tuple and use it as the return value. You create a loose coupling between these values. Furthermore, the coupling should exist in a small context only. The tuple is a temporary object with a short lifetime. A common pattern would be to construct, return and immediately deconstruct tuples.

In summary I would give following guidelines how to use tuples:

Methods in high-level interfaces should return the result only. As they hide internal details, there is no need for additional output. Low-level methods may return additional execution information. In this situation a tuple can be used. The return value and the execution information can be coupled within the tuple. This loose coupling exists for data transfer only, as the tuple has a short lifetime and will be constructed, returned and immediately deconstructed.

As we have already seen at the beginning of the article, the biggest disadvantage of tuples is the risk of mixing up parameters and creating hard-to-find errors. If we use tuples in low-level interfaces only, we can weaken this disadvantage. Low-level components are gray or white boxes, and the user has detailed knowledge about the internals of such components. This reduces the risk of mixing up parameters and increases the chance to find such errors quickly. So, if we use tuples in these use cases only, the advantage of the good readability of the source code exceeds the disadvantage of the risk of mixing up parameters.

Internals of tuple

As seen so far, the new tuple comes with a nice syntax. Of course, this is just syntactic sugar which helps to write clean code. So, we want to have a look behind the syntax and see what is done in the background. Let’s use the example with the named tuple elements and with deconstruction of the tuple.

static void Main(string[] args)
{
  (double result, int errorCode) = DoSomething();

  if (errorCode == 0)
  {
    double y = result * 5;
  }
}

private static (double result, int errorCode) DoSomething()
{
  return (4.2, 0);
}

If we disassemble the created application, we will see the following code (disassembled with JetBrains dotPeek).

private static void Main(string[] args)
{
  ValueTuple<double, int> valueTuple = Program.DoSomething();
  double num1 = valueTuple.Item1;
  if (valueTuple.Item2 != 0)
    return;
  double num2 = num1 * 5.0;
}

[return: TupleElementNames(new string[] { "result", "errorCode" })]
private static ValueTuple<double, int> DoSomething()
{
  return new ValueTuple<double, int>(4.2, 0);
}

This code shows some very important technical details. The used tuple is of the new type “ValueTuple” and not the longer-existing “Tuple”. “ValueTuple” is a struct, i.e. a value type, and it has public fields. “Tuple” instead is a class, a reference type, and has properties. The “ValueTuple” is a lightweight type which comes with better performance in nearly all cases. One exception is an assignment copy of a struct: as the whole content of the struct must be copied, it is more expensive than a class copy, which just copies the object reference. So, if you want to use a tuple with many elements and you have to copy this tuple very often, then you might use the Tuple class instead of the new ValueTuple struct. But as described in the previous paragraph, I would not recommend such a software design. Tuples should be used as temporary transport containers within a small context. If you need a long-living instance, you should use a dedicated data class instead of a tuple.

The second fact we see within the decompiled code is the naming mechanism of the tuple members. The names of the members are used within the IDE only (they are stored in the TupleElementNames attribute). They increase readability, but they are not part of the compiled code. So, you don’t have to fear expensive string comparisons if you use named tuple elements.

Tuple as function parameter

A tuple is a normal value type. So, you can pass a tuple as a parameter to a function.

static void Main(string[] args)
{
  DoSomething((4.2, 1));
}

private static void DoSomething(ValueTuple<double, int> x)
{
  double y = x.Item1 * x.Item2;
}

But should we use a tuple as a function parameter just because it is possible from a technical point of view? I don’t think so. From a software design point of view, I don’t see a use case for this feature.

Summary

The new tuple introduced with C# 7, in combination with the new tuple syntax, is a nice and powerful improvement of the C# language. It offers an efficient way to implement methods with several return values.


Software Transactional Memory

One of the main challenges in multithreaded applications is the access to shared memory. In case you have several software components which want to read from and write to shared memory, you must synchronize these data accesses. The concept of Software Transactional Memory (STM) is based on the idea to use a concurrency control mechanism analogous to database transactions. A transaction in this context is a series of reads and writes to shared memory. These reads and writes are logically connected and should be executed in an atomic way, which means that intermediate states are not visible to other components. The transaction concept is an alternative to lock-based synchronization.

Let’s look at the following example to understand the concept: We have two lists which store some data. Now we want to transfer one value from the first list to the second list. At the same time, other parallel processes may access the lists too. These other components may want to read or write some data. The transactional concept guarantees a consistent state of the data objects. The task which wants to move the value from one list into the other will execute the delete and insert operations as a transaction. In case another task reads the data at this moment, it will only see the lists before the movement or after the movement. It will never see the intermediate state where the data value is deleted from one list but not yet inserted into the other.


STM in C++

Unfortunately, STM is not yet part of the C++ standard. It is on the list of possible C++20 features, but as there are some other features with higher priority, it is not certain that STM will become part of the standard soon. Nevertheless, the concept is well defined and there exists an experimental implementation of STM in GCC 6.1. Furthermore, there are a lot of third-party libraries which allow using STM in C++ applications.

Within this article I want to introduce the concept in general and don’t want to show an implementation based on one specific third-party library. Therefore, the source code of this article is based on the keywords introduced for the experimental implementation. You will find an introduction to these keywords on cppreference.

So please keep in mind that the shown example implementations cannot be compiled with a standard compiler. You can use a compiler which contains an experimental implementation of STM, or you may use a third-party library and adapt the example implementation accordingly.
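As a sketch of how the list-transfer example above could look with the keywords of the experimental Transactional Memory TS (again: this will not compile with a standard compiler, and the names `listA`, `listB` and `moveValue` are made up for this illustration):

```
#include <list>

std::list<int> listA{ 1, 2, 3 };
std::list<int> listB{ 4, 5 };

void moveValue()
{
    // All statements inside the atomic block form one transaction:
    // other threads never observe the intermediate state where the
    // value has left listA but not yet arrived in listB.
    atomic_noexcept
    {
        int value = listA.front();
        listA.pop_front();
        listB.push_back(value);
    }
}
```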


Transactional Concept

A transaction is an action with the following characteristics: Atomicity, Consistency, Isolation and Durability (ACID). STM in C++ will have these characteristics, except for durability. Based on the example above, I want to explain the remaining three concepts. We want to implement a transaction which contains two statements: delete a value from one list and insert the value into another list.

Atomicity

The two statements are executed together or none of them is executed. The transaction covers all the statements and therefore acts like a single statement.


Consistency

The system is always in a valid state. Either all the statements within a transaction are executed or none of them is. The system will never be in a state where only a part of the statements was executed.


Isolation

Each transaction is executed independently of other transactions.

 

Execution of a Transaction

As mentioned at the beginning of the article, the transaction concept is an alternative to lock-based synchronization. Therefore, transactions are implemented in a way that ensures the characteristics shown above without the need for locking mechanisms.

A transaction stores its starting state and executes all statements without locking mechanisms. In case there is a conflict during transaction execution, the transaction is stopped and a rollback to the starting state is done. Afterwards the transaction is executed again. If all statements are executed successfully, the transaction checks whether the start conditions are still valid, i.e. no one has changed the data in the meantime. If the start conditions are still valid, the transaction is published.

A transaction is a speculative action. In contrast to a lock-based mechanism, it follows an optimistic approach. The transaction is executed without any synchronization but is published only if the start conditions are still valid. A lock-based mechanism instead follows a pessimistic approach: it ensures exclusive access to the critical section and waits in case some other task currently holds the access rights to this critical section. The optimistic approach does not need this expensive locking mechanism, but of course, permanent rollbacks due to frequent data collisions may be expensive too. Therefore, it depends on the use case whether STM or lock-based synchronization is faster.

 

Performance

STM may increase performance in many use cases. Unlike the lock-based techniques used in most modern multithreaded applications, STM follows an optimistic approach. A thread completes modifications to shared memory without regard for what other threads might be doing. It records every read and write in a log and stores the starting conditions to detect possible conflicts. Of course, this kind of data management isn’t free, but it creates less overhead than locking mechanisms. Additionally, in case of data conflicts, a rollback is done and the transaction is repeated. Therefore, the performance of STM is very much related to the likelihood of data conflicts.

The benefit of this optimistic approach is increased concurrency. No thread needs to wait for access to the shared memory. Different threads can safely and simultaneously modify parts of the shared memory which would otherwise be protected under the same lock in lock-based implementations.

 

Implementation

It is very easy to use STM in your implementations. No matter whether you use an STM library or the upcoming standard language feature, normally you just have to put the operations which belong together into a transaction by enclosing them in an according block (we will see examples later). But of course, you must respect the technical conditions of the STM concept. If you read and write data objects within transactions, you should not access these data objects outside of transactions too. Such data manipulation in non-synchronized code can lead to data races. Furthermore, within your transactions, you are limited to transaction-safe functions. As we learned so far, the STM concept works with an optimistic approach, and in case of conflicts a transaction rollback is done. Of course, this is possible only if the functions called within the transaction can be rolled back. For example, std::cout is not transaction-safe and cannot be used within transactions.

 

Example

Now let’s look at an example which uses the experimental STM implementation of C++. Please keep in mind that the following source code will create compiler errors in standard compilers. You must use a compiler which has an implementation of this experimental feature, or you may use one of the many third-party libraries for STM and adapt the example accordingly.

Within the example application we will look at typical use cases: change a value and execute some function based on the value. This sounds very easy but has the well-known pitfalls once we change to a multithreaded application. So, let’s say we have two variables. Within a function we want to change the variables’ values, and afterwards we want to read the values and pass them to a function – in this case a standard cout call.

int x, y;

using namespace std::chrono_literals;

void Execute()
{
	x++;
	y--;

	std::cout << "(" << x << "," << y << ")";

	std::this_thread::sleep_for(100ns);
}

int main()
{
	x = 0;
	y = 100;

	std::vector<std::thread> threads(5);
	
	for (auto& thread : threads)
	{
		thread = std::thread([] { for (int i = 0; i < 20; ++i) Execute(); });
	}

	for (auto& thread : threads)
	{
		thread.join();
	}
	
	return 0;
}

 

If we execute this application we will see some strange issues. It looks like some of the increase and decrease steps get lost, as the final values are not as expected, and furthermore the output of cout may be muddled up.

These issues happen because there are data races between the different threads. A thread may execute the value increase or decrease at a moment when another thread is in the middle of such a data change. So, the second thread works on partially changed data. The same concurrency issue occurs for the cout call: it may be partially executed by one thread and then interrupted by another thread.

The STM concept offers two constructs to solve these issues: synchronized blocks and atomic blocks. In the following we will see and compare the two concepts and adapt the example application accordingly.

 

Synchronized Blocks

The STM implementation contains the concept of synchronized blocks. A synchronized block allows combining different statements within one block. It is executed in a way that behaves as if it were secured by one global lock. It may be implemented with a more performant mechanism, but it must behave in the same way as a lock.

The example can be changed very easily. We just have to put the statements into a synchronized block. Now the application behaves as expected: the variables are increased and decreased consistently and the output looks fine.

int x, y;

using namespace std::chrono_literals;

void ExecuteSynchronized()
{
	synchronized{
		x++;
		y--;

		std::cout << "(" << x << "," << y << ")";

		std::this_thread::sleep_for(100ns);
	}
}

int main()
{
	x = 0;
	y = 100;

	std::vector<std::thread> threads(5);

	for (auto& thread : threads)
	{
		thread = std::thread([] { for (int i = 0; i < 20; ++i) ExecuteSynchronized(); });
	}

	for (auto& thread : threads)
	{
		thread.join();
	}

	return 0;
}

 

As synchronized blocks behave as if they were synchronized by a global lock, the different blocks are executed successively. So we have the behavior of a lock-based mechanism and no real STM. STM is based on optimistic locking and transactions and allows parallel execution of different threads. Synchronized blocks follow a pessimistic, lock-based approach and execute the blocks successively.

This sounds like a disadvantage at first, but it allows us to use transaction-unsafe code within synchronized blocks. For example, cout is transaction-unsafe, but we can still use it. In my opinion there are two big advantages of synchronized blocks. First, you can turn any existing source code which is not thread-safe into thread-safe code just by enclosing the existing statements in a synchronized block. Second, if you use real STM, as we will see in the next example, you are able to change from real STM to the lock-based synchronized block just by changing one statement. This allows you to test both concepts and use the better one according to your use case. Furthermore, you can change from real STM to synchronized blocks in case you want to use a transaction-unsafe function.

 

Atomic Blocks

Using atomic blocks is nearly as simple as using synchronized blocks. We just have to add the according block which encloses the critical statements. But additionally, we must now remove the cout call, as this function is not transaction-safe. So, we have to change the output handling, for example by moving it behind the thread execution, or by writing into a buffer during thread execution and printing this buffer within a parallel task.

int x, y;

using namespace std::chrono_literals;

void ExecuteAtomic()
{
	atomic_noexcept{
		x++;
		y--;

		std::this_thread::sleep_for(100ns);
	}
}

int main()
{
	x = 0;
	y = 100;

	std::vector<std::thread> threads(5);

	for (auto& thread : threads)
	{
		thread = std::thread([] { for (int i = 0; i < 20; ++i) ExecuteAtomic(); });
	}

	for (auto& thread : threads)
	{
		thread.join();
	}

	return 0;
}

 

Atomic blocks exist in three variations which differ in exception handling:

  • atomic_noexcept: If an exception occurs, std::abort is called and the application will be aborted.
  • atomic_cancel: If a transaction-safe exception occurs, the transaction will be canceled and rolled back and the exception is rethrown.
  • atomic_commit: If a transaction-safe exception occurs, the transaction will be committed and the exception is rethrown.
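Side by side, the three variants could be sketched like this, reusing the statements from the example above. As with the listings before, this uses the experimental keywords and is not compilable with a standard compiler.

```cpp
// not standard C++ – experimental transactional memory keywords

atomic_noexcept {   // an exception leads to std::abort
	x++;
	y--;
}

atomic_cancel {     // a transaction-safe exception rolls the transaction back
	x++;
	y--;
}

atomic_commit {     // a transaction-safe exception commits the transaction
	x++;
	y--;
}
```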

 

Summary

Software transactional memory is a performant and easy-to-use concept to solve data races in multithreaded applications. In the (near) future STM may be integrated into the C++ standard. In the meantime, you can use existing third-party libraries to add STM features.


std::atomic

In multithreaded applications even trivial operations like reading and writing values may create issues. The std::atomic template can be used in such situations. Within this article I want to give a short introduction to this topic.

 

Atomic operations

Let’s start with an easy example. Within a loop we will increase a value by using the “++” operator. The loop will be executed by several threads.

int value;

void increase()
{
	for (int i = 0; i < 100000; i++)
	{
		value++;
	}
}

int main()
{
	std::thread thread1(increase);
	std::thread thread2(increase);
	std::thread thread3(increase);
	std::thread thread4(increase);
	std::thread thread5(increase);

	thread1.join();
	thread2.join();
	thread3.join();
	thread4.join();
	thread5.join();

	std::cout << value << '\n';

	return 0;
}

 

We expect an output of “500000”, but unfortunately the output is below this value, and it differs on each execution of the application.

That’s because the “++” operation is not atomic. To keep it simple, we can say this operation consists of three steps: read the value, increase the value and write the value. If these steps are executed in parallel, several threads may read the same value and increase it by one. Therefore, the result is much lower than expected.

The same issue may occur with simple read and write operations too. For example, thread 1 sets a value and thread 2 reads the value. In case the value type does not fit into a processor word, even such trivial read and write operations are not atomic. Accessing a 64-bit variable on a 32-bit system may result in incomplete values.

In such cases we need a synchronization mechanism to prevent the parallel execution. Of course, we could use a standard lock mechanism. Or we could use the std::atomic template. This template defines an atomic type and guarantees the atomicity of the operations on this type. To adapt the previous example, we just have to change the data type.

std::atomic<int> value;

void increase()
{
	for (int i = 0; i < 100000; i++)
	{
		value++;
	}
}

int main()
{
	std::thread thread1(increase);
	std::thread thread2(increase);
	std::thread thread3(increase);
	std::thread thread4(increase);
	std::thread thread5(increase);

	thread1.join();
	thread2.join();
	thread3.join();
	thread4.join();
	thread5.join();

	std::cout << value << '\n';

	return 0;
}

 

Now the result is as expected. The output of the application is “500000”.

 

std::atomic

As mentioned before, the std::atomic template defines an atomic type. If one thread writes to an atomic object while another thread reads from it, the behavior is well-defined. Each operation itself has a guaranteed atomicity. Furthermore, reads and writes to several different objects can have a sequential consistency guarantee. This guarantee is based on the selected memory model. We will see according examples later.

 

Read and write values

A common scenario in multithreaded applications is the parallel reading and writing of variables by different threads. Let’s create a simple example with two threads; one writes variable values and the other one reads these values.

int x;
int y;

void write()
{
	x = 10;
	y = 20;
}

void read()
{
	std::cout << y << '\n';
	std::cout << x << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

	return 0;
}

 

With this kind of implementation, the behavior is undefined. Depending on the order of the threads, the output may be “20/10”, “0/0”, “0/10” or “20/0”. But beside these expected results, it may happen that a read is done during a write and therefore an incomplete value is used. Within this example this should not happen, but as described before, depending on the used processor and value types it may happen. Therefore, we can say that the behavior of the application is undefined.

By using the std::atomic template we can change the undefined behavior to a defined one. We just have to define the variables as atomic and use the according read and write functions.

std::atomic<int> x;
std::atomic<int> y;

void write()
{
	x.store(10);
	y.store(20);
}

void read()
{
	std::cout << y.load() << '\n';
	std::cout << x.load() << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

	return 0;
}

 

Now the behavior of the application is well defined. Independent of the used processor and independent of the data type (we could change from “int” to any other type), we have a defined set of results. Depending on the order of the threads, the output may be “20/10”, “0/0” or “0/10”. The output “20/0” is not possible because the default mode for atomic loads and stores enforces sequential consistency. That means, within this example, x is always written before y is changed. Therefore, the output “0/10” is possible but not “20/0”. As the std::atomic template ensures an atomic execution of the read and write functions, we don’t have to fear undefined behavior due to incomplete data updates. So, we have a defined behavior with three possible results.

 

Memory model

As mentioned before, the selected memory model changes the behavior of atomic read and write functions. By default, the memory model ensures sequential consistency. This may be expensive because the compiler is likely to emit memory barriers between every access. If your application or algorithm does not need this sequential consistency, you can choose a more relaxed memory model.

For example, if it is fine to get “20/0” as a result within our application, we can set the memory order to “memory_order_relaxed”. This removes the synchronization and ordering constraints, but the operation’s atomicity is still guaranteed.

void write()
{
	x.store(10, std::memory_order_relaxed);
	y.store(20, std::memory_order_relaxed);
}

void read()
{
	std::cout << y.load(std::memory_order_relaxed) << '\n';
	std::cout << x.load(std::memory_order_relaxed) << '\n';
}

 

Another interesting memory model is the release-acquire ordering. You can set the store functions within the first thread to “memory_order_release” and the load functions in the second thread to “memory_order_acquire”. In this case all memory writes (non-atomic and relaxed atomic) that happened before the atomic store in thread 1 become visible in thread 2 once its load has observed that store. This takes us back to ordered loads and stores, so “20/0” is no longer a possible output, but with less overhead than full sequential consistency. Within this trivial example, the result is the same as with the full-blown sequential consistency. In a more complex example with several threads reading and writing all or some of the variables, the result may differ from the default sequential consistency.

void write()
{
	x.store(10, std::memory_order_release);
	y.store(20, std::memory_order_release);
}

void read()
{
	std::cout << y.load(std::memory_order_acquire) << '\n';
	std::cout << x.load(std::memory_order_acquire) << '\n';
}

 

As mentioned before, the release-acquire ordering ensures that all memory writes before the atomic store, even non-atomic ones, are visible after the corresponding acquire load. So we can change the application and use a non-atomic type for x. The atomic store of y still ensures the correct write order.

int x;
std::atomic<int> y;

void write()
{
	x = 10;
	y.store(20, std::memory_order_release);
}

void read()
{
	std::cout << y.load(std::memory_order_acquire) << '\n';
	std::cout << x << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

	return 0;
}

 

But again, in a more complex scenario with several threads, and maybe a read of x only, it could be necessary to define x as atomic.

 

Synchronization of threads

A common implementation technique for thread synchronization is locking. By using the atomic template, you can create such thread synchronization without locks. Depending on the selected memory model, this may increase the performance of your application.

The following application shows an example of how to use the atomic template to synchronize the execution of two threads. Thread 2 waits until thread 1 has finished its work. This is done by using a variable which contains the execution state. Furthermore, thread 1 commits its result within a data variable. The used release-acquire ordering ensures that the data is written before the synchronization flag is set.

int data;
std::atomic<bool> ready;

void write()
{
	data = 10;
	ready.store(true, std::memory_order_release);
}

void read()
{
	while (!ready.load(std::memory_order_acquire))
	{
		std::this_thread::yield();   // spin until thread 1 has published its result
	}

	std::cout << data << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

	return 0;
}

 

Summary

This article gave a short introduction to the std::atomic template based on some common use cases. The examples show some major issues regarding data access in multithreaded applications and introduce according implementations based on the atomic template. Of course, this article is an introduction only. The atomic template offers more features; for example, there exist more memory order configurations beside the three shown within this article.


Deconstruction in C# 7

C# 7 introduces a nice syntax to deconstruct a class and access all members. If you want to support deconstruction for an object, you just have to write a “Deconstruct” method which contains one or more out parameters that are used to assign class properties to variables. This deconstruction is also supported by the new tuple syntax introduced with C# 7. The following example shows the deconstruction feature used for a tuple and for an own class.

static void Main(string[] args)
{
  (int x, int y) = GetTuple();

  (double width, double height) = GetRectangle();

  Rectangle rect = new Rectangle(1, 2);
  (double a, double b) = rect;
}

private static (int x, int y) GetTuple()
{
  return (1, 2);
}

private static Rectangle GetRectangle()
{
  return new Rectangle(5, 8);
}

class Rectangle
{
  public Rectangle(double width, double height)
  {
    Width = width;
    Height = height;
  }

  public double Width { get; }
  public double Height { get; }

  public void Deconstruct(out double width, out double height)
  {
    width = Width;
    height = Height;
  }
}

 

The assignment of the class properties to the variables is done by position. So, the variable names do not have to match the class property names or tuple element names. Furthermore, it is possible to write several “Deconstruct” methods with different numbers of elements.

At first glance this looks like a nice feature which increases the readability of the source code. But it comes with a big issue. As the deconstruction is done by element positions only, it is a source of errors. If you switch elements by mistake, you create an issue which cannot be found by the compiler if the types are equal. This may cost you a lot of debugging time because such a mistake isn’t easy to find. The following code shows this disadvantage. Each assignment is valid on its own, but the second one mixes up the parameters and will result in runtime errors.

static void Main(string[] args)
{
  (double width, double height) = GetRectangle();

  (double h, double w) = GetRectangle();  // swapped by mistake: h receives the width
}

 

Furthermore, you can get in trouble in case someone changes a class you are using. For example, several class properties get changed as the result of a refactoring, so their names and meanings change. If you use this class within your code and access the class properties, your compiler will show you according error messages, as the property names have changed. But if you use deconstruction and the number and types of parameters haven’t changed, you will get in trouble. Names and meanings of class elements have changed, but the deconstruction method only considers number and types of parameters. So, neither the compiler nor you will suspect any issues if you use the refactored class. But at runtime you can expect some unpleasant surprises followed by hours of debugging and bug fixing.

Let’s look at the example with the rectangle. The developer of this class has done a refactoring, and now the rectangle is represented as a vector with an angle and a length. But unfortunately, the signature of the Deconstruct method hasn’t changed, and so you do not get any compilation errors.

static void Main(string[] args)
{
  (double width, double height) = GetRectangle();
}

private static Rectangle GetRectangle()
{
  return new Rectangle(5, 8);
}

class Rectangle
{
  public Rectangle(double vectorAngle, double vectorLength)
  {
    VectorAngle = vectorAngle;
    VectorLength = vectorLength;
  }

  public double VectorAngle { get; }
  public double VectorLength { get; }

  public void Deconstruct(out double vectorAngle, out double vectorLength)
  {
    vectorAngle = VectorAngle;
    vectorLength = VectorLength;
  }
}

 

In my opinion the deconstruction feature comes with more disadvantages than advantages. The advantage of slightly shortened source code is way too small compared to the disadvantage of possible hard-to-find implementation errors.


Local or nested functions in C# 7

With C# 7.0 it is possible to create functions nested within other functions. This feature is called “local functions”. The following source code shows an according example.

static void Main(string[] args)
{
  int x = DoSomething();

  //int y = Calc(1, 2, 3);  // compiler error because Calc does not exist within this context
}

private static int DoSomething()
{
  int shared = 15;

  int x = Calc(1, 2, 3);

  return x;

  int Calc(int a, int b, int c)
  {
    return a * b + c + shared;
  }
}

 

The function “Calc” is nested within the function “DoSomething”. As a result, “Calc” is available in the function scope only and cannot be used outside of “DoSomething”. Inside the enclosing function, you can place the nested function wherever you want. Furthermore, the nested function has access to the variables of the enclosing function.

In my opinion local functions are a nice feature. A common software design concept is to hide internal implementation details of software components by limiting the scope of internal components. If you have functions which are needed inside a class only, you can limit the scope to this class by defining a private function. But so far this was the lowest possible scope. If you had created sub-functions which were needed by a single function only, you had to publish these functions at class scope too, or use other techniques like lambda functions. With C# 7 you are now able to limit the scope of such nested functions to the scope of the enclosing function.

I like local functions because you are now able to define a proper scope. But there are two features which I don’t like because they may result in source code which is difficult to understand.

One of these two features is the possibility to place the nested functions wherever you want. In my opinion you should place them always at the beginning or at the end. I prefer the end, after the return statement. This increases the readability of the source code because at first you have the code of the main function itself and then the code of the internal sub-functions. So, you create a clear separation between these different concerns, and you get a reading order from the enclosing main function to the nested local functions.

The second feature which I don’t like is the possibility to access variables of the enclosing function. From a technical point of view this feature is fine, because the nested function is part of the enclosing function’s scope and therefore can access elements of this scope. But from a software design point of view it is a questionable feature, because you can create a lot of dependencies between the enclosing function and several local functions. This can result in difficult code with a spaghetti-like data flow. An alternative is to pass variables as parameters to the local functions and create a clear separation between the different functions and their concerns. In my opinion you should therefore use this feature very wisely and only if it really comes along with some advantages.

A typical use case for local functions is recursive function calls. Often you will have a public main function which should do some calculation. This function may use recursive function calls to execute the calculation. So, you will create an internal function used for these recursive calls, and this internal function often contains parameters with interim results or recursion status information. Therefore, such a function should never be part of the public interface. On the contrary, it can only be used meaningfully within the scope of the main function. Now you have the possibility to implement it according to this need and create a local function. The following example shows an according use case. Within a tree structure we will get the maximum depth of the tree.

static void Main(string[] args)
{
  TreeElement tree = new TreeElement();
  TreeElement sub1a = new TreeElement();
  TreeElement sub1b = new TreeElement();
  TreeElement sub2 = new TreeElement();

  sub1b.SubElements = new List<TreeElement>();
  sub1b.SubElements.Add(sub2);

  tree.SubElements = new List<TreeElement>();
  tree.SubElements.Add(sub1a);
  tree.SubElements.Add(sub1b);

  int depth = GetDepth(tree);
  Console.WriteLine("Depth: " + depth);
}

public class TreeElement
{
  public List<TreeElement> SubElements;
}

private static int GetDepth(TreeElement tree)
{
  return GetDepth(tree, 1);

  int GetDepth(TreeElement subTree, int actualDepth)
  {
    int maxDepth = actualDepth;

    if (subTree.SubElements == null)
    {
      return actualDepth;
    }

    foreach (TreeElement element in subTree.SubElements)
    {
      int elementDepth = GetDepth(element, actualDepth + 1);
      if (maxDepth < elementDepth)
      {
        maxDepth = elementDepth;
      }
    }

    return maxDepth;
  }
}

 

The calculation of the depth is done by recursive function calls which step into the sub-elements. If a tree leaf is reached, the recursive function returns its current depth. Therefore, the actual depth of the sub-tree is passed as a parameter to the recursive function. Of course, you could expose the recursive function in the public interface too, but in this case the user of the interface would see the depth parameter, which is needed internally only. You could set the default value of this parameter to “1” and explain it within the documentation, but nevertheless your interface gets dirty as it contains parameters which are necessary due to internal needs only. By using a local function, you can offer a clean interface and hide the internal implementation details. Furthermore, this internal helper function is visible within the containing function scope only and will therefore neither be visible in the public interface nor in the private interface of the surrounding class.
