Tuples in C# 7 (ValueTuple)

Within this article I want to give a short overview of the new tuple features in C# 7. C# already had a Tuple class, but the new one changes some technical details and adds syntactic sugar. The new tuple syntax makes source code more readable and integrates the technical advantages of tuples cleanly into the language. But of course, from a software design point of view, tuples still have some well-known disadvantages. Therefore, within this article I want to introduce the new tuple and its syntax, show the technical background, mention the pros and cons and give hints on when to use and when to avoid tuples.

Tuple syntax

Let’s start with a basic example. We want to implement a function which may fail for several reasons. As we expect such failures, an incomplete execution is a normal case which we want to evaluate and react to, for example by repeating the method call. So we don’t want to throw exceptions but return an error code instead. Besides this error code the function returns a result. So the function has two outputs: the function result and the error code as additional execution information. A common design pattern is to use an out parameter for the additional information.

static void Main(string[] args)
{
    int errorCode;
    var x = DoSomething(out errorCode);

    if (errorCode == 0)
    {
        double y = x * 5;
    }
}

private static double DoSomething(out int errorCode)
{
    errorCode = 0;
    return 4.2;
}


This common design pattern has one major disadvantage: it creates a complex data flow. The standard design of a function has a straightforward data flow: you have one or more input parameters which are passed on method call and one function result which is assigned to a result variable. With output parameters you create a data flow which is far more complex. Wouldn’t it be nice to return all outputs as the function result and go back to the straightforward data flow? This can be done with tuples. The following source code shows the adapted example.

static void Main(string[] args)
{
    var x = DoSomething();

    if (x.Item1 == 0)
    {
        double y = x.Item2 * 5;
    }
}

private static (double, int) DoSomething()
{
    return (4.2, 0);
}

Within this source code the new tuple syntax is used, so we don’t have to explicitly define a tuple instance. In the function declaration we can define several return parameters and the compiler will create an according tuple. When we call the function we can assign the result to a variable and access the tuple members through the properties Item1 to ItemN.

This is a nice first step in the right direction. But it is bothersome to access the tuple members through the generic Item1 to ItemN properties. Of course, the new tuple syntax takes care of this aspect and allows naming the return values.

static void Main(string[] args)
{
    var x = DoSomething();

    if (x.errorCode == 0)
    {
        double y = x.result * 5;
    }
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}

These named properties are a huge improvement for code quality and greatly increase readability. But there is still one aspect which bothers me. In the function declaration I don’t have to construct a tuple but just write down a list of return values. But on the function call side I must rethink and use a tuple which holds the returns as properties. Wouldn’t it be nice to use a value list on the function call side too? Of course it would, and fortunately this is supported by the new syntax.

static void Main(string[] args)
{
    (double result, int errorCode) = DoSomething();

    if (errorCode == 0)
    {
        double y = result * 5;
    }
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}

This final version of the function implementation shows the big strength of the new syntax. We can implement a function with several return values and a straightforward data flow. Furthermore, the readability of the source code is greatly improved, as the technical details behind this concept are hidden and the source code is focused on functionality only.

Deconstruction of Tuples

As we have seen, on function call we can assign the several function return values directly to according variables. This is possible because the returned tuple is deconstructed and the tuple properties get assigned to the variables. Within a previous article you can find an introduction to the deconstruction feature, which was also introduced with C# 7.

As mentioned in the linked article, this deconstruction comes with a big issue: you can easily mix up return values if they have the same type. In this example the compiler will show an according error if you switch the variables, because they have different types. But with identical types you may run into this issue. Later on we will think about the suitable use cases for tuples and evaluate this issue within the context of these use cases.

Discard of Tuple parameters

C# 7 allows discarding return values and out parameters. The underscore character is used as a wildcard for these unneeded values.

The following source code shows how to use the discard character to ignore one of the return values.

static void Main(string[] args)
{
    (double result, _) = DoSomething();

    double y = result * 5;
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}

As mentioned within the linked article, you should use this feature in rare cases only. Ignoring a function result indicates that there is a software design issue in the function interface, or that something is wrong in the code which uses the interface.

When to use a tuple

As we have seen so far, tuples come with some pros and cons. So, we should not use them thoughtlessly. There are many situations where tuples are not the best choice. In my opinion there are only a few use cases where tuples should be used.

Sometimes you want to return more than one value from a method. This is a common use case, but in a good interface design it should be a rare one. Functions should do one thing only and therefore have one result only. But sometimes, in addition to the result – what the function has done – it may be necessary to return execution information – how the method has been executed. Examples are error codes, performance measurements or statistical data. This execution information contains very technical data about internal operations. With such an interface the component becomes a gray box, as it exposes internal details and expects the user to react correctly to this internal information.

As information hiding is one of the basic object-oriented concepts, such interfaces should be avoided. You want easy-to-use interfaces, and your component should be a black box. In my opinion this is mandatory for high-level interfaces.

On low-level interfaces you may have a slightly different situation and explicitly want to get detailed information about the component you use. You want to use it as a gray box or even a white box. Furthermore, for performance reasons you may break some object-oriented rules or even mix object-oriented paradigms with functional and procedural ones. In such low-level interfaces it is very common to have methods with several return values. But even in these cases I recommend having one result only – what the function has done – and one or more pieces of internal information about how it was done.

The options available so far to implement several return values were less than optimal.

  • Out parameters: clunky to use and create a method interface with a complicated data flow, as input and output streams are mixed up
  • Custom transport type for every method: a lot of code overhead for a type which is used by one method only as a temporary object to group a few values
  • Anonymous types: runtime overhead and no static type checking outside the declaring method
  • System.Tuple: the best choice so far, but it allocates an object and produces code which needs comments to be readable (you must explain the tuple parameters)

With C# 7.0 we are now able to use the new tuple with a nice implementation syntax inspired by functional programming. It makes out parameters obsolete in most cases. This new tuple isn’t only syntactic sugar compared with the existing Tuple class, but also a technical improvement. We will analyze these technical details later.

You are now able to bundle the different return parameters into one tuple and use it as the return value. This creates a loose coupling between these values, which should exist in a small context only: the tuple is a temporary object with a short lifetime. A common pattern is to construct, return and immediately deconstruct tuples.

In summary, I would give the following guidelines on how to use tuples:

Methods in high level interfaces should return the result only. As they will hide internal details there is no need for additional output. Low level methods may return additional execution information. In this situation a tuple can be used. The return value and the execution information can be coupled within the tuple. This loose coupling exists for data transfer only as the tuple has a short lifetime and will be constructed, returned and immediately deconstructed.

As we have already seen at the beginning of the article, the biggest disadvantage of tuples is the risk of mixing up parameters and creating hard-to-find errors. If we use tuples in low-level interfaces only, we can mitigate this disadvantage. Low-level components are gray or white boxes, and the user has detailed knowledge about the internals of such components. This reduces the risk of mixing up parameters and increases the chance of finding such errors quickly. So, if we restrict tuples to these use cases, the advantage of readable source code outweighs the risk of mixing up parameters.

Internals of tuple

As seen so far, the new tuple comes with a nice syntax. Of course, this is just syntactical sugar which helps to write clean code. So, we want to have a look behind the syntax and see what is done in the background. Let’s use the example with the named tuple elements and with deconstruction of the tuple.

static void Main(string[] args)
{
    (double result, int errorCode) = DoSomething();

    if (errorCode == 0)
    {
        double y = result * 5;
    }
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}

If we disassemble the created application we will see the following code (disassembled with JetBrains dotPeek).

private static void Main(string[] args)
{
    ValueTuple<double, int> valueTuple = Program.DoSomething();
    double num1 = valueTuple.Item1;
    if (valueTuple.Item2 != 0)
        return;
    double num2 = num1 * 5.0;
}

[return: TupleElementNames(new string[] { "result", "errorCode" })]
private static ValueTuple<double, int> DoSomething()
{
    return new ValueTuple<double, int>(4.2, 0);
}

This code shows some very important technical details. The used tuple is of the new type “ValueTuple” and not the longer existing “Tuple”. “ValueTuple” is a struct – a value type with member values – while “Tuple” is a class, a reference type with properties. “ValueTuple” is a lightweight type which performs better in nearly all cases. One exception is an assignment copy of a struct: as the whole content of the struct must be copied, it is more expensive than copying a class, which only copies the object reference. So if you want to use a tuple with many elements and you have to copy this tuple very often, the Tuple class might be the better choice instead of the new ValueTuple struct. But as described in the previous paragraph, I would not recommend such a software design. Tuples should be used as temporary transport containers within a small context. If you need a long-living instance, you should use a dedicated data class instead of a tuple.

The second fact we see in the decompiled code is the naming mechanism for the tuple members. The names are used by the compiler and IDE only; at runtime the members are still Item1 and Item2, and the names survive only as metadata in the TupleElementNames attribute. So you don’t have to fear expensive string comparisons if you use named tuple elements.

Tuple as function parameter

A tuple is a normal value type. So, you can pass a tuple as parameter to a function.

static void Main(string[] args)
{
    DoSomething((4.2, 1));
}

private static void DoSomething((double, int) x)
{
    double y = x.Item1 * x.Item2;
}

But should we use a tuple as a function parameter just because it is technically possible? I don’t think so. From a software design point of view, I don’t see a use case for this feature.

Summary

The new tuple introduced with C# 7, in combination with the new tuple syntax, is a nice and powerful improvement of the C# language. It offers an efficient way to implement methods with several return values.


Software Transactional Memory

One of the main challenges in multithreaded applications is access to shared memory. If you have several software components which want to read and write shared memory, you must synchronize this data access. The concept of Software Transactional Memory (STM) is based on the idea of using a concurrency control mechanism analogous to database transactions. A transaction in this context is a series of reads and writes to shared memory. These reads and writes are logically connected and should be executed in an atomic way, which means that intermediate states are not visible to other components. The transaction concept is an alternative to lock-based synchronization.

Let’s look at the following example to understand the concept: We have two lists which store some data. Now we want to transfer one value from the first list to the second list. At the same time, other parallel processes may access the lists too; these other components may want to read or write some data. The transactional concept will guarantee a consistent state of the data objects. The task which wants to move the value from one list to the other implements the delete and insert operations as a transaction. If another task reads the data at this moment, it will only see the lists before the movement or after the movement. It will never see the intermediate state where the value is deleted from one list but not yet inserted into the other.

 

STM in C++

Unfortunately, STM is not yet part of the C++ standard. It is on the list of possible C++20 features, but as other features have higher priority it is not certain that STM will become part of the standard soon. Nevertheless, the concept is well defined and there exists an experimental implementation of STM in GCC 6.1. Furthermore, there are many third-party libraries which allow using STM in C++ applications.

Within this article I want to introduce the concept in general and not show an implementation based on one specific third-party library. Therefore, the source code of this article is based on the keywords introduced for the experimental implementation. You will find an introduction to these keywords in the cppreference documentation.

So please keep in mind that the shown example implementations cannot be compiled with a standard compiler. You can use a compiler which contains an experimental implementation of STM, or you may use a third-party library and adapt the example implementations accordingly.

 

Transactional Concept

A transaction is an action with the following characteristics: Atomicity, Consistency, Isolation and Durability (ACID). The STM in C++ has these characteristics, except durability. Based on the example above, I want to explain the remaining three concepts. We want to implement a transaction which contains two statements: delete a value from one list and insert the value into another list.

Atomicity

Either both statements are executed or none of them. The transaction covers all its statements and therefore acts like a single statement.

 

Consistency

The system is always in a valid state: a transaction takes it from one consistent state to another. Other components will never observe a state where only a part of the statements has been executed.

 

Isolation

Each transaction is executed independently of other transactions.

 

Execution of a Transaction

As mentioned at the beginning of the article, the transaction concept is an alternative to lock-based synchronization. Therefore, transactions are implemented in a way that ensures the characteristics shown above without the need for lock mechanisms.

A transaction stores its starting state and executes all statements without lock mechanisms. In case there is a conflict during transaction execution, it is stopped and a rollback to the starting state is done. Afterwards the transaction is executed again. If all statements are executed successfully, the transaction checks whether the start conditions are still valid, i.e. nobody has changed the data in the meantime. If the start conditions are still valid, the transaction is published.

A transaction is a speculative action. In contrast to a lock-based mechanism, it follows an optimistic approach. The transaction will be executed without any synchronization but will be published only in case the start conditions are still valid. A lock-based mechanism instead follows a pessimistic approach. It ensures the exclusive access to the critical section and waits in case some other task currently holds the access rights to this critical section. The optimistic approach does not need this expensive lock-mechanism but of course, permanent rollbacks due to frequent data collisions may be expensive too. Therefore, it depends on the use case whether STM or lock-based synchronization is faster.

 

Performance

STM may increase the performance in many use cases. Unlike the lock-based techniques used in most modern multithreaded applications, STM follows an optimistic approach. A thread completes modifications to shared memory without regard for what other threads might be doing. It records every read and write in a log and stores the starting conditions to detect possible conflicts. Of course, this kind of data management isn’t free, but it creates less overhead than locking mechanisms. Additionally, in case of data conflicts, a rollback is done and the transaction is repeated. Therefore, the performance of STM depends very much on the likelihood of data conflicts.

The benefit of this optimistic approach is increased concurrency. No thread needs to wait for access to the shared memory. Different threads can safely and simultaneously modify parts of the shared memory which would otherwise be protected under the same lock in lock-based implementations.

 

Implementation

It is very easy to use STM in your implementation. Whether you use an STM library or the upcoming standard language feature, you normally just put the operations which belong together into a transaction by enclosing them in an according block (we will see examples later). But of course, you must respect the technical constraints of the STM concept. If you read and write data objects within transactions, you should not access these data objects outside of transactions too; such data manipulation in unsynchronized tasks can lead to data races. Furthermore, within transactions you are limited to transaction-safe functions. As we have learned, STM works optimistically and in case of conflicts a transaction rollback is done. Of course, this is only possible if the functions called within the transaction can be rolled back. Output via std::cout, for example, is not transaction-safe and cannot be used within transactions.

 

Example

Now we will look at an example which uses the experimental STM implementation of C++. Please keep in mind that the following source code will create compiler errors with standard compilers. You must use a compiler which implements this experimental feature, or you may use one of the many third-party STM libraries and adapt the example accordingly.

Within the example application we will look at a typical use case: change some values and execute a function based on them. This sounds very easy but has the well-known pitfalls once we move to a multithreaded application. So, let’s say we have two variables. Within a function we want to change their values and afterwards read them and pass them to an output function – in this case std::cout.

#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

int x, y;

using namespace std::chrono_literals;

void Execute()
{
	x++;
	y--;

	std::cout << "(" << x << "," << y << ")";

	std::this_thread::sleep_for(100ns);
}

int main()
{
	x = 0;
	y = 100;

	std::vector<std::thread> threads(5);
	
	for (auto& thread : threads)
	{
		thread = std::thread([] { for (int i = 0; i < 20; ++i) Execute(); });
	}

	for (auto& thread : threads)
	{
		thread.join();
	}
	
	return 0;
}

 

If we execute this application we will see some strange issues. It looks like some of the increase and decrease steps get lost, as the final values are not as expected, and furthermore the output of cout may be garbled.

These issues happen because there are data races between the different threads. A thread may execute the value increase or decrease at a moment where another thread already executed such a data change. So, the second thread works on partially changed data. The same concurrency issue occurs for the cout function. It may be partially executed by one thread and then interrupted by another thread.

The STM proposal offers two concepts to solve these issues: synchronized blocks and atomic blocks. In the following we will see and compare the two concepts and adapt the example application accordingly.

 

Synchronized Blocks

The STM implementation contains the concept of synchronized blocks. A synchronized block combines different statements and is executed as if it were secured by one global lock statement. It may be implemented with a more performant mechanism, but it must behave in the same way as a lock.

The example can be changed very easily. We just have to write the statements into a synchronized block. Now the application behaves as expected. The variables are increased and decreased correctly, and the output looks fine.

#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

int x, y;

using namespace std::chrono_literals;

void ExecuteSynchronized()
{
	synchronized{
		x++;
		y--;

		std::cout << "(" << x << "," << y << ")";

		std::this_thread::sleep_for(100ns);
	}
}

int main()
{
	x = 0;
	y = 100;

	std::vector<std::thread> threads(5);

	for (auto& thread : threads)
	{
		thread = std::thread([] { for (int i = 0; i < 20; ++i) ExecuteSynchronized(); });
	}

	for (auto& thread : threads)
	{
		thread.join();
	}

	return 0;
}

 

As synchronized blocks behave as if they were synchronized by a global lock, the different blocks are executed successively. So we have the behavior of a lock-based mechanism and no real STM. STM is based on optimistic locking and transactions and allows parallel execution of different threads. Synchronized blocks follow a pessimistic, lock-based approach and execute the blocks successively.

This sounds like a disadvantage at first, but it allows us to use transaction-unsafe code within synchronized blocks. For example, output via “cout” is transaction-unsafe, but we can still use it. In my opinion there are two big advantages of synchronized blocks. First, you can turn existing source code which is not thread-safe into thread-safe code just by enclosing the existing statements in a synchronized block. Second, if you use real STM, as we will see in the next example, you are able to switch from real STM to the lock-based synchronized block by changing a single statement. This allows you to test both concepts and use the better one for your use case. Furthermore, you can switch from real STM to synchronized blocks if you need a transaction-unsafe function.

 

Atomic Blocks

Using atomic blocks is nearly as simple as using synchronized blocks. We just have to add the according block which encloses the critical statements. But additionally we must remove the “cout” call now, as it is not transaction-safe. So we have to change the output, for example by moving it behind the thread execution, or by writing into a buffer during thread execution and printing this buffer in a parallel task.

#include <chrono>
#include <thread>
#include <vector>

int x, y;

using namespace std::chrono_literals;

void ExecuteAtomic()
{
	atomic_noexcept{
		x++;
		y--;

		std::this_thread::sleep_for(100ns);
	}
}

int main()
{
	x = 0;
	y = 100;

	std::vector<std::thread> threads(5);

	for (auto& thread : threads)
	{
		thread = std::thread([] { for (int i = 0; i < 20; ++i) ExecuteAtomic(); });
	}

	for (auto& thread : threads)
	{
		thread.join();
	}

	return 0;
}

 

Atomic blocks exist in three variations which differ in exception handling:

  • atomic_noexcept: If an exception occurs, std::abort is called and the application will be aborted.
  • atomic_cancel: If a transaction-safe exception occurs, the transaction will be canceled and rolled back and the exception is rethrown.
  • atomic_commit: If a transaction-safe exception occurs, the transaction will be committed and the exception is rethrown.

 

Summary

Software Transactional Memory is a performant and easy-to-use concept to solve data races in multithreaded applications. STM may become part of a future C++ standard; in the meantime, you can use existing third-party libraries to add STM features.


std::atomic

Within multithreaded applications even trivial operations like reading and writing values may create issues. The std::atomic template can be used in such situations. Within this article I want to give a short introduction to this topic.

 

Atomic operations

Let’s start with an easy example. Within a loop we increase a value using the “++” operator. The loop is executed by several threads.

#include <iostream>
#include <thread>

int value;

void increase()
{
	for (int i = 0; i < 100000; i++)
	{
		value++;
	}
}

int main()
{
	std::thread thread1(increase);
	std::thread thread2(increase);
	std::thread thread3(increase);
	std::thread thread4(increase);
	std::thread thread5(increase);

	thread1.join();
	thread2.join();
	thread3.join();
	thread4.join();
	thread5.join();

	std::cout << value << '\n';

  return 0;
}

 

We expect an output of “500000”, but unfortunately the output is below this value and differs on each execution of the application.

That’s because the “++” operation is not atomic. Simplified, this operation consists of three steps: read the value, increase it and write it back. If these steps are executed in parallel, several threads may read the same value and increase it by one. Therefore, the result is much lower than expected.

The same issue may occur with simple read and write operations too. For example, thread 1 sets a value and thread 2 reads it. In case the value type does not fit into a processor word, even such trivial read and write operations are not atomic. Accessing a 64-bit variable on a 32-bit system may result in torn, incomplete values.

In such cases we need a synchronization mechanism to prevent parallel execution. Of course, we could use a standard lock mechanism, or we could use the std::atomic template. This template defines an atomic type and guarantees the atomicity of the operations on this type. To adapt the previous example, we just have to change the data type.

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> value;

void increase()
{
	for (int i = 0; i < 100000; i++)
	{
		value++;
	}
}

int main()
{
	std::thread thread1(increase);
	std::thread thread2(increase);
	std::thread thread3(increase);
	std::thread thread4(increase);
	std::thread thread5(increase);

	thread1.join();
	thread2.join();
	thread3.join();
	thread4.join();
	thread5.join();

	std::cout << value << '\n';

	return 0;
}

 

Now the result is as expected. The output of the application is “500000”.

 

std::atomic

As mentioned before, the std::atomic template defines an atomic type. If one thread writes to an atomic object while another thread reads from it, the behavior is well defined. Each operation has guaranteed atomicity. Furthermore, reads and writes to several different objects can have a sequential consistency guarantee, depending on the selected memory model. We will see according examples later.

 

Read and write values

A common scenario in multithreaded applications is the parallel reading and writing of variables by different threads. Let’s create a simple example with two threads, one writing variable values and the other reading them.

#include <iostream>
#include <thread>

int x;
int y;

void write()
{
	x = 10;
	y = 20;
}

void read()
{
	std::cout << y << '\n';
	std::cout << x << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

  return 0;
}

 

With this kind of implementation, the behavior is undefined. Depending on the order of the threads the output may be “20/10”, “0/0”, “0/10” or “20/0”. But besides these expected results, it may happen that a read is done during a write and therefore an incompletely written value is used. In this example that should not happen, but as described before, depending on the processor and value types it may. Therefore, we can say that the behavior of the application is undefined.

By using the std::atomic template we can change the undefined behavior into a defined one. We just have to define the variables as atomic and use the according read and write functions.

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> x;
std::atomic<int> y;

void write()
{
	x.store(10);
	y.store(20);
}

void read()
{
	std::cout << y.load() << '\n';
	std::cout << x.load() << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

  return 0;
}

 

Now, the behavior of the application is well defined. Independent of the used processor and of the data type (we could change from “int” to any other type), we have a defined set of results. Depending on the order of the threads the output may be “20/10”, “0/0” or “0/10”. The output “20/0” is not possible because the default mode for atomic loads and stores enforces sequential consistency. That means, within this example, x will always be written before y is changed. Therefore, the output “0/10” is possible but not “20/0”. As the std::atomic template ensures atomic execution of the read and write functions, we don’t have to fear undefined behavior due to incomplete data updates. So we have a defined behavior with three possible results.

 

Memory model

As mentioned before, the selected memory model changes the behavior of atomic read and write functions. By default, the memory model ensures sequential consistency. This may be expensive because the compiler is likely to emit memory barriers between every access. If your application or algorithm does not need sequential consistency, you can choose a more relaxed memory model.

For example, if it is fine to get “20/0” as a result in our application, we can set the memory order to “memory_order_relaxed”. This removes the synchronization and ordering constraints, but each operation’s atomicity is still guaranteed.

void write()
{
	x.store(10, std::memory_order_relaxed);
	y.store(20, std::memory_order_relaxed);
}

void read()
{
	std::cout << y.load(std::memory_order_relaxed) << '\n';
	std::cout << x.load(std::memory_order_relaxed) << '\n';
}

 

Another interesting memory model is the Release-Acquire ordering. You can set the store functions within the first thread to “memory_order_release” and the load functions in the second thread to “memory_order_acquire”. In this case all memory writes (non-atomic and relaxed atomic) that happened before the atomic store in thread 1 are executed before load in thread 2 is executed. This takes us back to the ordered loads and stores, so “20/0” is no longer a possible output. But it does so with minimal overhead and an increased execution performance. Within this trivial example, the result is the same as with the full-blown sequential consistency. In a more complex example with several threads reading and writing all or some of the variables, the result may be different from the default sequential consistency.

void write()
{
	x.store(10, std::memory_order_release);
	y.store(20, std::memory_order_release);
}

void read()
{
	std::cout << y.load(std::memory_order_acquire) << '\n';
	std::cout << x.load(std::memory_order_acquire) << '\n';
}

 

As mentioned before, the Release-Acquire ordering ensures that all memory writes before the atomic store become visible, even non-atomic ones. So we can change the application and use a non-atomic type for x. The atomic store of y still ensures the correct write order, as long as x is only read after the acquire load has observed the release store.

int x;
std::atomic<int> y;

void write()
{
	x = 10;
	y.store(20, std::memory_order_release);
}

void read()
{
	std::cout << y.load(std::memory_order_acquire) << '\n';
	std::cout << x << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

	return 0;
}

 

But again, in a more complex scenario with several threads, or if x may be read before the acquire load has seen the release store, it is necessary to define x as atomic to avoid a data race.

 

Synchronization of threads

A common implementation technique for thread synchronization is the use of locks. By using the atomic template, you can create such thread synchronization without locks. Depending on the selected memory order this may increase the performance of your application.

The following application shows an example of how to use the atomic template to synchronize the execution of two threads. Thread 2 waits until thread 1 has finished its work. This is done by using a flag variable which contains the execution state. Furthermore, thread 1 publishes its result in a data variable. The Release-Acquire ordering of the atomic template ensures that the data is written before the synchronization flag is set.

int data;
std::atomic<bool> ready{false};

void write()
{
	data = 10;
	ready.store(true, std::memory_order_release);
}

void read()
{
	// wait until thread 1 has published its result
	while (!ready.load(std::memory_order_acquire))
	{
	}

	std::cout << data << '\n';
}

int main()
{
	std::thread thread1(write);
	std::thread thread2(read);

	thread1.join();
	thread2.join();

	return 0;
}

 

Summary

This article gave a short introduction to the std::atomic template based on some common use cases. The examples highlight some major issues regarding data access in multithreaded applications and introduce corresponding implementations based on the atomic template. Of course, this article is an introduction only. The atomic template offers more features; for example, there exist more memory orders besides the three shown within this article.
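One example of such a further feature are the atomic read-modify-write operations like fetch_add. The following sketch (names and thread count chosen for illustration, not taken from the article) uses it as a lock-free counter; with a plain non-atomic ++, concurrent increments could be lost:

```cpp
#include <atomic>
#include <thread>

std::atomic<int> counter{0};

void countUp()
{
	for (int i = 0; i < 1000; ++i)
	{
		// atomic read-modify-write: no increment is lost,
		// even when both threads run in parallel
		counter.fetch_add(1, std::memory_order_relaxed);
	}
}

int runCounters()
{
	std::thread thread1(countUp);
	std::thread thread2(countUp);

	thread1.join();
	thread2.join();

	return counter.load();
}
```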

Posted in C++ | Leave a comment

Deconstruction in C# 7

C# 7 introduces a nice syntax to deconstruct a class and access all its members. If you want to support deconstruction for an object, you just have to write a “Deconstruct” method which contains one or more out parameters used to assign class properties to variables. Deconstruction is also supported by the new tuple struct introduced with C# 7. The following example shows the deconstruction feature used for a tuple and for a custom class.

static void Main(string[] args)
{
  (int x, int y) = GetTuple();

  (double width, double height) = GetRectangle();

  Rectangle rect = new Rectangle(1, 2);
  (double a, double b) = rect;
}

private static (int x, int y) GetTuple()
{
  return (1, 2);
}

private static Rectangle GetRectangle()
{
  return new Rectangle(5, 8);
}

class Rectangle
{
  public Rectangle(double width, double height)
  {
    Width = width;
    Height = height;
  }

  public double Width { get; }
  public double Height { get; }

  public void Deconstruct(out double width, out double height)
  {
    width = Width;
    height = Height;
  }
}

 

The assignment of the class properties to the variables is done by position. So, the variable names and the class property names or tuple element names do not have to match. Furthermore, it is possible to write several “Deconstruct” methods with different numbers of elements.

At first glance this looks like a nice feature which increases the readability of the source code. But it comes with a big issue. As the deconstruction is done by element position only, it is a source of errors. If you switch elements by mistake, you create an issue which cannot be found by the compiler if the types are equal. This may cost you a lot of debugging time because such a mistake isn’t easy to find. The following code shows this disadvantage. Both assignments are valid, but the second one silently mixes up the values.

static void Main(string[] args)
{
  (double width, double height) = GetRectangle();

  // elements swapped by mistake – this also compiles (in its own scope),
  // but "height" then holds the width and vice versa:
  // (double height, double width) = GetRectangle();
}

 

Furthermore, you can get into trouble in case someone changes a class you are using. For example, several class properties get changed as the result of a refactoring, so their names and meanings change. If you use this class within your code and access the class properties directly, the compiler will show you corresponding error messages, as the property names have changed. But if you use deconstruction and the number and types of the parameters haven’t changed, you will get into trouble. Names and meanings of class elements have changed, but the deconstruction method only considers the number and types of parameters. So, neither the compiler nor you will suspect any issues if you use the refactored class. But at runtime you can expect some unpleasant surprises, followed by hours of debugging and bug fixing.

Let’s look at the example with the rectangle. The developer of this class has done a refactoring and now the rectangle is represented as a vector with an angle and a length. But unfortunately, the deconstruction interface hasn’t changed, so you do not get any compilation errors.

static void Main(string[] args)
{
  (double width, double height) = GetRectangle();
}

private static Rectangle GetRectangle()
{
  return new Rectangle(5, 8);
}

class Rectangle
{
  public Rectangle(double vectorAngle, double vectorLength)
  {
    VectorAngle = vectorAngle;
    VectorLength = vectorLength;
  }

  public double VectorAngle { get; }
  public double VectorLength { get; }

  public void Deconstruct(out double vectorAngle, out double vectorLength)
  {
    vectorAngle = VectorAngle;
    vectorLength = VectorLength;
  }
}

 

In my opinion the deconstruction feature comes with more disadvantages than advantages. The advantage of slightly shorter source code is way too small compared to the disadvantage of possible hard-to-find implementation errors.

Posted in C# | 1 comment

Local or nested functions in C# 7

With C# 7.0 it is possible to create functions nested within other functions. This feature is called “local functions”. The following source code shows an example.

static void Main(string[] args)
{
  int x = DoSomething();

  //int y = Calc(1, 2, 3);  // compiler error because Calc does not exist within this context
}

private static int DoSomething()
{
  int shared = 15;

  int x = Calc(1, 2, 3);

  return x;

  int Calc(int a, int b, int c)
  {
    return a * b + c + shared;
  }
}

 

The function “Calc” is nested within the function “DoSomething”. As a result, “Calc” is available in that function scope only and cannot be used outside of “DoSomething”. Inside the enclosing function, you can place the nested function wherever you want. Furthermore, the nested function has access to the variables of the enclosing function.

In my opinion local functions are a nice feature. A common software design concept is to hide internal implementation details of software components by limiting the scope of internal components. If you have functions which are needed inside a class only, you can limit their scope to this class by defining a private function. But so far this was the lowest possible scope. If you created sub-functions which were needed by a single function only, you had to publish these functions at class scope too, or use other techniques like lambda functions. With C# 7 you are now able to limit the scope of such nested functions to the scope of the enclosing function.

I like local functions because you are now able to define a proper scope. But there are two aspects which I don’t like because they may result in source code which is difficult to understand. The first is the possibility to place the nested functions wherever you want. In my opinion you should always place them at the beginning or at the end. I prefer the end, after the return statement. This increases the readability of the source code because you first have the code of the main function itself and then the code of the internal sub-functions. So, you create a clear separation between these different concerns and you get a reading order from the enclosing main function to the nested local functions. The second aspect which I don’t like is the possibility to access variables of the enclosing function. From a technical point of view this feature is fine, because the nested function is part of the enclosing function’s scope and can therefore access elements of this scope. But from a software design point of view it is a questionable feature because you can create a lot of dependencies between the enclosing function and several local functions. This can result in difficult code with a spaghetti-like data flow. An alternative is to pass variables as parameters to the local functions and create a clear separation between the different functions and their concerns. In my opinion you should therefore use this feature very wisely and only if it really comes with some advantages.

A typical use case for local functions is recursive function calls. Often you will have a public main function which should do some calculation. This function may use recursive function calls to execute the calculation. So, you will create an internal function used for these recursive calls, and this internal function often contains parameters with interim results or recursion status information. Therefore, such a function should never be part of the public interface. On the contrary, it can only be used meaningfully within the scope of the main function. Now you have the possibility to implement it according to this need and create a local function. The following example shows such a use case. Within a tree structure we will get the maximum depth of the tree.

static void Main(string[] args)
{
  TreeElement tree = new TreeElement();
  TreeElement sub1a = new TreeElement();
  TreeElement sub1b = new TreeElement();
  TreeElement sub2 = new TreeElement();

  sub1b.SubElements = new List<TreeElement>();
  sub1b.SubElements.Add(sub2);

  tree.SubElements = new List<TreeElement>();
  tree.SubElements.Add(sub1a);
  tree.SubElements.Add(sub1b);

  int depth = GetDepth(tree);
  Console.WriteLine("Depth: " + depth);
}

public class TreeElement
{
  public List<TreeElement> SubElements;
}

private static int GetDepth(TreeElement tree)
{
  return GetDepth(tree, 1);

  int GetDepth(TreeElement subTree, int actualDepth)
  {
    int maxDepth = actualDepth;

    if (subTree.SubElements == null)
    {
      return actualDepth;
    }

    foreach (TreeElement element in subTree.SubElements)
    {
      int elementDepth = GetDepth(element, actualDepth + 1);
      if (maxDepth < elementDepth)
      {
        maxDepth = elementDepth;
      }
    }

    return maxDepth;
  }
}

 

The calculation of the depth is done by recursive function calls which step into the sub-elements. If a tree leaf is reached, the recursive function returns its current depth. For this purpose, the actual depth of the sub-tree is passed as a parameter to the recursive function. Of course, you could expose the recursive function in the public interface too, but in this case the user of the interface would also see the depth parameter, which is needed internally only. You could set the default value of this parameter to “1” and explain it within the documentation, but nevertheless your interface gets dirty as it contains parameters which are necessary due to internal needs only. By using a local function, you can offer a clean interface and hide the internal implementation details. Furthermore, this internal helper function is visible within the containing function scope only and will therefore neither be visible in the public interface nor in the private interface of the surrounding class.

Posted in C# | Leave a comment

The Visitor Pattern – part 9: summary

This article is one part of a series about the Visitor pattern. Please read the previous articles to get an overview about the pattern and to get the needed basics to understand this article.

As this article is the last one of the series I want to give a short retrospective and summarize the advantages and disadvantages of the Visitor pattern.

 

Retrospective

The article series started with a short introduction of the Visitor pattern, showed its strengths, but also explained the reasons for the existing confusion and misunderstandings regarding this pattern. To remove these misunderstandings, we started with an investigation of the technical needs for this pattern and explored the difference between single dispatch and double dispatch.

Based on the technical needs we could derive the aims of the pattern. So, the article series continued with explanations and examples for the two main needs: enumeration and dispatching. A further article combined these two aspects to create a complete Visitor pattern example.

Based on this Visitor pattern example we investigated some additional questions often seen in practical applications of the pattern. We created an extended enumerator Visitor with access to container elements which are not part of the visitable objects, and we analyzed the pros and cons of a monolithic single Visitor versus small specialized Visitors.

Next, there followed articles about code quality topics. We learned how to create reusable enumerator and query visitors following the separation of concerns rule. And we have seen some examples of how the use of the Visitor pattern can lead to over engineering.
 

Based on these articles I want to summarize the advantages and disadvantages of the pattern. Additionally, as the pattern is a good candidate for over engineering, I want to name some use cases for alternative patterns.

Within the articles of this series we have seen several different implementations of the pattern. They all follow the same main principles but differ in implementation according to the practical needs of the respective use cases. Therefore, I don’t want to finish the article with a default implementation of the pattern, as there is no single default implementation. Instead, I want to summarize the implementation aspect by listing the involved components and the concepts to connect these components, and I want to give some recommendations regarding the naming of the Visitor interfaces and methods.

 

Advantages

In object-oriented implementations, data classes often inherit from base classes or interfaces. Data containers which hold collections of these data classes often store pointers to the base classes. This object-oriented concept works fine as long as no type-specific methods are needed. In case the client wants to call type-specific methods, it must cast the base class pointer to the specific type. This may result in code which is hard to understand, hard to maintain and error-prone.

The Visitor pattern resolves this issue. It allows type-safe access to the original data type without the need for casting. Furthermore, the Visitor pattern offers a traversing mechanism. As a result, the pattern allows easy access to data elements, independent of whether they are stored in a simple list or a complex container and independent of whether they are stored as their original type or as pointers to a base type.

 

Disadvantages

It isn’t easy to find the right balance between many small specialized Visitor interfaces and a few, or a single, monolithic wide Visitor interface. And both approaches come with drawbacks: many small interfaces increase the complexity of the system, and few wide interfaces lead to implementations containing empty methods.

The Visitor pattern traverses the elements of a data container. This traversing is done via double dispatch. As a result of this dispatching, the traversing is a somewhat hidden feature. A client implements the dispatching callbacks but may easily overlook that the callback is one step of a traversal over a data container. On one hand that’s fine, because the client should not have to think about the data structures, but on the other hand this may cause issues. Like in any other traversing mechanism, the client should not change the data container during traversal. So, the dispatching callback methods of the client should not change the data container itself. Furthermore, a traversing step should normally not execute long-running, complex functionality. In summary, the traversing is a hidden feature for a client developer, but it is an important software design fact which has to be respected by the client developer. This contradiction may result in bad software designs, especially if the developer isn’t that familiar with the Visitor pattern.

 

Alternatives

The Visitor pattern is a powerful but complex pattern. It unites two functionalities: type conversion by dispatching and traversing of complex data structures. But if only one of these functionalities is needed, you will find more specialized patterns which are less complex. If you only want to traverse a data structure, you should implement the Iterator pattern. And if you only need the dispatching, you may use callback mechanisms.

 

Components and Dependencies

The Visitor pattern contains the following components.

  • Data Elements
  • Data Container
  • Enumerator
  • Algorithm

The data elements are the Visitable components, and the Enumerator and Algorithm together form the Visitor component, which executes some functionality based on the data elements. As explained and shown within the article series, I prefer a clear separation of concerns. Therefore, I recommend implementing the Algorithm aspect and the Enumeration aspect in separate, independent components. Different Algorithms and different Enumerators can be loosely coupled by composition according to the use case specific needs. A strong coupling by inheritance should only be used in case there is no need to reuse any of the components.

 

Naming

At the very beginning of the article series I mentioned that, in my opinion, the Visitor pattern is one of the most misused and misunderstood patterns, and the main source of this issue is the misleading naming of the pattern interfaces and methods.

Names of methods, interfaces and components should reflect their purpose. Therefore, I recommend avoiding the naming which is used within the original Visitor pattern, as this naming is meaningless. In the following I present some naming I like to use. But of course, feel free to find your own naming.

As the name Visitor is well known and most software developers know the pattern, you should use this name anyway, even if it is generic and therefore somewhat meaningless. You should use the name but extend it with a more descriptive addition. As mentioned before, we have a separation of concerns and we may create specialized visitors. We may create visitors for the following purposes: enumeration, queries, data updates and so on. So, you should not implement an “IVisitor” interface or a “Visitor” component, but specialized ones like a “CustomerEnumerationVisitor” or an “OrderQueryVisitor”. Some may argue that implementation details should not be part of naming. That’s totally fine, but as the Visitor contains the hidden enumeration feature, it may be very useful for a client developer to know that he uses a component which is implemented as a Visitor. Therefore, in this case it is fine to add the “Visitor” suffix to the component name.

The data elements are passed to the Visitors. Therefore, I would call them “Visitable”. This results in interface names like “IVisitableCustomer” or “IVisitableOrder”.

But what about the method names? The original method names are “accept” and “visit”. To be honest, I think these names are terrible. They don’t say anything about their purpose. So, let’s think about the purpose of the methods and find some better names. The Visitor pattern is implemented via double dispatch. For this purpose, the data elements offer a method to get the instance of the data element. This getter method passes the element instance to an instance receiver method which is provided by the Visitor. The method names should reflect this dispatching process. So, let’s call the method to get the element instance “GetInstance” and call the method which receives this instance “InstanceReceiver”.
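To illustrate this naming (the classes below are hypothetical examples, not code from the article series), a visitable element and a specialized visitor could look like this:

```cpp
#include <string>
#include <utility>

class Customer;

// Visitor interface with a describing name and an intention-revealing method name
class ICustomerEnumerationVisitor
{
public:
	virtual ~ICustomerEnumerationVisitor() = default;
	virtual void InstanceReceiver(Customer& customer) = 0;
};

// Visitable element: "GetInstance" passes the typed instance to the visitor
class Customer
{
public:
	explicit Customer(std::string name) : Name(std::move(name)) {}
	std::string Name;

	void GetInstance(ICustomerEnumerationVisitor& visitor)
	{
		visitor.InstanceReceiver(*this);   // the double dispatch step
	}
};

// An example visitor which receives the typed instance without any cast
class CustomerNameQueryVisitor : public ICustomerEnumerationVisitor
{
public:
	std::string LastName;

	void InstanceReceiver(Customer& customer) override
	{
		LastName = customer.Name;
	}
};
```

The mechanics are unchanged compared to the classic pattern; only the names now state what each method actually does.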

By using such clear naming, you can avoid some of the misunderstandings and errors which result from the complexity of the Visitor pattern.

 

Conclusion

The Visitor pattern is one of the base implementation patterns in software development. So, it should be part of the toolkit of a good software engineer. But unfortunately, the Visitor pattern comes with a big disadvantage: it is one of the most misunderstood and misused patterns I know. This article series provided an extensive overview of the Visitor pattern and gives you the knowledge needed to use the Visitor pattern within your applications.

Posted in C++ | Leave a comment

The Visitor Pattern – part 8: over engineering

This article is one part of a series about the Visitor pattern. Please read the previous articles to get an overview about the pattern and to get the needed basics to understand this article.

 

Motivation

Normally, I would like to limit myself to explaining the use cases of a pattern and not waste time writing about the cases where you should not use it. But if you read about the Visitor pattern you will find a lot of statements about its purpose and advantages which are not directly related to the pattern. This may result in implementations using the Visitor pattern instead of some other pattern which is more suitable for the specific situation. Such over engineering comes with higher code complexity and therefore creates more effort and is more error-prone. Within this article I want to mention some of the statements you will find about the Visitor pattern, analyze them and maybe find better alternatives.

 

Extensibility of the elements

A lot of articles about the Visitor pattern contain the following statement: “The Visitor pattern is used to extend the functionality of data elements without changing these data elements”.

Of course, this statement isn’t wrong. But it describes a side effect of the pattern only. If this feature is all you want, you don’t have to use the Visitor pattern at all. There are better alternatives.

By using the Visitor pattern, you pass the element instance to a dispatcher which uses the data to execute some query. If you want to add some new functionality, you can create a new query and you don’t have to extend the data element. But think about an implementation without the Visitor pattern. Would you implement the data query within the data class in this case? Of course not. According to the “separation of concerns” principle, you create a separate query class. This class uses the data element or data structure to execute its functionality. Therefore, if you follow basic software design concepts like “separation of concerns”, you can extend the functionality without changing the data elements. Using the Visitor pattern is over engineering in this case.

 

Extensibility of the queries

In connection with the previous statement it is also mentioned that the Visitor allows adding new queries without changing the existing implementation. Again, this statement is true, but a side effect only. The separation between data elements, data structures, enumerators and queries is a base concept of the object-oriented paradigm and therefore a base concept in object-oriented programming languages. No additional or special implementation pattern is needed. Again, the Visitor pattern is over engineering in case this separation is the only reason for using it.

 

Use case specific Visitable methods in data elements

So far, we have implemented several examples with different visitors and visitable elements. But in all examples, we have used the same base interface: the functions “GetInstance” and “InstanceReceiver” do not have a return value and the only function parameter is the object instance.

Now we want to consider whether it is advantageous to break this strict concept and offer use case specific functions. For example, a Visitor can be used to validate all data elements. So, we could extend the “InstanceReceiver” function and add a new parameter which contains validation information. Or we could even define a return value for this method, and the element changes its internal state according to the return value, for example by setting a validation flag.

At first these use case specific interfaces may sound fine as they fulfill the specific needs. But I would not recommend such a specialization of the interface, as it comes with big disadvantages. The main purpose of the Visitor pattern is the dispatching of elements. This is a very common functionality which can be used in a lot of situations. If we mix in a use case specific interface, we limit the Visitor to exactly this single case and therefore reduce the maintainability and extensibility of the implementation. Furthermore, the strength of the Visitor interface is based on the strict separation of concerns. The involved object instances are already tightly coupled. If we extend the interface and add use case specific parameters and return values, we create an even stronger relationship between the elements and again reduce maintainability and extensibility. Furthermore, we add some “hidden” functionality. Within the example above, we added a return value to set some data within the data element after the Visitor executed something. Such object changes should be done by using setter functions or properties, not by evaluating the return value of a function whose purpose is to get the object instance. Therefore, I call it “hidden” functionality or a “side effect” of the method, as something is done which does not correspond with the main purpose of the method. Such side effects reduce maintainability and are a good source of errors.

 

High effort in case a new data element is added

One disadvantage often mentioned about the Visitor pattern is the high effort resulting from a change of the data basis. If you add a new data element, you must add it to the Visitor interface(s) and of course adapt all visitors which implement the interface. But is this a real disadvantage of the pattern? I think it isn’t. Maybe it is even an advantage.

If you have a list of elements with type-specific element interfaces and you want to evaluate or change the elements, you always have to traverse the list and implement type-specific functionality. Independent of the used pattern or way of implementation, you must adapt or extend this implementation in case a new element type is added or an existing one is removed. So, this is a use case dependent need and not a pattern-specific disadvantage.

If you use the Visitor pattern, you have the advantage of a central Visitor interface. You can add the new element type to this interface by adding a new “InstanceReceiver” function, and the compiler will show you all source code elements – all visitors – which must be adapted. If you use other implementations, like switch cases in combination with type casts, you may have to find all code locations by yourself, which is a very error-prone process.

In conclusion, this often-mentioned disadvantage of the Visitor pattern isn’t valid, because it is a use case specific need and not a pattern-specific result. On the contrary, you can say the Visitor pattern has the advantage of supporting this use case in a very easy way. You just have to change the according Visitor interface, and the compiler will do the critical work and find all code elements you have to change.

 

Outlook

The next article will finish the series with a summary of the whole topic.

Posted in C++ | Leave a comment