This article gives a short overview of the new tuple features in C# 7. C# already had a Tuple class, but the new tuple changes some technical details and adds some syntactic sugar. The new tuple syntax makes the source code more readable and integrates the technical advantages of tuples into the language. From a software design point of view, however, tuples still have some well-known disadvantages. This article therefore introduces the new tuple and its syntax, shows the technical background, discusses the pros and cons, and gives hints on when to use and when to avoid tuples.
Tuple syntax
Let’s start with a basic example. We want to implement a function which may fail for several reasons. As we expect such failures, an incomplete execution is a normal case which we want to evaluate and react to, for example by repeating the method call. So, we don’t want to throw exceptions but return an error code instead. Besides this error code, the function returns a result. The function therefore has two outputs: the function result and the error code as additional execution information. A common design pattern is to use an out parameter for the additional information.
static void Main(string[] args)
{
    int errorCode;
    var x = DoSomething(out errorCode);
    if (errorCode == 0)
    {
        double y = x * 5;
    }
}

private static double DoSomething(out int errorCode)
{
    errorCode = 0;
    return 4.2;
}
This common design pattern has one major disadvantage: it creates a complex data flow. The standard design of a function has a straightforward data flow: one or more input parameters are passed on the method call, and one function result is assigned to a result variable. Output parameters make this data flow considerably more complex, because values now leave the function through the parameter list as well. Wouldn’t it be nice to return all outputs as the function result and get back to the straightforward data flow? This can be done with tuples. The following source code shows the adapted example.
static void Main(string[] args)
{
    var x = DoSomething();
    // Item1 is the result (double), Item2 is the error code (int).
    if (x.Item2 == 0)
    {
        double y = x.Item1 * 5;
    }
}

private static (double, int) DoSomething()
{
    return (4.2, 0);
}
This source code uses the new tuple syntax, so we don’t have to explicitly create a tuple instance. In the function declaration we can list several return values and the compiler will create the corresponding tuple. When we call the function, we can assign the result to a variable and access the tuple elements through the fields “Item1” to “ItemX”, numbered in declaration order.
This is a nice first step in the right direction. But it is annoying to access the tuple elements through generic fields named “ItemX”. Of course, the new tuple syntax takes care of this aspect and allows us to name the return values.
static void Main(string[] args)
{
    var x = DoSomething();
    if (x.errorCode == 0)
    {
        double y = x.result * 5;
    }
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}
These named elements are a big improvement for code quality and greatly increase readability. But there is still one aspect which bothers me. In the function declaration I don’t have to spell out a tuple type but just write down a list of return values. On the function call, however, I have to switch my thinking and work with a tuple which holds the return values as members. Wouldn’t it be nice to use a value list on the function call too? Of course it would, and fortunately this is supported by the new syntax.
static void Main(string[] args)
{
    (double result, int errorCode) = DoSomething();
    if (errorCode == 0)
    {
        double y = result * 5;
    }
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}
This final version of the function implementation shows the big strength of the new syntax. We can implement a function with several return values and a straightforward data flow. Furthermore, the readability of the source code improves considerably, as the technical details behind this concept are hidden and the source code focuses on functionality only.
Deconstruction of Tuples
As we have seen, on a function call we can assign the function return values directly to corresponding variables. This is possible because the returned tuple is deconstructed and the tuple elements are assigned to the variables. A previous article gives an introduction to the deconstruction feature, which was also introduced with C# 7.
As mentioned in the linked article, this deconstruction comes with a big issue: you can easily mix up return values if they have the same type. In this example the compiler will report an error if you swap the variables, because there is no implicit conversion from double to int. But if the elements have the same type, a swap goes unnoticed. Later on, we will think about suitable use cases for tuples and evaluate this issue in the context of these use cases.
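The mix-up risk can be illustrated with a minimal sketch. The class `DeconstructionMixUp` and the method `Compute` are hypothetical names chosen for this illustration; the only assumption is that both tuple elements share the type int.

```csharp
using System;

public static class DeconstructionMixUp
{
    // Hypothetical method: both tuple elements have the same type.
    public static (int result, int errorCode) Compute()
    {
        return (42, 0);
    }

    public static void Main()
    {
        // Intended order: the variables match the declared element order.
        (int result, int errorCode) = Compute();
        Console.WriteLine($"result={result}, errorCode={errorCode}");

        // Swapped order: this still compiles, because deconstruction
        // assigns by position and ignores the declared element names.
        (int swappedErrorCode, int swappedResult) = Compute();
        Console.WriteLine($"result={swappedResult}, errorCode={swappedErrorCode}");
    }
}
```

The second call silently treats the result 42 as the error code, which is exactly the kind of error that is hard to find later.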
Discard of Tuple parameters
C# 7 allows you to discard return values and out parameters. The underscore character is used as a wildcard for these unneeded values.
The following source code shows how to use the discard character to ignore one of the return values.
static void Main(string[] args)
{
    (double result, _) = DoSomething();
    double y = result * 5;
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}
As mentioned in the linked article, you should use this feature in rare cases only. Ignoring a function result indicates that there is either a design issue in the function interface or something wrong in the code which uses the interface.
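Discards work for out parameters in the same way. The following sketch uses the framework method `int.TryParse`; the class `DiscardExample` and the helper `IsInteger` are hypothetical names for this illustration.

```csharp
using System;

public static class DiscardExample
{
    // Hypothetical helper: reports only whether the text is an integer.
    public static bool IsInteger(string text)
    {
        // The parsed value is not needed here, so the out parameter is discarded.
        return int.TryParse(text, out _);
    }

    public static void Main()
    {
        Console.WriteLine(IsInteger("42"));   // True
        Console.WriteLine(IsInteger("4x2"));  // False
    }
}
```

This is one of the rare cases where a discard is reasonable: the interface of `TryParse` is given, and the caller is interested in the validity check only.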
When to use a tuple
As we have seen so far, tuples come with pros and cons, so we should not use them thoughtlessly. There are many situations where tuples are not the best choice. In my opinion there are only a few use cases where tuples should be used.
Sometimes you want to return more than one value from a method. This is a common use case, but in a good interface design it should be a rare one. Functions should do one thing only and therefore have one result only. But sometimes, in addition to the result (what the function has done), it may be necessary to return execution information (how the method has been executed). Examples are error codes, performance measurement data or statistical data. This execution information contains very technical data about internal operations. With such an interface the component becomes a gray box, as it exposes internal details and expects the user to react correctly to this internal information.
As detail hiding is one of the basic object-oriented concepts, such interfaces should be avoided. You want easy-to-use interfaces, and your component should be a black box. In my opinion this is mandatory for high-level interfaces.
On low-level interfaces you may have a slightly different situation and explicitly want to get detailed information about the component you use. You want to use it as a gray box or even as a white box. Furthermore, for performance reasons you may break some object-oriented rules or even mix object-oriented paradigms with functional and procedural paradigms. In such low-level interfaces it is very common to have methods with several return values. But even in these cases I recommend having one result only (what the function has done) plus one or more pieces of internal information about how it was done.
The options so far to implement several return values were less than optimal.
- Out parameters: clunky to use, and they create a method interface with a complicated data flow, as input and output streams are mixed up
- Custom-built transport type for every method: a lot of code overhead for a type which is used by one method only, as a temporary object to group a few values
- Anonymous types: high performance overhead and no static type checking
- System.Tuple: the best choice so far, but it requires allocating an object and results in code which needs comments to be easily readable (you must explain the tuple elements)
With C# 7.0 we are now able to use the new tuple with its nice syntax inspired by functional programming. It makes out parameters unnecessary in many cases. Compared with the existing Tuple class, this new tuple isn’t syntactic sugar only but also a technical improvement. We will analyze these technical details later.
You are now able to bundle the different return values into one tuple and use it as the return value. This creates a loose coupling between these values. Furthermore, the coupling should exist in a small context only. The tuple is a temporary object with a short lifetime. A common pattern is to construct, return and immediately deconstruct tuples.
In summary, I would give the following guidelines on how to use tuples:
Methods in high-level interfaces should return the result only. As they hide internal details, there is no need for additional output. Low-level methods may return additional execution information. In this situation a tuple can be used: the return value and the execution information are coupled within the tuple. This loose coupling exists for data transfer only, as the tuple has a short lifetime and is constructed, returned and immediately deconstructed.
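The following sketch shows this guideline for a low-level method. The class `LowLevelSum`, the method `SumNonNegative` and the idea of reporting skipped entries as execution information are assumptions made for this illustration only.

```csharp
using System;

public static class LowLevelSum
{
    // Hypothetical low-level method: returns the result (the sum) together
    // with execution information (how many negative entries were skipped).
    public static (double sum, int skippedCount) SumNonNegative(double[] values)
    {
        double sum = 0;
        int skippedCount = 0;
        foreach (double value in values)
        {
            if (value < 0)
            {
                skippedCount++;
                continue;
            }
            sum += value;
        }
        // The tuple is constructed only to transport both values.
        return (sum, skippedCount);
    }

    public static void Main()
    {
        // The tuple is deconstructed immediately and never stored.
        (double sum, int skippedCount) = SumNonNegative(new[] { 1.5, -2.0, 3.0 });
        Console.WriteLine($"sum={sum}, skipped={skippedCount}");
    }
}
```

The tuple lives only for the duration of the call, so the coupling between the sum and the statistics does not spread into the calling code.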
As we have already seen earlier in the article, the biggest disadvantage of tuples is the risk of mixing up elements and creating hard-to-find errors. If we use tuples in low-level interfaces only, we can mitigate this disadvantage. Low-level components are gray or white boxes, and the user has detailed knowledge about the internals of such components. This reduces the risk of mixing up elements and increases the chance of finding such errors quickly. So, if we use tuples in these use cases only, the advantage of readable source code outweighs the risk of mixing up elements.
Internals of tuple
As seen so far, the new tuple comes with a nice syntax. The syntax itself is syntactic sugar which helps to write clean code. So, let’s have a look behind the syntax and see what is done in the background. We use the example with the named tuple elements and with deconstruction of the tuple.
static void Main(string[] args)
{
    (double result, int errorCode) = DoSomething();
    if (errorCode == 0)
    {
        double y = result * 5;
    }
}

private static (double result, int errorCode) DoSomething()
{
    return (4.2, 0);
}
If we decompile the created application, we will see the following code (decompiled with JetBrains dotPeek).
private static void Main(string[] args)
{
    ValueTuple<double, int> valueTuple = Program.DoSomething();
    double num1 = valueTuple.Item1;
    if (valueTuple.Item2 != 0)
        return;
    double num2 = num1 * 5.0;
}

[return: TupleElementNames(new string[] { "result", "errorCode" })]
private static ValueTuple<double, int> DoSomething()
{
    return new ValueTuple<double, int>(4.2, 0);
}
This code shows some very important technical details. The tuple used is of the new type “ValueTuple” and not the long-existing “Tuple”. “ValueTuple” is a struct, that is a value type, and exposes its elements as public fields. “Tuple”, in contrast, is a class, that is a reference type, and exposes its elements as read-only properties. “ValueTuple” is a lightweight type which performs better in nearly all cases, as no object has to be allocated. One exception is assignment: as the whole content of a struct must be copied, this is more expensive than a class assignment, which copies the object reference only. So, if you use a tuple with many elements and have to copy it very often, you might prefer the Tuple class over the new ValueTuple struct. But as described in the previous section, I would not recommend such a software design. Tuples should be used as temporary transport containers within a small context. If you need a long-living instance, you should use your own data class instead of a tuple.
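The difference between the two copy behaviors can be made visible with a small sketch. The class `CopySemantics` and its method names are hypothetical; the sketch assumes a runtime where the `System.ValueTuple` types are available (on older .NET Framework versions they come from a separate package).

```csharp
using System;

public static class CopySemantics
{
    // ValueTuple is a struct: assignment copies all fields.
    public static (int originalCode, int copyCode) MutateCopy()
    {
        (double result, int errorCode) original = (4.2, 0);
        var copy = original;   // copies both fields
        copy.errorCode = 1;    // changes the copy only
        return (original.errorCode, copy.errorCode);
    }

    // System.Tuple is a class: assignment copies the reference only.
    public static bool SharesReference()
    {
        Tuple<double, int> first = Tuple.Create(4.2, 0);
        Tuple<double, int> second = first;
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        var codes = MutateCopy();
        Console.WriteLine(codes.originalCode); // 0
        Console.WriteLine(codes.copyCode);     // 1
        Console.WriteLine(SharesReference());  // True
    }
}
```

The struct copy grows with the number of elements, while the class assignment always copies one reference, which is the trade-off described above.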
The second fact we see in the decompiled code is the naming mechanism of the tuple elements. The names are used by the compiler and the IDE only. They increase readability, but they are not part of the ValueTuple type itself: the compiled code accesses “Item1” and “Item2”, and the names survive only as metadata in the “TupleElementNames” attribute on the method signature. So, you don’t have to fear expensive string comparisons if you use named tuple elements.
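A short sketch can confirm that the element names do not change the underlying type. The class name `TupleNames` and the element names `number` and `code` are hypothetical and chosen for this illustration.

```csharp
using System;

public static class TupleNames
{
    public static (double result, int errorCode) DoSomething()
    {
        return (4.2, 0);
    }

    public static void Main()
    {
        var named = DoSomething();

        // The runtime type carries no element names.
        Console.WriteLine(named.GetType() == typeof(ValueTuple<double, int>)); // True

        // The generic fields stay available next to the declared names.
        Console.WriteLine(named.Item2 == named.errorCode); // True

        // A tuple with other element names has the same underlying type,
        // so this assignment compiles without any conversion.
        (double number, int code) renamed = named;
        Console.WriteLine(renamed.code); // 0
    }
}
```

As the names exist at compile time only, two tuples with the same element types are interchangeable regardless of how their elements are named.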
Tuple as function parameter
A tuple is a normal value type. So, you can pass a tuple as a parameter to a function.
static void Main(string[] args)
{
    DoSomething((4.2, 1));
}

private static void DoSomething(ValueTuple<double, int> x)
{
    double y = x.Item1 * x.Item2;
}
But should we use a tuple as a function parameter just because it is technically possible? I don’t think so. From a software design point of view, I don’t see a use case for this feature.
Summary
The new tuple introduced with C# 7, in combination with the new tuple syntax, is a nice and powerful improvement of the C# language. It offers an efficient way to implement methods with several return values.