In the source code of almost any software application, individual strings are concatenated into one string. Implementing a string concatenation is thus something absolutely rudimentary and should belong to the repertoire of every software developer. C# offers several ways to concatenate strings, and accordingly there are many opinions about which is the right or best one. In this context I have often seen an experienced programmer lecture a C# beginner about using a StringBuilder instead of the + operator. But is this really the best solution?
In this article I would like to discuss the different ways to concatenate strings. I explain the basic differences, give guidance on when to use which option, and show a small performance comparison.
+ Operator
Strings are immutable. Concatenation using the + operator therefore always creates a new string. Consider, for example, concatenating the strings A, B and C step by step: first “A = A + B”, then “A = A + C”. B and C cannot simply be attached to A. Instead, new memory is reserved for the combined length of A and B, and the contents of both values are copied into it. The same procedure is then repeated to concatenate this intermediate result with the string C. Concatenating many strings this way therefore produces many temporary objects that consume memory unnecessarily, and the many copy operations make this variant relatively slow. Note that the C# compiler translates a single expression such as “A = A + B + C” into one String.Concat call, so the intermediate strings only arise when the concatenation is spread over several statements, for example in a loop.
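The behaviour described above can be made visible with a small sketch. The class and method names (ImmutabilityDemo, JoinWithPlus) are invented for this illustration and are not part of the test program used later in the article.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Joins the parts with repeated + concatenation.
    // Every pass through the loop allocates a new string and copies
    // the previous result into it.
    public static string JoinWithPlus(string[] parts)
    {
        string result = "";
        foreach (string part in parts)
        {
            result = result + part;
        }
        return result;
    }

    public static void Main()
    {
        string a = "Hello";
        string original = a;   // second reference to the same string object

        a = a + " World";      // allocates a new string; the old one is untouched

        Console.WriteLine(original);                              // Hello
        Console.WriteLine(a);                                     // Hello World
        Console.WriteLine(object.ReferenceEquals(original, a));   // False

        Console.WriteLine(JoinWithPlus(new[] { "A", "B", "C" })); // ABC
    }
}
```

The variable original still refers to the unchanged "Hello", which shows that the + operator did not modify the existing string but created a new one.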
String.Concat
The String.Concat function can also be used to concatenate strings. To do so, you pass all strings as function parameters. The above example can therefore be written as follows: “A = String.Concat(A, B, C)”. During execution, the function first determines the total length of the new string from the lengths of all values passed. Then memory is allocated only once and the strings are copied into it. The functionality is thus similar to that of the + operator, with the crucial difference that the many intermediate strings are eliminated and new memory is allocated only once.
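As a brief illustration, the following sketch calls two overloads of String.Concat: one taking three strings and one taking a string array. The wrapper class ConcatDemo and its method names are assumptions made for this example only.

```csharp
using System;

public static class ConcatDemo
{
    // Concatenates three strings in a single call.
    public static string Combine(string a, string b, string c)
    {
        return string.Concat(a, b, c);
    }

    // Concatenates any number of strings passed as an array.
    public static string CombineAll(string[] parts)
    {
        return string.Concat(parts);
    }

    public static void Main()
    {
        string a = "A";
        a = Combine(a, "B", "C");
        Console.WriteLine(a);                                    // ABC

        Console.WriteLine(CombineAll(new[] { "x", "y", "z" }));  // xyz

        // A null argument is treated like an empty string.
        Console.WriteLine(Combine("x", null, "z"));              // xz
    }
}
```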
StringBuilder
The StringBuilder class manages its own memory for building strings. For this purpose, the class maintains an internal character buffer, which is gradually filled with the appended values. The initial capacity of the buffer can be specified. When strings are added, the buffer is enlarged automatically if necessary. Like the String.Concat function, the StringBuilder class has the advantage that memory does not have to be allocated for every single concatenation, and one instance can be used for multiple concatenations. However, the resulting gain in performance may be reduced by the necessary internal administrative operations. This can even result in a decrease in execution speed.
StringBuilder and String.Concat differ in one crucial point. When the String.Concat function is used, all strings to be concatenated must be passed as parameters. With the StringBuilder class, these values can be added step by step by calling the Append method. This is also the main use case of the class. StringBuilder should only be used when multiple values must be concatenated but cannot be specified in a single function call. Otherwise, the String.Concat function is usually preferable.
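A typical case of such step-by-step concatenation is a loop in which the number of values is not known in advance. The following is a minimal sketch; the class BuilderDemo, the method BuildList and the initial capacity of 64 are chosen arbitrarily for illustration.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // Builds a comma-separated list by appending one item at a time.
    public static string BuildList(string[] items)
    {
        // 64 is an assumed initial capacity; the buffer grows if needed.
        StringBuilder builder = new StringBuilder(64);

        for (int i = 0; i < items.Length; i++)
        {
            if (i > 0)
            {
                builder.Append(", ");
            }
            builder.Append(items[i]);
        }

        // The final string is created only once, here.
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildList(new[] { "a", "b", "c" }));  // a, b, c
    }
}
```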
Performance comparison with fixed values
After the theory, we now run a little practical test to compare the performance of the individual functions. In a first test, four fixed strings are concatenated. This is done 10 million times. The execution time is measured with the Stopwatch class. The code below shows a shortened version of the test program. For readability, the code for time measurement and result output has been removed.
    int i;
    int count = 10000000;
    string value = null;
    StringBuilder builder = null;

    // Test 1: + operator
    for (i = 0; i < count; i++)
    {
        value = "Hello" + " " + "World" + "!";
    }

    // Test 2: StringBuilder (a new instance in every iteration)
    for (i = 0; i < count; i++)
    {
        builder = new StringBuilder(25);

        builder.Append("Hello");
        builder.Append(" ");
        builder.Append("World");
        builder.Append("!");

        value = builder.ToString();
    }

    // Test 3: String.Concat
    for (i = 0; i < count; i++)
    {
        value = string.Concat("Hello", " ", "World", "!");
    }
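The timing code is omitted from the listing. As a rough sketch of how such a measurement could look with the Stopwatch class, consider the following helper. The class TimingSketch and the method Measure are hypothetical and not taken from the original test program; the delegate call also adds some overhead of its own, so absolute numbers will differ from the ones reported in this article.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the given action 'count' times and returns the elapsed seconds.
    public static double Measure(int count, Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            body();
        }
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        string value = null;

        // A smaller count than in the article, to keep the sketch quick.
        double seconds = Measure(1000000,
            () => value = string.Concat("Hello", " ", "World", "!"));

        Console.WriteLine("String.Concat: {0:F3} seconds", seconds);
        Console.WriteLine(value);
    }
}
```

To get a mean value as in the article, the measurement could be repeated several times and the results averaged.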
Please recall the theory from the preceding sections. Which order of the functions, in terms of execution speed, do you expect in this test?
Because it needs only a few copy operations, I would expect the String.Concat function to be the fastest. StringBuilder should be slower because of the higher administrative overhead in the class. The + operator is likely to end up in the middle or in last place, depending on whether the management overhead of the StringBuilder class or its smaller number of copy operations has the larger effect.
The following measurements were taken during program execution (mean value of 10 measurements):
- + Operator: 0.022 seconds
- StringBuilder: 0.697 seconds
- String.Concat: 0.392 seconds
As expected, String.Concat is faster than StringBuilder in this example with few strings. But what about the + operator? It is many times faster than the other two functions. Based on the theory, its execution time should be higher than that of String.Concat because of the many copy operations.
The explanation for the short execution time of the + operator is simple. The compiler performs code optimization. It recognizes that fixed strings are being concatenated and immediately replaces them with the corresponding combined string. An analysis of the IL code shows that the compiler converts the expression "Hello" + " " + "World" + "!" directly into "Hello World!". A concatenation of fixed strings is thus already replaced by the concatenated string at compile time. This explains the high execution speed in the comparison test.
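This compile-time behaviour can also be observed without looking at the IL code. In the sketch below, the concatenation is assigned to a const field, which the compiler only accepts if it can evaluate the value at compile time. The class name ConstantFoldingDemo is made up for this example.

```csharp
using System;

public static class ConstantFoldingDemo
{
    // This compiles only because the compiler can evaluate the whole
    // expression at compile time: a const must be a compile-time constant.
    public const string Greeting = "Hello" + " " + "World" + "!";

    public static void Main()
    {
        Console.WriteLine(Greeting);  // Hello World!

        // The folded expression and the plain literal usually refer to the
        // same string object, because equal literals within one assembly
        // are shared.
        string folded = "Hello" + " " + "World" + "!";
        Console.WriteLine(object.ReferenceEquals(folded, "Hello World!"));
    }
}
```

If one of the operands were a variable instead of a literal, the const declaration would no longer compile, because the value could only be determined at run time.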
Performance comparison with variable values
After the first test and the compiler optimization it revealed, the question naturally arises how the execution times behave when variable strings are used. The test program was therefore modified, and variable string values were added to each concatenation. The code below shows the test program, again in a shortened version without timer and output. The variables value1 and value2 are function parameters. For the measurement, the function was called with the values “abc” and “def”.
    int i;
    int count = 10000000;
    string value = null;
    StringBuilder builder = null;

    // Test 1: + operator
    for (i = 0; i < count; i++)
    {
        value = "Hello" + value1 + " " + value2 +
            "World" + value1 + "!" + value2;
    }

    // Test 2: StringBuilder (a new instance in every iteration)
    for (i = 0; i < count; i++)
    {
        builder = new StringBuilder(25);

        builder.Append("Hello");
        builder.Append(value1);
        builder.Append(" ");
        builder.Append(value2);
        builder.Append("World");
        builder.Append(value1);
        builder.Append("!");
        builder.Append(value2);

        value = builder.ToString();
    }

    // Test 3: String.Concat
    for (i = 0; i < count; i++)
    {
        value = string.Concat(
            "Hello", value1, " ", value2,
            "World", value1, "!", value2);
    }
The following measurements were taken during program execution (mean value of 10 measurements):
- + Operator: 1.568 seconds
- StringBuilder: 1.052 seconds
- String.Concat: 1.557 seconds
Surprisingly, the StringBuilder has taken first place. This class seems to work very efficiently, so the internal management overhead is already compensated in this example, where only a few strings have to be concatenated. The + operator and the String.Concat function are now practically level, with a difference of only 0.011 seconds. This fits the fact that the compiler translates a + expression with variable string operands into a String.Concat call.
Summary
In C# there are several ways to concatenate strings. None of the available options is generally to be preferred. There are better or worse solutions, depending on the actual use case.
The following best practices will help to select the best solution for an individual use case:
Use Case 1: Only a few strings should be concatenated, and memory consumption and performance play a subordinate role.
You can use the following rule: readability is more important than performance and memory consumption. You should choose the function that produces the most readable source code.
Use Case 2: Many strings should be concatenated, so the choice has an impact on performance and memory consumption. Here, the selection of the right function should depend on the kind of values that are used.
- Concatenation of fixed values: use the + operator
- Concatenation of variable values: use String.Concat
- Successive concatenation of values which cannot or should not be done in a single function call: use StringBuilder
Nice Article
    for (i = 0; i < count; i++)
    {
        builder = new StringBuilder(25);
If you move the creation of the StringBuilder out of the loop, the result will be different.
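One way to read this suggestion is to create the StringBuilder once and empty it at the start of each iteration. The following is only a sketch of that idea, assuming the Clear method available since .NET Framework 4; the class ReuseBuilderDemo and the method Build are hypothetical, and no timings were measured for this variant.

```csharp
using System;
using System.Text;

public static class ReuseBuilderDemo
{
    // Builds the test string using a StringBuilder supplied by the caller,
    // so that no new StringBuilder object is created per call.
    public static string Build(StringBuilder builder, string value1, string value2)
    {
        builder.Clear();  // discard the content of the previous iteration

        builder.Append("Hello");
        builder.Append(value1);
        builder.Append(" ");
        builder.Append(value2);
        builder.Append("World");
        builder.Append(value1);
        builder.Append("!");
        builder.Append(value2);

        return builder.ToString();
    }

    public static void Main()
    {
        // Created once, outside the loop.
        StringBuilder builder = new StringBuilder(25);
        string value = null;

        for (int i = 0; i < 1000; i++)
        {
            value = Build(builder, "abc", "def");
        }

        Console.WriteLine(value);  // Helloabc defWorldabc!def
    }
}
```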