functor vs. named lambda vs. anonymous lambda

A lambda expression is a mechanism for specifying a function. The primary use for a lambda is to specify a simple action. A lambda expression can be implemented directly as anonymous function, for example within an algorithm method, or it can be named and used by its name.

Of course, there are already some other possibilities to implement functions. Therefore, we may think about the question when we should use lambdas and when we should not use them. Furthermore, we have to decide between named and anonymous lambdas. Following I will give you some guidelines which will help do find a decision.

This comparison is based on a simple use case: we want to sort a list of signed integers by looking at the absolute values and we want to sort in descending order. There exists an STL algorithm for sorting. So, we use this algorithm and provide a comparison function for two elements. At first, we implement this function by using a functor.

struct AbsoluteDescendingComparer
{
  bool operator()(int a, int b)
  {
    return abs(a) > abs(b);
  }
};

int _tmain(int argc, _TCHAR* argv[])
{
  auto data = std::vector {-5, 1, 15};
    
  std::sort(data.begin(), data.end(), AbsoluteDescendingComparer());
    
	return 0;
}

 

At next we implement the same function by using an anonymous lambda. This will give us the possibility to write the comparison directly within the sort algorithm and therefore we can create very short and compact source code.

int _tmain(int argc, _TCHAR* argv[])
{
  auto data = std::vector  {-5, 1, 15};

  std::sort(data.begin(), data.end(), [](int a, int b) { return abs(a) > abs(b); } );

  return 0;
}

 

Another possible implementation will be a named lambda. We can create this lambda first and use it later within the algorithm.

int _tmain(int argc, _TCHAR* argv[])
{
  auto data = std::vector  {-5, 1, 15};

  auto AbsoluteDescendingComparer = [](int a, int b)
  {
    return abs(a) > abs(b);
  };

  std::sort(data.begin(), data.end(), AbsoluteDescendingComparer);

  return 0;
}

 

Comparison of the implementations

If we now ask our self which of the three alternatives we prefer, I’m sure, most will say: the anonymous lambda. That’s because it allows to write the code in a very compact way without any overhead. But what if we ask about our preference in case we have to maintain code? In this case the decision may not that easy anymore. The anonymous lambda has one major disadvantage: to leave out the function name will result in a loss of information. The function name, if it is chosen wisely, offers a lot of information about the implemented function. Without this name, we have to understand the implementation itself to know what it is used for. In case we use an anonymous lambda we have to add a comment to explain the implementation.

Another issue which decreases the code readability and makes it more difficult to understand and maintain the code, is based on the mixing of the two concerns. We have the sort-algorithm and we have a comparer. By using a named lambda we have a clear separation of concerns. Instead, if we use an anonymous lambda, these two functionalities will be mixed together.

The separation of concerns will become even more importand in case the function itself becomes more complex. In such a case, an anonymous lambda as well as an named one are not suitable anymore because both are placed in the local scope of the including method. If such a method becomes too complex it should be refactored by extracting code into separate methods. Therefore, complex functionality should not be implemented by using a lambda.

Recommendations

For a simple functionality, needed in local scope only, a lambda can be used. Otherwise use a functor, a class member function or a non-member function. If you use a lambda I would recommend preferring the named lambda. An anonymous one should only be used if the functionality is simple enough to understand it in the same time it needs to understand the meaning of a function name.

 

Advertisements
Veröffentlicht unter C++ | 1 Kommentar

std::vector vs c-style array

Within this article I want to compare the base array type and the vector offered by the std library. I want to call these two types “c-style array” and “std::vector”. The comparison will show you the pros and cons of the different types and give you a guideline when to use what.

The STL offers another base container, the std::array. I don’t want to pay attention to this container within this comparison for two reasons. At first, it is very easy to decide whether you want to use a std::vector or a std::array. You should use std::array when the container size is known at compile time. If this size isn’t known at compile time or you container may grow or shrink, you should use std::vector. At second, a comparison between std::array and c-style array is important, but this is an own topic and I don’t want to mix different topics within one article. In a previous article you can find an according comparison

 

Use cases

The c-style array is by default a fixed size container. Therefore you may ask whether it makes sense to compare it with a std::vector. Within the following table I will show the base use cases for containers and the according STL type or c-style type which may be used for implementation.

Use Case STL Container c-style container
The size is fixed. std::array array on stack

(e.g. “int someArray[5]”)

The container will be created by actual data. Therefore, initialization has variable size. After this creation, the data will be used only and therefore the size becomes fixed. std::vector array on heap

(e.g: int *someArray = new int[5];”)

The size is variable and can be grow or shrink during runtime. std::vector array on heap

(and maybe some management data)

 

As you can see, the std::vector is used for all cases with a variable size container. As the STL does not contain a specialized container for the second use case we only have to distinguish between variable and fixed size containers.

The c-style array can be used to implement all three use cases. But now we can distinguish between the second and the third use case. To implement the size changes during runtime we may need an additional management component for the array grow and shrink functionality.

 

Base concept of std::vector

A vector is the base container. Bjarne Stroustrup suggest: “By default, use Vector when you need a container”. Therefore, if you don’t have any good reasons to use another container type, you should use a vector.

A vector is a dynamic array that goes on the heap. Its size does not need to be known at compile time. If you add or remove items, the underlying array changes size.

You can think of a std::vector as a wrapper to „int *someArray = new int[5]“ where a std::array is a wrapper to „int someArray[5]“.

 

Initialize, change, read

The following example application shows the initialization of containers with three elements. The vector size isn’t set explicitly. If you add an element to a std::vector it will reserve additional memory of necessary.

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int values1[3] = {1, 2, 3};

  values1[1] = 15;

  std::cout << "size: " << (sizeof(values1) / sizeof(*values1)) << std::endl;

  for (auto element : values1)
  {
    std::cout << element << std::endl;
  }
  
  // std vector
  std::vector values2 {4, 5, 6};
  
  values2[1] = 25;

  std::cout << "size: " << values2.size() << std::endl;

  for (auto element : values2)
  {
    std::cout << element << std::endl;
  }
}

 

Within this first example you can see that the syntax for the variable declaration is slightly different, but element access and usage within a range based for loop is the same for both types. Furthermore, the example shows a first big difference of both containers: the std::vector knows its size. For the c-style array you have to create an additional variable to store the array size or, as shown in the example, you can calculate the size by dividing the memory reserved for the array by the memory needed to store a single element.

 

Use as method parameter

In a real application, you will split up your program logic into functions and therefore a standard use case is to pass parameters to functions. So, let’s change the example a little bit and create functions which should print the containers.

void PrintArray(int* values, int size)
{
  std::cout << "size: " << size << std::endl;

  for (int i = 0; i < size; ++i)
  {
    std::cout << values[i] << std::endl;
  }
}

void PrintVector(const std::vector &values)
{
  std::cout << "size: " << values.size() << std::endl;

  for (auto value : values)
  {
    std::cout << value << std::endl;
  }
}

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int values1[3] = { 1, 2, 3 };

  PrintArray(values1, 3);

  // std vector
  std::vector values2{ 4, 5, 6 };

  PrintVector(values2);
}

 

This simple use case shows a very important difference between a c-style array and a std::vector. The c-style array will be passed as pointer to its first parameter. The std::vector instead offers the standard copy semantic. Therefore, you can use standard coding practices and for example pass the parameter by const reference. So, the std::vector doesn’t convert to a pointer unless you explicitly ask it to.

A c-style array is passed as pointer to the first element. This principle is called “pointer decay”. As the array decays to a pointer, the size information is loss. Therefore, we have to pass an additional function parameter which contains the array size. The previously shown calculation depending on the memory size will no longer work, as the memory size of the passed parameter is the size of a pointer now. Furthermore, as the size information is lost, the range based for loop cannot be used anymore.

You can see, this little design change of the application, has nearly no impact if you use a std::vector. You can simple refactor your code and extract the according code section as method. But if you try to do the same for a c-style array, you will get in trouble and as result of the pointer decay you have to change your implementation fundamentally. Within the following article you can see some other impacts of the pointer decay, especially in base and derived class relationships: Arrays and inheritance, a source of errors.

 

Use as return value

C-style arrays are among the few types which cannot be used as L-value. So, c-style arrays cannot be used on the left side of an assignment and therefore they cannot be used as return values too. But you can return a pointer to a c-style array. Unfortunately, this will result in with two big issues: you have to know the size of the created array outside of the method and you have to implement the resource management based on this wild pointer. Both issues are a good source for errors.

A std::vector supports the c++ copy semantic. Therefore, it can be used as return value. The following example shows possible implementation for both kinds of arrays.

int* CreateArray()
{
  int* values = new int[3] { 1, 2, 3 };
  return values;
}

std::vector CreateVector()
{
  std::vector values = { 4, 5, 6 };
  return values;
}

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int* array1 = CreateArray();
  int size = 3;   // you have to know the size of the array
  
  for (int i = 0; i < size; ++i)
  {
    std::cout << array1[i] << std::endl;
  }
  
  delete[] array1; // you have to delete the array
  
  // std vector
  std::vector values2 = CreateVector();
  
  for (auto element : values2)
  {
    std::cout << element << std::endl;
  }
}

 

Range check

One of the base concepts of c and c++ is to provide build in types without any overhead which could result in performance loss or higher memory usage. Therefore, such types do not contain a management layer with range checks and similar features. Of course, this missing secure access layer could result in implementation errors which cause undefined behavior of your application.

The std::vector expands this concept and provides a combination of the advantages of both ideas. In favor of performance it comes without any management layer in release build and on the other hand it offers a secure access layer in debug builds. If you create a debug build, the std::vector contains range checks. This will help to find implementation errors. The following source code shows an according example. If you execute this example as debug build, the erroneous std::vector access will throw an out of range exception.

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int values1[3] = { 1, 2, 3 };

  values1[5] = 15;   // undefined behavior
   
  // std vector
  std::vector values2 {4, 5, 6};
   
  values2[5] = 25;  // out of range exception in debug build  
}

 

Standard algorithms

The std::vector provides an STL-consistent interface and can be used with the STL algorithms. The STL algorithms also provide support for c-style arrays. If you have c-style arrays with known size, you can directly use them together with the algorithms. As shown in the previous examples, pointer decay will result in a loss of the size information and therefore in a loss of direct support of algorithms. But you can use a template based approach to keep the size information of a c-style method parameter.

The following example shows how to use both container types in STL algorithms. In this case the sort algorithm is used. Furthermore, it shows an implementation to pass a c-style array as a reference to template function, which keeps the size information and allows the use of STL algorithms.

// template based function which keeps size information 
// of c-style array and therefore allows to use STL algorithms
template 
void SortArray(T(&values)[Size])
{
  std::sort(std::begin(values), std::end(values));  
}

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int values1[3] = { 7, 5, 3 };
  
  std::sort(std::begin(values1), std::end(values1));

  for (auto element : values1)
  {
    std::cout << element << std::endl;
  }
  
  // std vector
  std::vector values2 = { 7, 5, 3 };

  std::sort(values2.begin(), values2.end());
  
  for (auto element : values2)
  {
    std::cout << element << std::endl;
  }
}

 

Memory usage

A std::vector is a wrapper for a c-style array with additional management data needed for the grow and shrink functionality. This management data needs a few bytes extra space. Therefore, compared with a c-style array, the std::vector needs more memory. In most cases this extra memory can be ignored especially if you use large containers. Moreover, if you want to implement a c-style array which can grow and shrink you have to store additional management data too.

In my opinion, there is one use case where this higher memory usage will be crucial: You need a lot of small containers with fixed size during lifetime but variable size on initialization. In this case the grow and shrink functionality of the std::vector with its according management data is unnecessary overhead as it is not needed during lifetime of the container. It is only needed because the initialization size of the container is not known at compile time. If you have to create many of such containers and if they contain a small amount of data, a c-style array may be a good choice. But even in this case I would recommend the use of a c-style array only if you have a high memory workload which results in real issues. If you don’t have measurable issues you should use a std::vector anyway. The higher memory need is a small price which you should pay to get all the nice advantages of a std::vector which will help you to create more robust applications.

 

Performance

The C++ core guidelines suggest to use a std::vector instead of a c-style array (see “SL.10: Prefer using STL array or vector instead of a C array”). That’s because the std::vector offers a lot of advantages without a performance loss. Therefore, it does have the same performance as a raw c-style array.

But unfortunately, there is a persistent myth that for run-time speed, one should use c-style arrays. One reason for this myth are examples which compare c-style array and std::vector speed. Such examples will sometimes show a huge performance difference and they explain this difference as result of the std::vector management functionality for the size flexibility. These examples have one common problem: they use the std::vector in a wrong way. This poor implementation causes the performance loss. Following we will see some typical examples.

A second reason for this myth to persist is when comparing c-style arrays and std::vector in debug mode. As shown previously, the std::vector does range checks during debug mode. Of course, this will reduce the performance. The c-style array performance instead is indifferent between debug and release mode. Therefore, performance comparisons in debug mode are not meaningful.

As mentioned before, we will now look at some typical implementation issues which reduce the performance of a std::vector. After I have seen some of the examples with supposedly bad performance of the std::vector, I found two main issues: slow write and slow read commands. The following source code shows a typical example for a slow write and a slow read implementation and a right way to implement it.

If you create the std::vector and fill it with data, you should set an expected size. Normally you know the approximately or average size for your container. If you set this size during creation you avoid unnecessary grow operations. The second part of the example shows two ways to access the std::vector elements: you can use the “at()” method or you can use the “[]” operator. But these are not two different spellings for the same thing. There is a different functionality too. The “at()” method will execute an additional range check. And this range check is done in release build too. Therefore, it will add functionality for a secure element access but of course, this extra functionality comes with performance costs.

int _tmain(int argc, _TCHAR* argv[])
{
  // reserve memory before adding elements
  // e.g. you know the vector will have 100 up to 150 elements, 
  // then already reserve space for 100 elements
  std::vector values;
  values.reserve(100);
   
  int x = 125;  // example only, will be a variable size of 100 to 150
  for (int i = 0; i < x; i++)
  {
    values.push_back(i);
  }

  // dont use at() except you explicitly want a range check
  int element1 = values.at(17);   // range check on debug and release build
  int element2 = values[17];      // no range check
}

 

By avoiding these two implementation errors, a lot of the examples showing performance issues can be proved wrong. Let me mention two more performance thoughts before we finish this topic. Another important fact is that you should try to avoid inserting an element in front of the std::vector. For an insert at the front, a copy of the whole vector is necessary. If you cannot avoid such inserts you may use some other container. But do performance measurements in this case. Other containers lie a list will support an insert at the front but they come with a different memory management which results in performance loss in other situations. Therefore, even for a use case with (sporadic) insert at front, the std::vector may be the best choice.

A further performance improvement can be reached by using “emplace_back()” instead of “push_back()” while inserting into a std::vector. To keep it simple you can say: emplace back moves an object into a container where push back copies it. If the object supports this movement you will get a performance gain. You don’t have to worry about whether the object supports movement or not. If it isn’t supported the function will copy it anyway. So, emplace back can be used in any case and it will be as fast as or faster as push back, depending on the object you want to add to the vector.

 

Summary

If you need a container with variable size, your choice will be easy for normal use case. Let me repeat Bjarne Stroustrup’s suggestion: “By default, use Vector when you need a container”. I would recommend the same and suggest using another container type only if it adds some benefit in your special use case. And please keep in mind: This benefit must be measurable, especially if you expect a performance gain. Unfortunately, often the choice for a container type is based on belief and not on measurable facts. Finally I will finish the article with a short summary of the pros and cons of a std::vector compared to a c-style array.

Advantages of std::vector

  • Perfectly matches with the strict type system concept of c++
  • Copy semantic / no pointer decay / the array doesn’t convert to a pointer unless you explicitly ask it to
  • It knows its size
  • Range check in debug builds
  • No memory overhead beyond what it needs to hold its elements
  • Same performance as a c-style array
  • Direct support of use in STL algorithms

Disadvantages of std::vector

  • Slightly higher memory need as additional management data must be stored. This disadvantage is only true in case you have the second of the three main use cases shown at the beginning, with a variable size initialization but a fixed size during use of the container.
Veröffentlicht unter C++ | Kommentar hinterlassen

std::array vs c-style array

Within this article I want to compare the base array type and the array offered by the std library. I want to call these two types “c-style array” and “std::array”. The comparison will show you the pros and cons of the different types and give you a guideline when to use which kind of array.

Base concept of std::array

The std::array is designed as zero-overhead wrapper for c-style arrays. It will provide a value like semantics equally to the other C++ containers. A std::array should have same runtime performance as a c-style array.

Within the next chapters I want to compare both array types by looking at some base use cases.

Initialize, change and read

The following example application shows the initialization of an array with three elements. Furthermore, an array element will be changed and all elements will be read.

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int array1[3] = {1, 2, 3};

  array1[1] = 15;

  std::cout << "size: " << (sizeof(array1) / sizeof(*array1)) << std::endl;

  for (auto element : array1)
  {
    std::cout << element << std::endl;
  }
  
  // std-style array
  std::array array2 = { 4, 5, 6 };

  array2[1] = 25;

  std::cout << "size: " << array2.size() << std::endl;

  for (auto element : array2)
  {
    std::cout << element << std::endl;
  }
}

Within this first example you can see that the syntax for the variable declaration is slightly different, but element access and usage of array in a range based for loop is the same for both types. Furthermore, the example shows a first big difference of both types: the std::array knows its size. For the c-style array you have to create an additional variable to store the array size or, as shown in the example, you can calculate the size by dividing the memory reserved for the array by the memory needed to store a single element.

Array as method parameter

In a real application, you will split up your program logic into functions and therefore a standard use case is to pass parameters to functions. So, let’s change the example a little bit and create functions which should print the array.

void PrintArray(int* values, int size)
{
  std::cout << "size: " << size << std::endl;

  for (int i = 0; i < size; ++i)
  {
    std::cout << values[i] << std::endl;
  }
}

void PrintArray(const std::array &values)
{
  std::cout << "size: " << values.size() << std::endl;

  for (auto& value : values)
  {
    std::cout << value << std::endl;
  }
}

int _tmain(int argc, _TCHAR* argv[])
{
  int array1[3] = { 1, 2, 3 };
  PrintArray(array1, 3);
  
  std::array array2 = { 4, 5, 6 };
  PrintArray(array2);
}

This simple use case shows a very important difference between a c-style array and a std::array. The c-style array will be passed as pointer to its first parameter. The std::array instead offers the standard copy semantic. Therefore, you can use standard coding practices and for example pass the parameter by const reference. So, the std::array doesn’t convert to a pointer unless you explicitly ask it to.

A c-style array is passed as pointer to the first element. This principle is called “pointer decay”. As the array decays to a pointer, the size information is loss. Therefore, we have to pass an additional function parameter which contains the array size. The previously shown calculation depending on the memory size will no longer work, as the memory size of the passed parameter is the size of a pointer now. Furthermore, as the size information is lost, the range based for loop cannot be used anymore.

You can see, this little design change of the application, has nearly no impact if you use a std::array. You can simple refactor your code and extract the according code section as method. But if you try to do the same for a c-style array, you will get in trouble and as result of the pointer decay you have to change your implementation fundamentally. Within the following article you can see some other impacts of the pointer decay, especially in base and derived class relationships: Arrays and inheritance, a source of errors.

std::array as method parameter with variable size

Maybe you already noticed one limitation of the print function we implemented for the std::array. The function will accept an array with three integer values only. It will not work with array of other types or other sizes.

At first you may think this is a limitation or disadvantage of std::array. But it isn’t. Moreover it is a big advantage of this data type as std::array will perfectly match with the C++ base design which uses a strict type system. We have a fixed size as part of type declaration. So each std::array with a different size is a different type. This will allow the compiler to detect and prevent implementation errors.

Of course, a strict type system will make it difficult to create higher-order functions. For these cases c++ offers function overloading, for example by implementing templates.

The following source code shows a template based implementation which allows to write a print function which accepts a std::array of any size. Furthermore, you will see a second function which is even more generic as it accepts any kind of container object.

template
void PrintArray(const std::array &values)
{
  for (auto value : values)
  {
    std::cout << value << std::endl;
  }
}

template
void PrintContainer(T &values)
{
  for (auto value : values)
  {
    std::cout << value << std::endl;
  }
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::array array2 = { 4, 5, 6 };
  PrintArray(array2);
  PrintContainer<std::array>(array2);
}

c-style array as method parameter with size information

Of course, the template concept can also be used for a c-style array. Instead of passing a pointer to the first element, you can pass the array as a reference to template function. This will allow to pass the array with its specific size.

template 
void PrintArray(T(&values)[Size])
{
  for (auto value : values)
  {
    std::cout << value << std::endl;
  }
}

int _tmain(int argc, _TCHAR* argv[])
{
  int array1[3] = { 1, 2, 3 };
  PrintArray(array1);
}

Array as return value

C-style arrays are among the few types which cannot be used as L-value. So, c-style arrays cannot be used on the left side of an assignment and therefore they cannot be used as return values too. But you can return a pointer to a c-style array. Unfortunately, this will result in with two big issues: you have to know the size of the created array outside of the method and you have to implement the resource management based on this wild pointer. Both issues are a good source for errors.

A std::array supports the c++ copy semantic. Therefore, it can be used as return value. The following example shows possible implementation for both kinds of arrays.

int* CreateArray()
{
  int* array = new int[3] { 1, 2, 3 };
  return array;
}

std::array CreateStdArray()
{
  std::array array = { 4, 5, 6 };
  return array;
}

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int* array1 = CreateArray();
  int size = 3;   // you have to know the size of the array

  for (int i = 0; i < size; ++i)
  {
    std::cout << array1[i] << std::endl;
  }

  delete[] array1; // you have to delete the array
   
  // std-style array
  std::array array2 = CreateStdArray();

  for (auto element : array2)
  {
    std::cout << element << std::endl;
  }
}

In both cases you can create factory methods for arrays of any length. In case of a c-style array I would prefer the length as input parameter and in case of the std::array I would suggest the template based approach, shown in a previous section about method parameters.

If the client does not know the resulting size of the array and the factory method itself should be responsible to create a collection of the needed size, you should not use a std::array at all. The array is a collection of fixed size but you want to use a collection of variable size. This sound like a use case for the std::vector. At this moment, I don’t want to go further into details as this doesn’t match with the topic of this article. But in a further article I want to continue with a comparison of c-style array and std::vector.

Range check

One of the base concepts of c and c++ is to provide build in types without any overhead which could result in performance loss or higher memory usage. Therefore, arrays do not contain a management layer with range checks and similar features. Of course, this missing secure access layer could result in implementation errors which cause undefined behavior of your application.

The std::array expands this concept and provides a combination of the advantages of both ideas. In favor of performance it comes without any management layer in release build and on the other hand it offers a secure access layer in debug builds. If you create a debug build, the std::array contains range checks. This will help to find implementation errors. The following source code shows an according example. If you execute this example as debug build, the erroneous std::array access will throw an out of range exception.

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int array1[3] = {7, 5, 3};
  
  std::sort(std::begin(array1), std::end(array1));

  for (auto element : array1)
  {
    std::cout << element << std::endl;
  }
  
  // std-style array
  std::array array2 = { 7, 5, 3 };

  std::sort(array2.begin(), array2.end());
  
  for (auto element : array2)
  {
    std::cout << element << std::endl;
  }
}

Performance and memory usage

As mentioned before, the std::array is designed as zero-overhead wrapper for c-style arrays. Therefore, both types are optimized regarding performance and memory usage. The std::array has an identical memory layout to c-style arrays. Furthermore, they have the same performance, except in debug builds. As shown in the previous chapter, the std::array contains range checks in debug build and therefore it has a slightly lower performance. In release build the performance of std::array and c-style array are same. By definition, the std::array is a simple aggregate containing a c-style array as its only member.

Array and standard algorithms

The std::array provides an STL-consistent interface and can be used with the STL algorithms. The STL algorithms also provide support for c-style arrays. If you have c-style arrays with known size, you can directly use them together with the algorithms. As shown in the previous examples, pointer decay will result in a loss of the size information and therefore in a loss of direct support of algorithms. But again, you can use the template based approach to keep the size information of a c-style method parameter.

The following example shows how to use both array types in STL algorithms. In this case the sort algorithm is used.

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int array1[3] = {1, 2, 3};

  array1[5] = 15;   // undefined behavior
   
  // std-style array
  std::array array2 = { 4, 5, 6 };

  array2[5] = 25;   // out of range exception in debug build
}

Multi-dimensional arrays

The std::array as well as the c-style array support multi-dimensions. These multi-dimensions are designed as arrays of nested arrays. The syntax to declare a std::array directly reflects this nested design. But this comes with the downside of a more awkward syntax than compared to a c-style array. The following example shows how to declare, write and read a multi-dimensional array.

int _tmain(int argc, _TCHAR* argv[])
{
  // c-style array
  int array1[2][3] = { { 1, 2, 3 }, { 4, 5, 6} };

  array1[0][1] = 10;
  
  for (auto &row : array1)
  {
    for (auto &element : row)
    {
      std::cout << element << std::endl;
    }
  }     

  // std-style array
  std::array<std::array, 2> array2{ { { 7, 8, 9 }, { 10, 11, 12 } } };

  array2[0][1] = 20;

  for (auto &row : array2)
  {
    for (auto &element : row)
    {
      std::cout << element << std::endl;
    }
  }
}

Summary

The std::array is designed as zero-overhead wrapper for c-style arrays. Therefore, it comes with all the advantages of a c-style array and ads the additional advantages of C++ containers. I would recommend to use std::arrays except you have a rare special use case which comes with good reasons to use a c-style array.
Advantages of std::array

  • Perfectly matches with the strict type system concept of c++
  • Copy semantic / no pointer decay / the array doesn’t convert to a pointer unless you explicitly ask it to
  • It knows its size
  • Range check in debug builds
  • No memory overhead beyond what it needs to hold its elements
  • Same performance as a c-style array
  • Direct support of use in STL algorithms

Disadvantages of std::array

  • Slightly more awkward syntax for declaration of the array, especially when creating multi-dimensional arrays
Veröffentlicht unter C++ | 1 Kommentar

STL Algorithms

The STL algorithms offer a powerful way to solve some base programming use cases. If you know and use these algorithms you can write more clean, efficient and robust code. But unfortunately, it isn’t that easy to know or find the right algorithm for a use case. In my opinion, this is a result of the not practical kind of documentation – especially grouping of the algorithms is not related to practice – and on the other hand the naming of some functions. Therefore, I wrote a series of articles with a practice-related overview about the STL algorithms. Following you will find the overview about this article series with links to the articles.

Algorithms to use, find and check container elements

Algorithms to copy, move and remove container elements

Algorithms to change and sort container elements

Algorithms for sorted containers

Algorithms for two containers

Operations which can be used in STL algorithms

Create own STL algorithms wrapper

Veröffentlicht unter C++ | Kommentar hinterlassen

Create own STL algorithms wrapper

Within the previous articles about STL algorithms I introduced the main algorithms sorted by their use case. This analysis of the actual STL algorithms has shown the power of the algorithms but also their issues. The main issues are misleading names and a sometimes-cumbersome notation.

Of course, it is easy to write source code by using the algorithms, but it is sometimes hard to read and understand existing code. Notation matters more than we usually like to believe. Therefore, it may be useful to create an own algorithm wrapper and hide the misleading names and the cumbersome notation behind an easy to use interface.

Within this article I want to show some examples for algorithms wrapper. These wrapper functions add a self-explaining interface and therefore you don’t have to write comments to explain the algorithms. If you use these algorithms without a wrapper, you must write comments to each algorithm to explain its behavior.

 

Example: remove elements from container

The algorithm “remove” is used to remove elements from a container. But unfortunately, this algorithm will not remove the elements. Instead it will shift the elements to delete to the end of the container and returns an iterator to the new logical end. So, you always must explicitly erase the elements after calling the algorithm. I think that’s a perfect candidate for a wrapper function.

template
void RemoveElement(C& container, E element) 
{ 
  // remove (shifts elements but dont deletes them)
  auto newEndIterator = std::remove(container.begin(), container.end(), element);
  
  // delete elements
  container.erase(newEndIterator, container.end());
}

void Print(const std::vector& data)
{
  std::for_each(data.begin(), data.end(), [](int element){ std::cout << element << " "; });
  std::cout << std::endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector data{ 5, 7, 9, 2, 4, 9, 3 };

  RemoveElement(data, 9);

  Print(data);  // output is '5 7 2 4 3'

  return 0;
}

 

Example: count equal pairs in two containers

With the algorithm “inner_product” you can – like the name reveals – calculate an inner product. But you can specify own operations and then the function will offer a common map/reduce functionality according to the given algorithm. So, if you set an algorithm the name “inner_product” is wrong as the function no longer calculates the inner product.

The following example shows how to count the equal pairs of two containers. If you put this function into a self-explaining wrapper function, the code will become much easier to understand.

template
int CountEqualPairs(C& container1, C& container2)
{
  return std::inner_product(
    container1.begin(), container1.end(), container2.begin(),
      0, std::plus(), std::equal_to());    
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector data1{ 5, 7, 9, 2, 4, 9, 3 };
  std::vector data2{ 1, 0, 9, 8, 4, 3, 3 };
  
  int result = CountEqualPairs(data1, data2);

  std::cout << result << std::endl;   // result is '3'

  return 0;
}

 

Example: pairwise multiplication

The algorithm “transform” is another good candidate for a wrapper. This algorithm can be used for a wide range of use cases and therefore it has a cumbersome notation. So, if you read an implementation using this algorithm you must stop for a short (or sometimes long) moment and think about the purpose of the source code. In such cases you are happy if the developer has written a comment and explains the behavior of the algorithm. But often such a comment is missing.

As explained above, a self-explaining wrapper can help to write clean code which is easy to understand and to maintain. The following example shows a possible implementation for a pairwise multiplication.

template
C PairwiseMultiply(C& container1, C& container2)
{
  C target;

  std::transform(
    container1.begin(), container1.end(),
    container2.begin(),
    std::back_inserter(target),
    std::multiplies());  

  return target;
}

void Print(const std::vector& data)
{
  std::for_each(data.begin(), data.end(), [](int element){ std::cout << element << " "; });
  std::cout << std::endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector data1{ 5, 7, 9, 2, 4, 9, 3 };
  std::vector data2{ 1, 0, 9, 8, 4, 3, 3 };
  std::vector target;
  
  target = PairwiseMultiply(data1, data2);

  Print(target);  // output is '5 0 81 16 16 27 9'

  return 0;
}

 

Summary

The STL algorithms are sometimes hard to understand. If you read source code and you find an algorithm, you have to stop for a moment and think about its purpose. A wrapper library helps to hide these implementation details behind a self-explaining interface. This will allow you to create more readable and maintainable source code.

Veröffentlicht unter C++ | 1 Kommentar

Operations which can be used in STL algorithms

Within the previous articles about STL algorithms I introduced the main algorithms sorted by their use case. Within this article I want to give an overview about the STL operations. These operations can be used stand-alone and within STL algorithms. The combination of the existing algorithms and operations will allow to solve complex implementations of some standard use cases in an easy way.

 

Compare two elements

Algorithm Description
equal_to Check whether the two elements are equal.
not_equal_to Check whether the two elements are not equal.
greater Check whether the first element is greater than the second one.
less Check whether the first element is smaller than the second one.
greater_equal Check whether the first element is greater or equal to the second one.
less_equal Check whether the first element is smaller or equal to the second one.
max Returns the larger one of the two elements.
min Returns the smaller one of the two elements.
minmax Returns a pair which contains the smaller element followed by the larger element.

 

Within the examples the operations “greater” and “less” are used to sort a container. The third example makes use of “max” operator to merge two containers by taking the maximum value of each element pair.

void Print(const std::vector& data)
{
  std::for_each(data.begin(), data.end(), [](int element){ std::cout << element << " "; });
  std::cout << std::endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector data{ 5, 7, 9, 2, 4 };

  // greater
  std::sort(data.begin(), data.end(), std::less());
  Print(data);  // output is '2 4 5 7 9'

  // less
  std::sort(data.begin(), data.end(), std::greater());
  Print(data);  // output is '9 7 5 4 2'

  // max
  std::vector data1{ 5, 7, 9, 2, 4 };
  std::vector data2{ 3, 8, 1, 5, 8 };
  std::vector target;
  std::transform(data1.begin(), data1.end(), 
    data2.begin(), std::back_inserter(target), 
    [](int a, int b) {return std::max(a, b); });
  Print(target);  // output is '5 8 9 5 8'

  return 0;
}

 

Search elements in container

Algorithm Description
max_element Returns an iterator to the largest element of the container.
min_element Returns an iterator to the smallest element of the container.
minmax_element Returns a pair which contains an iterator to the smallest element of the container followed by an iterator to the largest element of the container.

 

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector data{ 5, 7, 9, 2, 4 };

  // max_element
  auto iterator = std::max_element(data.begin(), data.end());
  if (iterator != data.end())
  {
    std::cout << "max element " << *iterator << " found at index " << std::distance(data.begin(), iterator) << std::endl;
    // output is 'max element 9 found at index 2'
  }

  return 0;
}

 

Calculations with one input

Algorithm Description
negate Returns the negation of the element.
clamp Returns the given element with respect to a value range. If the given element is outside of the range the returned value will be equal to the lowest or highest boundary.

Example: “clamp(x,20,30)” means the returned value must be in range 20 to 30. If x is less than 20 “20” will be returned. If x is between 20 to 30 “x” will be returned. If x is larger than 30 “30” will be returned.

 

void Print(const std::vector& data)
{
  std::for_each(data.begin(), data.end(), [](int element){ std::cout << element << " "; });
  std::cout << std::endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector data{ 5, -7, 9, -2, 4 };
  
  // negate
  std::transform(data.begin(), data.end(), data.begin(), std::negate());
  Print(data);  // output is '-5 7 -9 2 -4'

  return 0;
}

 

Calculations with two inputs

Algorithm Description
plus Returns the sum of the two elements.
minus Returns the difference of the two elements.
multiplies Returns the product of the two elements.
divides Returns the quotient of the two elements.
modulus Returns the residual value of the division of first element by second element.
logical_and Returns the result of a logical-and calculation of the two elements.
logical_or Returns the result of a logical-or calculation of the two elements.
logical_not Returns the result of a logical-not calculation of the two elements.
bit_and Returns the result of a bitwise-and calculation of the two elements.
bit_or Returns the result of a bitwise-or calculation of the two elements.
bit_xor Returns the result of a bitwise-xor calculation of the two elements.
bit_not Returns the result of a bitwise-not calculation of the two elements.

 

void Print(const std::vector& data)
{
  std::for_each(data.begin(), data.end(), [](int element){ std::cout << element << " "; });
  std::cout << std::endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector source1{ 5, 7, 9, 2, 4 };
  std::vector source2{ 3, 1, 2, 5, 7 };
  std::vector target;

  std::vector source3{ 1, 1, 0, 0 };
  std::vector source4{ 1, 0, 1, 0 };

  // sum
  std::transform(source1.begin(), source1.end(), source2.begin(),
    std::back_inserter(target), std::plus());

  Print(target);  // output is '8 8 11 7 11'

  // logical or
  target.clear();

  std::transform(source3.begin(), source3.end(), source4.begin(),
    std::back_inserter(target), std::logical_or());

  Print(target);  // output is '1 1 1 0'

  // bitwise xor
  target.clear();

  std::transform(source3.begin(), source3.end(), source4.begin(),
    std::back_inserter(target), std::bit_xor());

  Print(target);  // output is '0 1 1 0'
  
  return 0;
}
Veröffentlicht unter C++ | 1 Kommentar

Algorithms for two containers

The STL algorithms offer a powerful way to solve some base programming use cases. If you know and use these algorithms you can write more clean, efficient and robust code. But unfortunately, it isn’t that easy to know or find the right algorithm for a use case. In my opinion, this is a result of the not practical kind of documentation – especially grouping of the algorithms is not related to practice – and on the other hand the unfavorable naming of some functions. Therefore, I write this series of articles and want to give a more practice-related overview about the STL algorithms.

Within this article I want to give an overview about the STL algorithms for sorted containers. This is one of a series of articles. Within each of these blog post I want to summarize the STL algorithms regarding one kind of use case. This article shows the possibilities in case we have a container with sorted elements or in case we want to create such a container based on two sorted containers.

Each section contains a list of the according STL algorithms with a short description. Additional you will find an example implementation which will show some of these algorithms. If you copy and execute the examples, you will see the results within the console window. Additional, you will see the results as comment within the source code.

Most of the standard algorithms take half-open sequences. “Half-open” means the start iterator refers to the first element of the sequence and the end iterator refers to the element behind the last element. Returning the end iterator of the sequence to indicate “not found” is a standard-library convention.

 

Use elements of two containers

Algorithm Description
transform Create a resulting container by using the elements of two containers within a transformation function.

The first container is the controlling one. Therefore, the resulting container has same size as the first source container and the second source container must not be smaller than the first one.

inner_product Calculate the inner product of two containers. The inner product is the sum of all element pair products.
inner_product (with own operations) The default version of this function, without operations, calculates the inner product, like described above. But if you set specific operations, you can create several very different map/reduce algorithms. Therefore, the name “inner product” is misleading in this case.

If you set the operations, you must define two of them. The second one will be used on the pairwise elements of the containers and returns the result of an according calculation based these two given elements. The returned value will then be used within the first operation. This accumulation operation gets this result and the actual accumulation value as input and calculates a new accumulation value.

Within the example you see two implementations. The first one calculates the product of all pairwise differences and the second one counts the pairs of equal elements.

 

void Print(const std::vector& data)
{
  std::for_each(data.begin(), data.end(), [](int element){ std::cout << element << " "; });
  std::cout << std::endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector source1{ 5, 7, 2, 2, 4 };
  std::vector source2{ 1, 1, 2, 5, 3, 5 };
  std::vector target;

  // transform
  std::transform(
    source1.begin(), source1.end(), 
    source2.begin(), 
    std::back_inserter(target),
    [](int a, int b){return a*b; });

  Print(target);  // output is '5 7 4 10 12'  

  // inner product (standard)
  int sumOfProducts = std::inner_product(source1.begin(), source1.end(), source2.begin(), 0);
  std::cout << sumOfProducts << std::endl;   // result is '38'

  // inner product (with own functions)  
  int productOfDifferences = std::inner_product(
    source1.begin(), source1.end(), source2.begin(), 
    0, std::plus(), std::minus());
  std::cout << productOfDifferences << std::endl;   // result is '8'

  int numberOfEqualPairs = std::inner_product(
    source1.begin(), source1.end(), source2.begin(), 
    0, std::plus(), std::equal_to());
  std::cout << numberOfEqualPairs << std::endl;   // result is '1'

  return 0;
}

 

Compare two containers

Algorithm Description
equal Check whether both containers are equal. They are equal if they have the same size and the same elements with same sort order.
mismatch Find the first difference within the two containers.

The function returns an iterator-pair of first position with different elements within the two containers. The resulting pair contains the iterators to the element of each list. The result contains one or two end-iterator if no mismatch was found. Till c++14 the second container must not be shorter than the first one. Since c++14 the second container may be shorter and if no mismatch was found the iterator pair contains the end-iterator of the shorter container and corresponding iterator of the other container.

is_permutation Checks whether a container is a permutation of the other one.

That means, it checks whether the containers have the same elements but the sort order may be different.

For example: “2 4 3 1 5” is a permutation of “1 2 3 4 5”.

lexicographical_compare Checks whether the first container is lexicographically less than the second one and return “true” in this case. Therefore, if the function returns “false” the containers either equal or the second one is less than the first one. In this case you can execute the lexicographical comparison again with changed containers or call the “equal” function to finish the comparison.

The containers are compared element by element. The first mismatching element defines which container is less or greater than the other one. If there are no mismatch, the smaller container is the less one. If the elements are equal and the containers sizes are equal, the containers are equal and the function will return “false”.

The function name is misleading as it isn’t a complete and finished comparison. In such a case, a three-state return value is needed. This function checks whether the first container is less and therefore, in my opinion, the function name should be “is_lexicographical_less” or even “is_less” as a function name should not reflect the technical details of its implementation.

C++20 may contain the function “lexicographical_compare_3way” which is a real comparison and returning the three-state return value.

search Searches a sub-sequence within a container and returns the iterator of the first position found or returns end-iterator if the sub-sequence isn’t found.
find_end Like “search”, this function searches a sub-sequence within a container. But in searches from the end and returns the last position found.

 

int _tmain(int argc, _TCHAR* argv[])
{
  std::vector v1{ 5, 7, 9, 2, 4 };
  std::vector v2{ 5, 7, 9, 2, 4 };  
  
  // equals  
  bool result = std::equal(v1.begin(), v1.end(), v2.begin());
  std::cout << result << std::endl;   // result is 'true'

  // mismatch
  v2 = { 5, 7, 0, 2, 0 };
  auto iterators = std::mismatch(v1.begin(), v1.end(), v2.begin());
  if ((iterators.first != v1.end()) && (iterators.second != v2.end()))
  {
    std::cout << "mismatch found at index " << std::distance(v1.begin(), iterators.first) << std::endl;
    // output is 'mismatch found at index index 2'
  }
  
  // search
  v2 = { 9, 2 };
  auto iterator = std::search(v1.begin(), v1.end(), v2.begin(), v2.end());  
  if (iterator != v1.end())
  {
    std::cout << "sub-sequence found at index " << std::distance(v1.begin(), iterator) << std::endl;
    // output is 'sub-sequence found at index index 2'
  }  

  return 0;
}
Veröffentlicht unter C++ | 1 Kommentar