RValue Reference Declarator: &&

The rvalue reference is a useful C++ feature for writing efficient code. In this article I want to explain what an rvalue is and how you can use the rvalue reference declarator. You will also learn how to use this feature to implement a move constructor and functions that move their parameters.

 

LValue and RValue

In the earliest days of C, an lvalue was defined as an expression that may appear on the left or on the right hand side of an assignment, whereas an rvalue was an expression that can only appear on the right hand side. For example, in “int a = 21;” the expression “a” is an lvalue and “21” is an rvalue. Of course the lvalue “a” may also be placed on the right hand side. For example, in “int b = a;” both expressions are lvalues.

In C++ this definition is still useful as a first intuitive approach. But we can add another point of view: Lvalues are named objects that persist beyond a single expression and rvalues are unnamed temporaries that evaporate at the end of the expression.

The following example will show some lvalues and rvalues.

#include <string>

int main()
{
  // initialized, so that reading and incrementing them is well defined
  int x = 0;
  int y = 0;

  // lvalues
  std::string a;
  std::string* b = &a;
  ++x;

  // rvalues
  123;
  x + y;
  std::string("rvalue");
  x++;

  return 0;
}

 

The first two lvalue examples are very intuitive. The declarations create the named objects “a” and “b”, which are lvalues. The first three rvalue examples are easy to understand too. “123” and the result of “x + y” are unnamed values which are no longer accessible after the end of the expression. The same is true for the string “rvalue”: a string object is created, but it is not named and not accessible after the expression. But you may be astonished when looking at the increment operators. Why is “++x” an lvalue and “x++” an rvalue?

The expression “++x” is an lvalue because it modifies the persistent object and then names it. “x++”, in contrast, creates a copy of the persistent object, increments the object and returns the copy. Therefore the expression “x++” yields an unnamed, non-persistent object: an rvalue.

This little example shows a very important aspect of the difference between lvalues and rvalues. It is not about what an expression does, it is about what an expression names: something persistent, or something not persistent which exists only temporarily within the expression. And as persistent objects have an address, you can also say: if you can take the address of an expression it is an lvalue, and if you cannot it is an rvalue. For example “&++x” is valid whereas “&x++” is not.

In summary we can state the following definition: “An lvalue is an expression that refers to a memory location and allows us to take the address of that memory location via the & operator. An rvalue is an expression that is not an lvalue.”
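The address rule can be checked with a small sketch. The helper name category and the function check_increment_categories are made up for this illustration and are not part of any library; the sketch relies only on the fact that an int& parameter accepts lvalues while an int&& parameter accepts rvalues.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers (illustration only): overload resolution picks
// the first one for lvalues and the second one for rvalues.
inline std::string category(int&)  { return "lvalue"; }
inline std::string category(int&&) { return "rvalue"; }

// Runs the checks discussed above; returns true if all of them hold.
inline bool check_increment_categories()
{
    int x = 1;

    // ++x names the persistent object x, so its address can be taken,
    // and it is the address of x itself.
    int* p = &++x;
    assert(p == &x);

    // int* q = &x++;   // does not compile: x++ yields a temporary copy

    return category(x)     == "lvalue"
        && category(++x)   == "lvalue"
        && category(x++)   == "rvalue"
        && category(x + 1) == "rvalue"
        && category(123)   == "rvalue";
}
```

Calling check_increment_categories() from main should return true; removing the comment in front of the “&x++” line should make the compiler reject the code.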

 

RValue reference

A reference to an rvalue is declared with the double ampersand &&. Similar to standard references, or to be more precise lvalue references, you can create rvalue references. They may be used to pass rvalues to functions. Later on we will see how rvalue references allow us to write a move constructor.

As an rvalue is a temporary object which is only valid in a small scope, for example within one expression, an rvalue reference is typically a reference to an object that is about to be destroyed.
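A minimal sketch of these binding rules follows. The function names length_of_temporary and binding_demo are assumptions chosen for this example. It shows that an rvalue reference binds to temporaries but not to named objects, and that a temporary bound to a local rvalue reference stays alive as long as that reference does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical function (illustration only): accepts rvalues only.
inline std::size_t length_of_temporary(std::string&& text)
{
    return text.size();
}

inline std::string binding_demo()
{
    std::string name = "persistent";

    std::string&  lref = name;                 // lvalue reference to a named object
    std::string&& rref = std::string("temp");  // rvalue reference to a temporary
    // std::string&& bad = name;               // does not compile: name is an lvalue

    assert(&lref == &name);

    // The temporary bound to rref lives as long as rref, so it can be modified.
    rref += "orary";

    // length_of_temporary(name);              // does not compile either
    assert(length_of_temporary(std::string("rvalue")) == 6);

    return rref;                               // returns a copy of the string
}
```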

 

Constructor performance

Before we start to think about optimizing constructors with rvalue references, let us have a look at the main issue of standard copy constructors.

To understand the issue we can think about the following: Let’s say you get a folder with a sheet of paper. You shall create a second folder and copy the sheet of paper. What will you do if you get the additional information that the first folder is never used anymore and thrown away? With this information you don’t have to photocopy the sheet of paper. You just have to move the original one to the new folder.

Exactly the same issue can be found in source code. Looking at the example above we can now say: the most unnecessary copies are those where the source is about to be destroyed.

With this idea in mind, look at the following line of code, where s1 to s3 are strings.

string x = s1 + " " + s2 + " " + s3;

Executing this line of code creates a lot of temporary strings and therefore performs a lot of copy operations. Of course this expression will be executed in microseconds and may not need an optimization. But the example shows the general concept, which is also valid for large and complex objects. If we come back to the idea we had above, we know that we don’t have to create a copy if the source is about to be destroyed. What does this mean for the string concatenation example?

At the start of the expression we concatenate s1 with a blank (s1 + " "). In this case it is necessary to create a new temporary string, because s1 is an lvalue naming a persistent object. Therefore a copy of its content has to be created. But in the next step we add s2 to the temporary string created by s1 + " ". Without move semantics, a second temporary string is created and a copy of the first one is concatenated with s2. Afterwards we throw the first temporary string away. And that’s the issue: we photocopy the sheet of paper and throw the original away. As the result of s1 + " " is an rvalue referring to a temporary object, we can move the original content and don’t have to create a copy. This is the key concept of move semantics.

Before we go on to look at move constructors, someone may ask: as we can create rvalue references, we have access to the temporary objects. What if we use these references to access the objects later on?

This is a good question. C++ is a language that gives the developer maximum flexibility. Therefore the language itself will not forbid such wrong implementations. Let us go back to the initial example: your boss told you that the original folder with the sheet of paper is no longer needed and that he will throw it away after you have created the new folder. What if he changes his mind and wants to use the original folder later on? This will not work, as he now has an empty folder. The same happens if you access rvalues after their scope. They may contain invalid content, or they may hold pointers to memory locations already used by other objects. Therefore using rvalues after their scope is an implementation error and may result in critical failures.
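The effect can be made visible with a small sketch. Text is a made-up wrapper class that counts copy constructions; it is not how std::string is implemented, and the two operator+ overloads are assumptions for this illustration. The sketch also uses std::move, which is explained in its own section of this article.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical string wrapper (not std::string) that counts copy constructions.
struct Text
{
    std::string value;
    static int copies;

    explicit Text(std::string v) : value(std::move(v)) {}
    Text(const Text& other) : value(other.value) { ++copies; }
    Text(Text&& other) noexcept : value(std::move(other.value)) {}
};
int Text::copies = 0;

// Left operand is an lvalue: its content must be preserved, so copy it.
inline Text operator+(const Text& lhs, const Text& rhs)
{
    Text result(lhs);
    result.value += rhs.value;
    return result;
}

// Left operand is a temporary: append in place and hand its content on.
inline Text operator+(Text&& lhs, const Text& rhs)
{
    lhs.value += rhs.value;
    return std::move(lhs);
}

// Returns the number of copy constructions needed for the concatenation.
inline int copies_for_concatenation()
{
    Text s1("a"), s2("b"), s3("c"), blank(" ");
    Text::copies = 0;

    Text x = s1 + blank + s2 + blank + s3;
    assert(x.value == "a b c");

    return Text::copies;
}
```

Only the first step, s1 + blank, has an lvalue on the left and needs a copy. Every later step receives the temporary result of the previous step and reuses its content, so copies_for_concatenation() should report a single copy.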

 

Move constructors

Standard copy constructors may help to reduce the problem of unnecessary copies, but they cannot remove all of them. Move constructors, which use rvalue references, can help you improve the performance of your applications by eliminating unnecessary memory allocations and copy operations. In general, being able to detect modifiable rvalues allows you to optimize resource handling. If the objects referred to by modifiable rvalues own any resources, you can steal these resources instead of copying them, since the objects are going to evaporate anyway.

The following example shows a typical move constructor. The parameter is an rvalue reference to the class. Inside the move constructor you move the resources from the source object to the new object, and you should reset the data pointer of the source object to prevent its destructor from releasing the resources a second time. As the new object has taken over the resources, the new object is responsible for releasing them.

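#include <vector>

class MyClass
{
public:
  MyClass() : mData(new std::vector<int>()) {}

  // move constructor: take over the data of the source object
  MyClass(MyClass&& source) noexcept : mData(source.mData)
  {
    // release source data
    // so the destructor does not free the memory multiple times
    source.mData = nullptr;
  }

  // deleting a null pointer is harmless, so moved-from objects are safe
  ~MyClass() { delete mData; }

private:
  std::vector<int>* mData;
};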
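To observe what the move constructor does, the following sketch uses a made-up class Buffer with the same ownership scheme as MyClass, plus an accessor data() that exists only for this illustration. The function name move_transfers_ownership is an assumption as well. std::move is explained later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class (illustration only): owns a heap-allocated vector.
class Buffer
{
public:
    explicit Buffer(std::size_t n) : mData(new std::vector<int>(n)) {}

    // Move constructor: take over the pointer and empty the source.
    Buffer(Buffer&& source) noexcept : mData(source.mData)
    {
        source.mData = nullptr;
    }

    ~Buffer() { delete mData; }   // deleting a null pointer does nothing

    // Accessor added only so the sketch can inspect the pointer.
    const std::vector<int>* data() const { return mData; }

private:
    std::vector<int>* mData;
};

inline std::size_t move_transfers_ownership()
{
    Buffer first(3);
    const std::vector<int>* address = first.data();

    Buffer second(std::move(first));    // selects the move constructor

    assert(first.data() == nullptr);    // the source was emptied
    assert(second.data() == address);   // same vector, not a copy

    return second.data()->size();
}
```

The vector is allocated once and freed once, by the destructor of second. The destructor of first only sees a null pointer.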

 

To understand the behavior of the move constructor we want to look at a second example. We will now implement the example from above with the folder containing a sheet of paper. So we implement a folder class containing a vector with strings. In case of the move constructor we want to move the resources from one object to the other. Furthermore I have implemented a second constructor which gets the vector as input parameter. This initialization constructor will also use an rvalue reference to the source data.

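#include <iostream>
#include <string>
#include <utility>
#include <vector>

class Folder
{
public:
  Folder() = default;

  // move constructor: take over the vector of the source folder
  Folder(Folder&& source) noexcept
    : mData(std::move(source.mData))
  {
  }

  // initialization constructor: take over the passed vector
  Folder(std::vector<std::string>&& data)
    : mData(std::move(data))
  {
  }

  void ShowSize() const
  {
    std::cout << mData.size() << std::endl;
  }

private:
  std::vector<std::string> mData;
};


int main()
{
  std::vector<std::string> data;
  data.push_back("abc");

  Folder original(std::move(data));
  Folder copy(std::move(original));

  std::cout << data.size() << std::endl;  // size of the moved-from vector
  original.ShowSize();                    // size of the moved-from folder
  copy.ShowSize();                        // size of the new folder

  return 0;
}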

 

If we execute the application, the output shows the sizes of the different vectors. The initial vector and the one inside the first folder are empty, and only the new folder contains the resources. That’s because the move constructor of the vector moves the resources and resets the original ones.

In the initializer lists of the constructors and in the constructor calls you will find a function not explained so far: std::move. So let us look at this function next.

 

std::move

The function std::move enables you to create an rvalue reference to an existing object. Alternatively, you can use the static_cast keyword to cast an lvalue to an rvalue reference: static_cast<T&&>(mySourceObject), where T is the type of the object. Note that std::move does not move anything by itself; it only performs this cast.

But why do we have to use this function? Let us start with the constructor call from the example above: Folder copy(std::move(original));
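The following sketch compares both spellings. The alias Strings and the function name cast_and_move_are_equivalent are assumptions for this example. It relies on the fact that a std::vector is empty after another vector has been move-constructed from it.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Strings = std::vector<std::string>;

inline bool cast_and_move_are_equivalent()
{
    Strings a(2, "x");
    Strings b(2, "x");
    Strings c(2, "x");

    // Both expressions turn the lvalue into an rvalue reference,
    // so both select the move constructor of the vector.
    Strings fromMove(std::move(a));
    Strings fromCast(static_cast<Strings&&>(b));
    assert(fromMove.size() == 2 && fromCast.size() == 2);

    // std::move alone moves nothing: ref is just another name for c.
    Strings&& ref = std::move(c);
    assert(&ref == &c && c.size() == 2);

    // The two sources were emptied by the move constructor.
    return a.empty() && b.empty();
}
```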

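The original object we want to copy is an lvalue. Therefore, if we pass this object directly, the copy constructor is selected. Because Folder declares a move constructor but no copy constructor, its copy constructor is implicitly deleted, so this call would not even compile. By using std::move we get an rvalue reference to the object and can pass it to the move constructor. Within the constructor we initialize the vector. Here we have to follow the same principle. Although the parameter is declared as an rvalue reference, it has a name and is therefore an lvalue inside the constructor. If we pass it as it is, a copy of the vector is created. But if we convert it to an rvalue reference with std::move, the move constructor of the vector is called.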
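This principle can be demonstrated with a short sketch. Payload, CopyingHolder and MovingHolder are made-up types for this illustration. Payload records which of its constructors was used, and the two holder classes differ only in whether the constructor applies std::move to its parameter.

```cpp
#include <cassert>
#include <utility>

// Hypothetical member type that records how it was constructed.
struct Payload
{
    bool copied = false;
    bool moved  = false;

    Payload() = default;
    Payload(const Payload&) : copied(true) {}
    Payload(Payload&&) noexcept : moved(true) {}
};

// Without std::move: 'data' has a name, so it is an lvalue and is copied.
struct CopyingHolder
{
    Payload mData;
    explicit CopyingHolder(Payload&& data) : mData(data) {}
};

// With std::move: the move constructor of the member is selected.
struct MovingHolder
{
    Payload mData;
    explicit MovingHolder(Payload&& data) : mData(std::move(data)) {}
};

inline bool named_rvalue_reference_is_lvalue()
{
    CopyingHolder c(Payload{});
    MovingHolder  m(Payload{});

    assert(c.mData.copied && !c.mData.moved);
    return m.mData.moved && !m.mData.copied;
}
```

Both constructors receive a temporary, but only MovingHolder passes it on as an rvalue. That is why the constructors of Folder wrap their parameters in std::move.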

 

Summary

Understanding the concept of rvalues and rvalue references will allow you to create and use move constructors. These move constructors can help you improve the performance of your applications by eliminating the need for unnecessary memory allocations and copy operations.
