
Modern C++ • Move Semantics

Generally speaking, move semantics and copy semantics are two paradigms for how one object acquires resources from another. Move semantics refers to programming techniques that allow resources to be transferred between objects rather than copied every time they’re acquired. Copy semantics, on the other hand, is based on making copies of the needed resources, which can cause problems when full deep copies are performed. Suppose that we have a class Object that needs to acquire a resource, represented in this example by a vector, although it could be something else. The standard way to do this is to implement a copy constructor and a copy assignment operator, as in Source 1, following RAII principles so that we don’t incur nasty side effects.


// Old C++

#include <algorithm>
#include <vector>

class Object {

	// The managed resource
	std::vector<int> resource;

public:
	Object() {}

	Object(const Object& o)
	{
		// The following code is equivalent to this->resource = o.resource;
		// it is written out to make explicit that the resource is acquired
		// by copy, with a (re)allocation if necessary.

		this->resource.resize(o.resource.size());

		std::copy(o.resource.begin(),
		          o.resource.end(),
		          this->resource.begin());
	}

	Object& operator=(const Object& o)
	{
		this->resource = o.resource;
		return *this;
	}
};

int main()
{
	Object o1;

	// o1 acquires its resource
	// ...

	// o2 acquires the resource from o1 by copy
	Object o2 = o1;

	return 0;
}

The resource is acquired from the source object by creating a full copy, possibly preceded by a (re)allocation. The source object can still be used to refer to the resource it owns. This approach is fine if multiple instances of the same resource can be created and doing so does not pose performance issues. But consider now the following code


Object o1 = some_process();

where the object is created as the result of running some process and, with copy semantics only, its resources are acquired by copy. The some_process() function returns an Object instance, a temporary anonymous object that is used only to initialize o1 and is destroyed once that statement has executed. (Compilers are often able to elide this copy, and since C++17 they are required to in this situation, but before C++11 the language offered no way to express the transfer itself.) If the process creates resources with a big memory footprint, building this data only to copy it and then discard it is inefficient, especially if it happens frequently. Another source of problems is a resource that is scarce or limited (file handles, sockets, DB connections, etc.) or unique (a singleton, an external device, a video/audio stream, etc.).
In such cases it would be more convenient to move ownership of the resource from the source object to the destination object, provided the source object no longer needs it. In code it would look something like the following:


class Object {

    std::vector<int> resource;

public:
    Object() {}

    Object(const Object& o)
    {
           this->resource.resize( o.resource.size() );

           std::copy(o.resource.begin(),
                     o.resource.end(), 
                     this->resource.begin());
    }

    Object& operator=(const Object& o)
    {
           this->resource = o.resource;
           return *this;
    }

    // A convenient method to move resource data.
    // NOTE: this is pseudocode. std::vector has no such member
    // functions; they are here only to demonstrate the idea.
    void move_from(Object& o)
    {
           // Delete the old data, if any
           this->resource.delete();
           // Move the data from the source to this object
           this->resource.data( o.resource.data() );
           // Release the resource.
           o.resource.release();
    }

};

In the above code we have added a supplementary move_from(Object&) method that does the following:

1) deletes the current resource, if any (deallocate its memory)
2) moves the resource from the source object to this object (copy its memory location)
3) releases the resource in the source object (implementation-dependent)

After the resource is moved, the source object can no longer be used to access it and should be left in an “empty” but valid state so that it can be safely destroyed. In the above example, the release() method takes care of invalidating the resource in the source object, for example by setting the data pointer to nullptr, but it may involve other operations needed to leave the source object in a valid empty state. With this method, every time we want to acquire ownership of the resources we could write something like this:


o1.move_from(o2);

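// or, ideally (note that a temporary cannot bind to the non-const
// reference Object&, so standard C++ rejects this second call)

o1.move_from( some_process() );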
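
The three steps of move_from() can be made concrete with a class that manages a raw heap buffer directly instead of a std::vector. The following is only a minimal sketch under that assumption: Buffer, data_, size_ and run_move_from_example() are hypothetical names that do not appear in the original listing, and copying is disabled to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class that owns a raw heap buffer, so that the three steps
// of move_from() can be written out by hand.
class Buffer {
    int*        data_ = nullptr;
    std::size_t size_ = 0;

public:
    Buffer() = default;
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }

    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    void move_from(Buffer& other)
    {
        if (this == &other) return;   // moving from ourselves is a no-op
        delete[] data_;               // 1) delete the current resource
        data_ = other.data_;          // 2) take over the source's memory
        size_ = other.size_;
        other.data_ = nullptr;        // 3) leave the source empty but valid
        other.size_ = 0;
    }

    std::size_t size() const { return size_; }
    const int*  data() const { return data_; }
};

// Moves a 4-element buffer into an empty one and checks that the memory
// block itself changed hands rather than being copied.
void run_move_from_example()
{
    Buffer a(4);
    Buffer b;
    const int* original = a.data();

    b.move_from(a);

    assert(b.data() == original && b.size() == 4);
    assert(a.data() == nullptr && a.size() == 0);
}
```

Because the source is reset to nullptr, its destructor can still run safely: delete[] on a null pointer does nothing, so the buffer is freed exactly once, by its new owner.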

It would also be nice to have a mechanism that allows us to do this at object construction time, so that we can follow the RAII principle. We could do it in the copy constructor, but its purpose (as the name itself implies) is not to implement move semantics. What is needed is a way to automatically distinguish the cases that call for copy semantics from those that call for move semantics, using ordinary language facilities such as overloading. C++11 introduced such a mechanism natively: rvalue references.

To understand how it works we need to understand the concepts of rvalues and lvalues. These concepts have changed somewhat over time, and newer C++ standards have refined them by introducing further value categories (xvalues, prvalues and glvalues). In the broad sense, an lvalue is an expression that designates an object with an identity, typically one that can be referenced by its name and assigned to. An rvalue is the result of an expression, such as the value returned by an arithmetic operator, a numeric literal, or the temporary object returned by a function call, which is anonymous and cannot be directly referenced by name. Variables are the most typical example of lvalues. For example, in the following code


Object o1;
Object o2 = make_object(); //<- returns an Object

o1 is an lvalue since it can be referenced by its name and you can assign values to it (provided it implements an assignment operation). On the other hand, the Object produced by make_object() is an rvalue, a temporary anonymous one. Note that although you can still write something like make_object() = ... and assign to it, the object won’t outlive the line of code where it’s created, since it cannot be referenced by name. In fact, another way to understand lvalues and rvalues is by their lifetime: lvalues, being named values, have a lifetime defined by the scope in which they’re declared, while a temporary normally lives only until the end of the full expression that creates it. For a more in-depth explanation, have a look at the C++ Standard drafts, where you can find edge cases and other esoteric details.
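
The distinction can be observed directly by letting overload resolution report which kind of expression it was given. This is only an illustrative sketch: category(), make_string() and run_category_example() are hypothetical helpers, and the && parameter is the rvalue reference syntax that the text goes on to introduce, used here purely as a probe.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers: overload resolution picks one of these according
// to the value category of the argument expression.
const char* category(const std::string&) { return "lvalue"; }
const char* category(std::string&&)      { return "rvalue"; }

// Returns a temporary, anonymous std::string.
std::string make_string() { return "temporary"; }

void run_category_example()
{
    std::string named = "named";

    // A variable is an lvalue: it has a name and outlives the expression.
    assert(std::string(category(named)) == "lvalue");

    // The result of a function call returning by value is an rvalue.
    assert(std::string(category(make_string())) == "rvalue");

    // So is the result of an operator that builds a new object.
    assert(std::string(category(named + "!")) == "rvalue");
}
```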

Modern C++ provides a mechanism to handle rvalues by introducing the rvalue reference T&& and the std::move() function. Used in combination, these two features make it possible to capture rvalues (like temporary objects), or to turn any object into an rvalue, in order to move the resources they own rather than copy them. Returning to our previous example, we can forget about the imaginary move_from() method and extend the Object class with the new C++ features as follows:


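// Modern C++

#include <utility>
#include <vector>

class Object {

     std::vector<int> resource;

public:

     Object() = default;

     // Copy constructor: duplicates the resource of the source
     Object(const Object& o)
     {
          this->resource = o.resource;
     }

     // Move constructor: takes over the resource of an rvalue source
     Object(Object&& o)
     {
          this->resource = std::move( o.resource );
     }

     // Copy assignment operator
     Object& operator=(const Object& o)
     {
          this->resource = o.resource;
          return *this;
     }

     // Move assignment operator
     Object& operator=(Object&& o)
     {
          this->resource = std::move( o.resource );
          return *this;
     }
};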

where Object(Object&& o) is called the move constructor and operator=(Object&& o) the move assignment operator. Their parameters are rvalue references to an Object, and together with the std::move() function they allow move semantics to be implemented. Specifically, if we write the following code

Object o1;
Object o2 = o1;
Object o3 = some_process(); // returns an rvalue Object

the initialization on the second line will invoke the copy constructor, since o1 is an lvalue and binds to the lvalue reference parameter const Object&. On the other hand, the initialization on the third line will use the move constructor because the expression on the right side is an rvalue (it generates a temporary anonymous object), unless the compiler elides the move altogether and constructs o3 in place, which C++17 makes mandatory here. In the move constructor we can then move the resource from the source object by using std::move(), or implement a set of operations like those in our imaginary move_from(Object&) method if the object manages the resource directly.
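
Which constructor gets chosen can be checked with an instrumented class. The sketch below is an illustration only: Tracked, Origin, make_tracked() and run_overload_example() are hypothetical names introduced for this purpose. For the function result it deliberately asserts only that no copy took place, because the compiler may either move the object or elide the operation entirely.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical instrumented type that records how it was constructed.
struct Tracked {
    enum class Origin { Default, Copied, Moved };

    Origin origin = Origin::Default;
    std::vector<int> resource;

    Tracked() = default;
    Tracked(const Tracked& o)
        : origin(Origin::Copied), resource(o.resource) {}
    Tracked(Tracked&& o)
        : origin(Origin::Moved), resource(std::move(o.resource)) {}
};

// Builds an object holding three elements and returns it by value.
Tracked make_tracked()
{
    Tracked t;
    t.resource.assign(3, 7);
    return t;
}

void run_overload_example()
{
    Tracked a;
    a.resource = {1, 2, 3};

    Tracked b = a;                    // a is an lvalue: copy constructor
    assert(b.origin == Tracked::Origin::Copied);
    assert(a.resource.size() == 3);   // the source keeps its data

    Tracked c = make_tracked();       // rvalue: moved or elided, never copied
    assert(c.origin != Tracked::Origin::Copied);
    assert(c.resource.size() == 3);

    Tracked d = std::move(a);         // lvalue cast to rvalue: move constructor
    assert(d.origin == Tracked::Origin::Moved);
    assert(d.resource.size() == 3);
}
```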

The std::move() function does not actually move anything but merely “signals” that the resource (a vector in this case) can be moved. It is up to the moved resource to implement the necessary logic to actually move the data. Under the hood, std::move() simply performs a (static) cast on the object to an rvalue reference so that it will trigger the appropriate overloaded move method. So, in the above example the statement std::move( o.resource ); simply returns an rvalue reference to the source vector and triggers the move assignment operator of std::vector, where a set of operations similar to the ones in our imaginary move_from() method are performed to move the data.
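
That std::move() is only a cast can be seen in a few lines. The sketch below, with the hypothetical function run_cast_example(), first binds the result of std::move() to a reference and shows that nothing happens to the vector, then hands the rvalue to the vector's move constructor. The final check on the storage address reflects how standard library implementations behave in practice with the default allocator.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Applied to an lvalue of type T, std::move yields a T&&.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::vector<int>&>())),
                 std::vector<int>&&>::value,
    "std::move yields an rvalue reference");

void run_cast_example()
{
    std::vector<int> source = {1, 2, 3};

    // The cast alone moves nothing: no constructor or assignment operator
    // consumes the rvalue reference here.
    std::vector<int>&& ref = std::move(source);
    assert(source.size() == 3);
    assert(&ref == &source);      // ref is just another name for source

    // The data is transferred only when a move operation receives the rvalue.
    const int* storage = source.data();
    std::vector<int> target = std::move(source);
    assert(target.size() == 3);
    assert(target.data() == storage);   // same heap block, no element copied
}
```

After the last statement, source is still a valid object and can be assigned to or destroyed, but its contents should no longer be relied upon.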

All standard library containers implement move semantics to optimize resource acquisition, as illustrated by the following code:

#include <deque>
#include <list>
#include <map>
#include <vector>

template<typename C>
C get_container() {
	return C();
}

int main()
{
	std::vector<int> v1, v2;
	std::list<int> l1, l2;
	std::map<int, int> m1, m2;
	std::deque<int> d1, d2;

	v1 = v2;                                    // calls vector& operator=(const vector& rhs)
	v1 = get_container<std::vector<int>>();     // calls vector& operator=(vector&& rhs)

	l1 = l2;                                    // calls list& operator=(const list& rhs)
	l1 = get_container<std::list<int>>();       // calls list& operator=(list&& rhs)

	m1 = m2;                                    // calls map& operator=(const map& rhs)
	m1 = get_container<std::map<int,int>>();    // calls map& operator=(map&& rhs)

	d1 = d2;                                    // calls deque& operator=(const deque& rhs)
	d1 = get_container<std::deque<int>>();      // calls deque& operator=(deque&& rhs)

	// and others ...

	return 0;
}
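
The comments in the listing above state which overload is selected; one way to observe the effect is to count element copies. The following is a sketch under that idea only: Probe, make_container(), count_copies() and run_container_example() are hypothetical names, and std::map is left out for brevity.

```cpp
#include <cassert>
#include <deque>
#include <list>
#include <utility>
#include <vector>

// Hypothetical element type that counts how many times it is copied.
struct Probe {
    static int copies;

    Probe() = default;
    Probe(const Probe&) { ++copies; }
    Probe& operator=(const Probe&) { ++copies; return *this; }
    Probe(Probe&&) noexcept = default;
    Probe& operator=(Probe&&) noexcept = default;
};
int Probe::copies = 0;

// Returns a temporary container holding three default-constructed elements.
template <typename C>
C make_container() { return C(3); }

// Returns the number of element copies caused by one copy assignment
// (first) and by one move assignment from a temporary (second).
template <typename C>
std::pair<int, int> count_copies()
{
    C source(3), target;

    Probe::copies = 0;
    target = source;                  // copy assignment: elements are copied
    const int by_copy = Probe::copies;

    Probe::copies = 0;
    target = make_container<C>();     // move assignment: no element is copied
    const int by_move = Probe::copies;

    return {by_copy, by_move};
}

void run_container_example()
{
    assert(count_copies<std::vector<Probe>>().first > 0);
    assert(count_copies<std::vector<Probe>>().second == 0);
    assert(count_copies<std::list<Probe>>().second == 0);
    assert(count_copies<std::deque<Probe>>().second == 0);
}
```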

That’s the basics 🙂
