
Modern C++ • Structured Binding

Probably one of the fanciest terms used in C++, structured binding is a feature introduced with C++17 whose purpose is to decompose objects into their constituent parts. Basically, this feature lets us take a “decomposable object” (we’ll define this more precisely later on) and bind each of its elements to a declared name (variable), as follows

a1, a2, ..., an = s

where the decomposable object s can be either an l-value or the result of an expression E producing an n-element object E -> {e1, e2, ..., en} (i.e. an r-value). This is something that can be done out of the box in many languages, for example in Python using sequence unpacking like so

# Python's "structured binding" (sequence unpacking)
 
s = [1, 0.5, "hello"]
a1, a2, a3 = s         # a1 is 'int', a2 is 'float', a3 is 'str'
print(a1, a2, a3)      # 1 0.5 hello

With old C++ this problem could only be solved by implementing custom solutions or resorting to external libraries (e.g. Boost.Tuple and its boost::tie). C++11 brought the tie function into the standard library as std::tie, so one could write something similar to the above as follows

std::tuple<int, double> s{1, 0.5};
int a{};
double b{};
std::tie(a, b) = s;

which decomposes a tuple into its elements and ties them to existing variables. However, that solution isn’t so clean, as it requires declaring the binding variables beforehand in separate statements and only works for std::tuple and std::pair. C++17 has introduced a cleaner and more general way to achieve object decomposition by including a new construct in the language called structured binding.
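To make the limitation concrete, here is a minimal sketch of the std::tie style applied to a function result. The names make_record and sum_first_two are made up for illustration; std::ignore is the standard placeholder for elements we do not need.

```cpp
#include <string>
#include <tuple>

// Hypothetical helper that returns three values at once
std::tuple<int, double, std::string> make_record() {
    return std::make_tuple(1, 0.5, std::string{"hello"});
}

// Pre-C++17 style: the targets must exist before they can be tied
double sum_first_two() {
    int id{};
    double ratio{};
    // std::ignore skips the third element of the tuple
    std::tie(id, ratio, std::ignore) = make_record();
    return id + ratio;
}
```

Note how id and ratio have to be declared (and default-initialized) before the assignment, which is exactly the noise that structured binding removes.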

 

The modern C++ solution: structured binding

Structured binding (SB for brevity) allows breaking down objects using a language construct, without resorting to external helpers. A structured binding is a C++ declaration of the following form (‘*’ indicates optional clauses)

<const|volatile|static>* auto <&|&&>* [<identifier-list>] = <expression>

where expression must be (or produce) a “decomposable object”, that is an object that meets one of the following conditions

  • A structure/class with all public non-static data members
  • A tuple-like object
  • An array

Let’s start with the first case, a user-defined struct/class with all public non-static data members. In this situation the decomposition is straightforward and works out of the box as all the necessary work is done automatically by the compiler

struct S {
   int e1;
   double e2;
};

S s{1, 0.5};
auto [a, b] = s;
 
// or also using braced-list-initialization
//auto [a, b]{s};
 
std::cout << "a=" << a << std::endl;  // a=1
std::cout << "b=" << b << std::endl;  // b=0.5

What really happens under the hood is that the compiler creates an “auxiliary” variable that holds (or refers to) the object s, declared with the same qualifiers indicated in the SB clause (i.e. const, &, &&, etc.). It then introduces one binding name for each decomposable element of s (i.e. for each public data member), in order of declaration and with the same type, each one referring to the corresponding member of the auxiliary variable, something similar to the code below

S _s_aux = s;
int& a = _s_aux.e1;
double& b = _s_aux.e2;

Note that s can be either an l-value, as in the above example, or an r-value such as the result of a function. Also, depending on how the SB is qualified, either a copy/move or a reference binding is performed to get the object, so pay attention to this detail. In our example, a copy of s is made as the SB is not ref-qualified.
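The difference between the copying and the referencing form can be sketched as follows. The two function names are hypothetical and exist only to make the effect observable.

```cpp
struct S {
    int e1;
    double e2;
};

// Without a ref-qualifier the bindings name members of a hidden copy of 's'
int modify_through_copy(S& s) {
    auto [a, b] = s;
    a = 42;          // changes the copy only
    return s.e1;     // 's' itself is untouched
}

// With '&' the hidden variable is a reference, so the bindings name members of 's'
int modify_through_reference(S& s) {
    auto& [a, b] = s;
    a = 42;          // changes s.e1
    return s.e1;
}
```

Called on an object initialized as S{1, 0.5}, the first function returns 1 and the second returns 42.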

This method, however, does not work as expected for hierarchical structures. For example, if we derive S from a class A, the decomposition cannot be done

struct A {
   int a1;
};

struct S : A {
   int e1;
   double e2;
};

S s{{1}, 2, 0.5};
auto [a, b, c] = s;    // ERROR: cannot decompose S

The three accessible members of S cannot be bound to the names in the identifier list. To figure out why, let’s see what the Standard has to say in this regard. In section 11.5.4 (the structured binding clause, [dcl.struct.bind]) it is stated that the non-static data members of S (called E in the Standard)

…shall be public direct members of E or of the same unambiguous public base class of E… and the number of elements in the identifier-list shall be equal to the number of non-static data members of E.

So, in order for the binding to work, the non-static data members of S must be either all in S or all in the same base class of S. Which, as I read it, means that all of the bound members must be declared either in A or in S (which is treated as a base class of itself). That is, the following will work

struct A {
   int a1;
   int e1;
   double e2;
};

struct S : A {
};

S s{1, 2, 0.5};
auto [a, b, c] = s;    // OK

Maybe the reason for this is not immediately clear from the Standard, but it makes sense if we consider that, to create a 1:1 mapping between the identifier list and the decomposed object, it is necessary to establish an absolute order. With inheritance it is not possible to establish an unambiguous order if the elements are not all in the same class.

The auto specifier may be used with reference qualifiers to allow binding to both l-values and r-values

struct S {
   int e1;
   double e2;
};
 
S foo() { return S{0, 1.5}; }
 
S s{1, 0.5};
 
auto& [a, b] = s;            // OK, binding 's' to a reference
auto& [c, d] = foo();        // ERROR: a non-const l-value reference cannot bind to an r-value
const auto& [e, f] = foo();  // OK, binding an r-value to a const&
auto&& [g, h] = foo();       // OK, binding an r-value to an &&
  
++a;   // modifies s.e1
++b;   // modifies s.e2
++e;   // ERROR: 'e' is non-mutable

It is important to remember that the qualifiers & and && are applied to the auxiliary variable, not to the binding variables, which behave as references in any case and whose type depends on the object’s type.
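One way to see that the qualifiers belong to the auxiliary variable is to inspect the bindings with decltype, which reports the member type (with the cv-qualification of the auxiliary variable) rather than a reference type. The sketch below assumes C++17; check_binding_types is a hypothetical name used only for this illustration.

```cpp
#include <type_traits>

struct S {
    int e1;
    double e2;
};

S foo() { return S{0, 1.5}; }

bool check_binding_types() {
    S s{1, 0.5};

    auto& [a, b] = s;             // auxiliary variable is an S&
    const auto& [e, f] = foo();   // auxiliary variable is a const S& (temporary kept alive)
    auto&& [g, h] = foo();        // auxiliary variable is an S&&

    // The bindings report the member type, never 'int&' or 'int&&'
    static_assert(std::is_same_v<decltype(a), int>);
    static_assert(std::is_same_v<decltype(e), const int>);
    static_assert(std::is_same_v<decltype(f), const double>);
    static_assert(std::is_same_v<decltype(g), int>);

    g = 7;    // fine: the temporary bound through '&&' is not const
    a = g;    // writes through to s.e1
    return s.e1 == 7;
}
```

The const in decltype(e) comes from the const-qualified auxiliary variable, which is why ++e is rejected in the example above.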

Decomposing arrays works the same as with user-defined types with all public members. Each element of the array is bound to the corresponding binding variable in the identifier-list (in order). Therefore, the number of elements in the array must be equal to the number of binding variables

int s[2]{1, 2};

auto [a, b] = s;
auto& [c, d] = s;

c++;    // modifies s[0]
d++;    // modifies s[1]
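One detail worth spelling out is that the non-reference form copies the whole array, element by element, even though built-in arrays cannot normally be copied by assignment. A minimal sketch (sum_after_edits is a hypothetical name):

```cpp
int sum_after_edits() {
    int s[2]{1, 2};

    auto [a, b] = s;     // bindings name the elements of a hidden copy of 's'
    a = 10;
    b = 20;              // 's' is still {1, 2}

    auto& [c, d] = s;    // bindings name the elements of 's' itself
    c++;
    d++;                 // 's' is now {2, 3}

    // auto [x, y, z] = s;   // would not compile: 3 names for 2 elements
    return s[0] + s[1];
}
```

The function returns 5, since only the increments made through the reference form reach the original array.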

 

Tuple-like objects

If any of the data members of S is not accessible then the condition to decompose it as in the above cases is not met and structured binding does not work out of the box. For example, the following code will fail to compile

struct S {
   int e1;
   std::string e2;
 private:
   double e3;
};

S s{};
auto [a, b] = s;   // ERROR: cannot decompose S ('e3' is inaccessible)

In this case, in order for the object to be decomposable it must be a “tuple-like object”. A tuple-like object is basically any object that implements the “tuple-like binding protocol”, which is a standard specification used to generalize structured binding. The protocol requires the following steps to be followed

1. Declare the number of decomposable elements of the object
2. Declare the types of each decomposable data member
3. Define a getter function/method to access the decomposable data members

1. The number of elements is declared by specializing the std::tuple_size class template inside the std namespace. The specialization exposes the number of decomposable elements as a constant named value, which must be equal to the number of names in the identifier-list (2 in our example)

template<>
struct tuple_size<S> {
   static constexpr std::size_t value {2};
};
 
// or by deriving from std::integral_constant
 
template<>
struct tuple_size<S> : std::integral_constant<std::size_t, 2> {};

2. The type of each decomposable element is designated by specializing the std::tuple_element<i, T> template class, where i is the (0-based) index of the element. This class provides a type alias named type indicating the type of the element

template<>
struct tuple_element<0, S> {
   using type = int;
};
template<>
struct tuple_element<1, S> {
   using type = std::string;
};

3. Define a getter function get<i>(T) in the same namespace as S or, alternatively, a member function template S::get<i>(). The compiler looks for the member function first and, if none is found, falls back to the free function (found through argument-dependent lookup).

template<std::size_t i>
auto get(const S& s) {  // 'auto' deduces to std::tuple_element<i, S>::type
   if constexpr (i == 0) return s.e1;
   if constexpr (i == 1) return s.e2;
}

Under these conditions, the above example will now work. Following is the complete code of the tuple-like binding protocol applied to that example

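#include <cstddef>      // std::size_t
#include <cstdlib>      // EXIT_SUCCESS
#include <iostream>
#include <string>
#include <tuple>        // std::tuple_size, std::tuple_element
#include <type_traits>  // std::integral_constant
 
struct S {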
   int e1         {1};
   std::string e2 {"Hello!"};
private:
   double e3      {1.5};
};
 
// Tuple-like binding protocol definition
namespace std {
 
   // Define the number of decomposable (accessible) elements
   // by specializing the 'tuple_size' template class
   template<>
   struct tuple_size<S>
    : std::integral_constant<std::size_t, 2> { };
 
   // Define the types of decomposable elements by specializing
   // the 'tuple_element' template class (indices are 0-based)
   template<>
   struct tuple_element<0, S> {
       using type = int;
   };
   template<>
   struct tuple_element<1, S> {
       using type = std::string;
   };
}
 
// Define a 'get' function to access the decomposable elements
template<std::size_t i>
auto get(const S& s) {
   if constexpr (i == 0) {
       return s.e1;
   }
   else if constexpr (i == 1) {
       return s.e2;
   }
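   else {
       // Any other index is rejected at compile time
       static_assert(i < 2, "S exposes only two decomposable elements");
   }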
}
 
int main(){
 
    S s{};
 
    auto [a, b] = s;
 
    std::cout << "a=" << a << std::endl;
    std::cout << "b=" << b << std::endl;
 
    return EXIT_SUCCESS;    
}
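As a variation, the getter can be a member function template that returns references, so that a ref-qualified binding writes through to the private members. This is only a sketch under that assumption; the Person class and the birthday function are hypothetical and not part of the example above.

```cpp
#include <cstddef>
#include <string>
#include <tuple>        // std::tuple_size, std::tuple_element
#include <type_traits>  // std::integral_constant
#include <utility>

class Person {
public:
    Person(std::string name, int age) : name_{std::move(name)}, age_{age} {}

    // Member 'get': looked up before any free 'get' function
    template<std::size_t i>
    auto& get() {
        static_assert(i < 2, "Person exposes only two elements");
        if constexpr (i == 0) return name_;
        else                  return age_;
    }

    // Overload used when the decomposed object is const
    template<std::size_t i>
    const auto& get() const {
        static_assert(i < 2, "Person exposes only two elements");
        if constexpr (i == 0) return name_;
        else                  return age_;
    }

    int age() const { return age_; }

private:
    std::string name_;
    int age_;
};

namespace std {
    template<>
    struct tuple_size<Person> : integral_constant<size_t, 2> {};
    template<>
    struct tuple_element<0, Person> { using type = string; };
    template<>
    struct tuple_element<1, Person> { using type = int; };
}

int birthday(Person& p) {
    auto& [name, age] = p;   // uses p.get<0>() and p.get<1>()
    ++age;                   // modifies the private member through the binding
    return p.age();
}
```

Because the getters return references, the binding age refers to the member itself; with the by-value getter of the previous example the bindings would refer to copies instead.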

A quite clumsy process, in the good spirit of C++.

Conclusion

All in all, structured binding brings several improvements over the old std::tie approach. Firstly, it is undoubtedly cleaner, because the declare-and-bind syntax doesn’t require declaring the binding variables beforehand. Secondly, the types are automatically deduced with a single auto specifier. And, most importantly, we can decompose objects other than tuples (as long as some conditions are met).
As for the use cases of this feature, honestly, it’s mainly about improving the readability when returning multiple values from functions that need to be captured individually. But I wouldn’t be surprised if there are more esoteric use cases out there that I’m not aware of.
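A short sketch of the multiple-return-values use case, relying on the fact that std::pair is tuple-like out of the box. The function names are hypothetical; the map examples rely on std::map storing its elements as std::pair and on std::map::insert returning a pair of iterator and bool.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper returning two results at once
std::pair<int, int> quotient_remainder(int n, int d) {
    return {n / d, n % d};
}

// Each map element is a std::pair<const Key, Value>
int total_stock(const std::map<std::string, int>& stock) {
    int total{0};
    for (const auto& [item, quantity] : stock) {
        total += quantity;
    }
    return total;
}

// Returns true only if the item was not already present
bool add_item(std::map<std::string, int>& stock,
              const std::string& item, int quantity) {
    auto [position, inserted] = stock.insert({item, quantity});
    return inserted;
}
```

A caller can then write auto [q, r] = quotient_remainder(17, 5); and get both results as named variables in a single statement.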
