Modern C++ • Lambda Expressions

Prior to C++11, the only way to use something resembling a closure in C++ was to rely on third-party libraries (such as Boost.Bind), roll your own workarounds (usually with function objects) or use proprietary compiler-specific extensions. I remember the first time I stumbled into a C++ closure, in the good old days when Borland (R.I.P.) added the __closure extension to their C++ compiler in order to implement the event-driven architecture of their RAD framework (the legendary Borland C++ Builder, the first real RAD C++ IDE in history). The implementation was limited, though, as it could only capture the context of a single object and not that of generic code blocks (the extension basically packed the this pointer and a function pointer into a 64-bit variable).

For those coming to C++ from scripting languages such as JavaScript, Python, PHP and the like, where closures are a native construct, the lack of this useful and powerful feature must have been really disheartening. Closures provide a clean mechanism to capture state and attach it to a block of code “on the fly” for deferred processing, which is very useful for building event-driven applications, for example. Native support that eliminates the need for clumsy workarounds is therefore a real benefit to the developer community.

Before delving into the details, let’s recall some useful concepts that will allow us to better understand the underlying mechanism. The terms closure and lambda expression (or lambda function) are often used interchangeably, which causes confusion. In reality they refer to two different, though related, things. A lambda expression is the lexical definition of an anonymous function, that is, a function that is not identified by a name, while a closure is the instantiation by the compiler of a lambda expression bound to its enclosing scope. Generally speaking, a closure is a programming technique to bind a block of code to the variables in the scope where the block is defined (a function with state). The set of variables bound to the lambda function forms the environment of the closure when the lambda function is instantiated. The following picture depicts the whole concept

A C++ lambda expression has the following syntax

[<captures>*](<params>)* <specs>* <exc>* <ret-type>* {<body>}

where the clauses are defined as follows (* denotes optional clauses)

<captures>        ::=  <capture-default> |
                       <capture-list> |
                       <capture-default>, <capture-list>
<capture-default> ::=  & | =
<capture-list>    ::=  <capture> | <capture-list>, <capture>
<capture>         ::=  variable | &variable | this | *this
<params>          ::=  param | <params>, param
<specs>           ::=  mutable | constexpr
<exc>             ::=  throw() | noexcept
<ret-type>        ::=  -> return-type
<body>            ::=  lambda-code

In the above expression, [] specifies the capture environment, that is, the variables captured from the enclosing scope. It can be a default “catch-all”, either [&] to capture all used variables by reference or [=] to capture them by value. It can be a list of specific variables captured by value or by reference [var1, &var2, var3, ..., this], including the this pointer. It can be a default catch-all followed by a list of specific variables. Or it can be empty [], indicating that no variables are captured. A list of parameters passed to the lambda function may optionally follow, along with the optional specifiers mutable|constexpr, an optional exception specification throw()|noexcept, an optional return type, and finally the body of the lambda function.

Let’s have a look at some example code

auto closure = []{ std::cout << "I'm a lambda function!" << std::endl; };
 
closure(); // prints "I'm a lambda function!"

The above code defines a lambda expression that does not capture anything, has no parameters and does not return a value. This lambda expression generates a closure object, which is an rvalue of an unspecified unique type generated at the point of occurrence; here that type is deduced with the auto keyword. The closure can then be used like a normal function, using the function call operator to invoke the lambda code.
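To make the “unique type” remark more concrete, here is a minimal sketch (the names first, second, copy_of_first and call_both are made up for illustration). It checks at compile time that two textually identical lambda expressions still produce two different closure types, while a copy of a closure keeps the type of the original.

```cpp
#include <cassert>
#include <type_traits>

// Two lambda expressions with identical text...
auto first  = [] { return 42; };
auto second = [] { return 42; };

// ...still yield two distinct, compiler-generated closure types.
static_assert(!std::is_same<decltype(first), decltype(second)>::value,
              "each lambda expression has its own closure type");

// Copying a closure does not create a new type.
auto copy_of_first = first;
static_assert(std::is_same<decltype(first), decltype(copy_of_first)>::value,
              "a copy shares the closure type of the original");

// Hypothetical helper, only here so that the closures are actually called.
int call_both()
{
    assert(first() == second());  // same behavior, different types
    return first() + second();
}
```

This is also why auto (or a template parameter) is normally needed to store a closure: there is no type name we could write by hand.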

Technically, the closure is internally implemented using a function object along with other mechanisms to construct the environment, along the lines of the following code

class __lambda_1768387__
{
 public: 
  void operator()() const
  {
    std::cout << "I'm a lambda function!" << std::endl;
  }
};

// The class name must match the one declared above
__lambda_1768387__ __closure = __lambda_1768387__{};
__closure();

but all this verbose implementation is hidden behind a more concise native construct that’s way easier to use. This means that C++ closures are “callables” and can be wrapped into generic function objects, like in the following code

// Create a function object that "wraps" the lambda
std::function<void()> f = []{ std::cout << "I'm a lambda function!" << std::endl; };

f(); // prints "I'm a lambda function!"

This is, however, an indirection used to treat the lambda as a generic object that can be called using function-style syntax and one of the few ways to work around the fact that the type of a lambda is only known to the compiler. Let’s see some more examples

int x = 1, y = 2, z = 3;

// A lambda with a default capture by reference returning an int
auto f = [&]() -> int { return x + y + z; };

std::cout << f() << std::endl; // prints 6

x = 2;

std::cout << f() << std::endl; // prints 7

In the above code we define a lambda function that captures all the variables in the enclosing scope by reference and returns an integer (the return type can be omitted if the compiler can unambiguously infer it). As can be seen, if any of the captured variables is modified the changes will be reflected in the closure’s environment. This fact should serve as a stark reminder that closures are generally not isolated self-contained objects. Variables captured by reference introduce dependencies tied to the captured variables’ lifetime so one must be cautious when using them from within a lambda function as they may have been modified outside or even no longer exist.
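The lifetime caveat is easiest to see when a closure outlives the scope it was created in. The following is only a sketch with hypothetical names (make_reader_by_value, lifetime_demo): the by-value version is safe to return from a function, while the by-reference version is deliberately left commented out because calling the returned closure would be undefined behavior.

```cpp
#include <cassert>
#include <functional>

// Hypothetical factory: the closure owns its own copy of 'local',
// so it remains valid after this function has returned.
std::function<int()> make_reader_by_value()
{
    int local = 41;
    return [local] { return local + 1; };
}

// By-reference variant, commented out on purpose: 'local' is destroyed
// when the function returns, so the closure would hold a dangling
// reference and calling it would be undefined behavior.
//
// std::function<int()> make_reader_by_reference()
// {
//     int local = 41;
//     return [&local] { return local + 1; };
// }

void lifetime_demo()
{
    std::function<int()> reader = make_reader_by_value();
    assert(reader() == 42);  // reads the copy stored inside the closure
}
```

As a rule of thumb, by-reference captures are fine for closures that are used and discarded within the enclosing scope, while closures that escape it are usually safer with by-value captures, as in the next example.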

int x = 1, y = 2, z = 3;

// A lambda with a default capture by value returning an int
auto f = [=](){ return x + y + z; };

std::cout << f() << std::endl; // prints 6

x = 2;

std::cout << f() << std::endl; // prints 6

The code above is the same as the earlier by-reference example, but this time the scope’s variables are captured by value, so the later change to x is not seen by the closure. We can also provide a comma-separated list of specific variables to be captured (in any order), mixing by-reference and by-value captures, like in the following example

int x = 1, y = 2, z = 3;

// A lambda capturing x by value and y by reference
auto f = [x, &y](){ return x + y; };

std::cout << f() << std::endl; // prints 3

y = 5;

std::cout << f() << std::endl; // prints 6

Default captures can be used together with capture lists in the form [capture-default, capture-list] to specify that some variables must be captured differently from the default. Note that every variable in the capture list must use the opposite capture mode from the default: with [&, ...] the listed variables must be captured by value, and with [=, ...] they must be captured by reference.

int x = 1, y = 2, z = 3;

// Default capture by reference, except x and z which are captured by value
auto f = [&,x,z](){ return x + y + z; };

// Default capture by value, except z which is captured by reference
auto f2 = [=, &z](){ return x + y + z; };

// ERROR: x is captured by value, which matches the default capture
auto f3 = [=, x, &y](){ return x + y + z; };

Lambda functions can take a list of parameters, like any ordinary function. C++11 did not allow default arguments in lambda expressions, but since C++14 they are permitted.

int x = 1, y = 2, z = 3;

// Default capture by value with parameters
auto f = [=](int p, int d){ return std::pow(x + y + z, p) + d; };

// Since C++14 default parameters are supported
auto f2 = [=](int p, int d = 0){ return std::pow(x + y + z, p) + d; };

std::cout << f(3, 5) << std::endl;  // prints 221
std::cout << f2(3) << std::endl;    // prints 216

It is also possible to compose lambda functions by using them as parameters and return values. This technique allows working with so-called higher-order lambdas. Since the type of a lambda function is unspecified and cannot be written out by hand, a simple way to pass one around is to let auto deduce it or to wrap it in a std::function object. The code below illustrates this and the comments explain in detail what happens under the hood

int main()
{
    int x = 10;

     // This lambda captures the variables of the enclosing scope (main(){}),
     // that is 'x', which become part of its environment => Env(h) = {x}
     auto h = [=](int a){

          // This nested lambda captures the variables of its enclosing lambda,
          // that is 'a', and the outer scope (main(){}) => Env(nested) = {a} U {x}
          auto nested = [=]() -> int {
                return x + a + 1;  
          };
          return nested;
     };

     // Calls the lambda function h() and gets the returned lambda function
     // as a function object. Since the parameter 'a' is in g()'s environment
     // the value 3 will be used by g()
     auto g = h(3);

     // Prints the result (14)
     std::cout << g() << std::endl;

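     // This lambda captures nothing and accepts a lambda function in one of
     // its parameters (as a function object), returning an int. The parameter
     // is a const reference because the closure 'g' is converted to a
     // temporary std::function, which cannot bind to a non-const reference
     auto f = [](const std::function<int()>& g, int v) -> int {
          return g() - v;
     };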

     // Calls f() by passing g() and the result of executing g(), which is the
     // same as before since its environment hasn't been modified
     std::cout << f(g, g()) << std::endl; // prints 0

     return 0;
}
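Wrapping a closure in std::function is not the only option. As a hedged alternative sketch (apply_twice, make_adder and higher_order_demo are hypothetical names, not part of the example above), a function template can accept any callable and let the compiler deduce the closure type, which avoids the std::function indirection when the callable does not need to be stored.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: 'Callable' is deduced, so a closure can be passed
// directly without being converted to std::function.
template <typename Callable>
int apply_twice(Callable fn, int value)
{
    return fn(fn(value));
}

// Returning a closure through std::function, as in the example above.
std::function<int(int)> make_adder(int amount)
{
    return [amount](int v) { return v + amount; };
}

int higher_order_demo()
{
    auto add_three = make_adder(3);

    // A closure passed straight to the template: 5 -> 10 -> 20
    assert(apply_twice([](int v) { return v * 2; }, 5) == 20);

    // A std::function works as well: 10 -> 13 -> 16
    return apply_twice(add_three, 10);
}
```

The template version is typically the better fit for short-lived callables such as algorithm predicates, while std::function remains handy when closures of different types have to be stored in the same variable or container.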

Consider now the following code

int x = 1;

// ERROR
auto f = [x](){ x++; return x; };

If you try to compile this code, the compiler will complain about x not being modifiable. This is because the operator() of the generated closure’s class is declared const by default. To allow modification of variables captured by value within the lambda, the mutable keyword must be added to the lambda expression, as follows

int x = 1;

// OK
auto f = [x]() mutable { x++; return x; };
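It is worth stressing what mutable actually changes. In the sketch below (mutable_demo is a hypothetical function name), the increments apply to the copy of x stored inside the closure, that copy keeps its value between calls, and the variable in the enclosing scope is never touched.

```cpp
#include <cassert>

// Hypothetical demo function, introduced only for this sketch.
int mutable_demo()
{
    int x = 1;
    auto f = [x]() mutable { return ++x; };

    int first  = f();  // the closure's own copy goes from 1 to 2
    int second = f();  // the same copy goes from 2 to 3

    assert(first == 2);
    assert(second == 3);

    return x;          // the outer x was captured by value and is still 1
}
```

In other words, a mutable lambda behaves like a small object with private state, which is exactly what the generated function object is.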

Lambda functions can be called immediately without first assigning them to a variable by using a trailing function call operator (), like in the following code

// This lambda is called immediately when the code is executed ...
[] { std::cout << "I'm called immediately!" << std::endl; }();

// ... and this one too
[](int a, int b) { std::cout << "The result is " << a + b << std::endl; }(1, 2);

// ... and also this one
int x = 5;
[x](){ std::cout << "The capture is " << x << std::endl; }();
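One practical use of an immediately invoked lambda is the initialization of a const variable whose value requires some branching. The following sketch assumes a hypothetical describe() function and made-up status labels; the point is only that label can stay const even though several statements are needed to compute it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a numeric code to a label.
std::string describe(int code)
{
    // The lambda runs once, right here, and its result initializes
    // the const variable.
    const std::string label = [code]() -> std::string {
        if (code == 0) return "ok";
        if (code < 0)  return "error";
        return "warning";
    }();

    return label;
}
```

Without the lambda, label would have to be declared non-const and assigned in each branch, or the logic would have to be moved into a separate named function.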

When used within classes, lambda functions can access the class data members by capturing the this pointer. The this pointer can be captured either implicitly or explicitly. When captured implicitly, the object it points to is always captured by “reference” (technically, by copying the this pointer); note that implicit capture of this through [=] has been deprecated since C++20. When captured explicitly, it can be captured by reference with [this] or by value with [*this] (since C++17), in which case a copy of the object will be created (useful if the lambda will be called asynchronously at a later stage when the original object may no longer exist). The following example demonstrates these concepts

class MyClass
{
    void print() const
    {
        std::cout << "You just called me from a lambda!" << std::endl;
    }

public:

    void call_me()
    {
        // Captures the this pointer explicitly by "reference"
        [this] { print(); }();

        // Captures the this pointer explicitly by value (since C++17)
        [*this] { print(); }();

        // Captures the this pointer implicitly by "reference"
        [=] { print(); }();

        // Same as above
        [&] { print(); }();
    }
};

int main()
{
    MyClass hey;
    hey.call_me();

    return 0;
}

// Output:
// "You just called me from a lambda!"
// "You just called me from a lambda!"
// "You just called me from a lambda!"
// "You just called me from a lambda!"
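The difference between [this] and [*this] only becomes visible when the object changes (or disappears) after the lambda has been created. The sketch below uses a hypothetical Counter class and requires C++17 for [*this]: the closure created with [this] observes later changes to the object, while the one created with [*this] works on its own snapshot.

```cpp
#include <cassert>
#include <functional>

// Hypothetical class, introduced only for this sketch.
class Counter
{
    int value = 0;

public:
    void set(int v) { value = v; }

    // Stores a copy of the this pointer: sees later changes to the
    // object and must not outlive it.
    std::function<int()> by_pointer() { return [this] { return value; }; }

    // Stores a copy of the whole object (C++17): keeps the state the
    // object had when the lambda was created.
    std::function<int()> by_copy() { return [*this] { return value; }; }
};

int this_capture_demo()
{
    Counter c;
    c.set(1);

    auto live     = c.by_pointer();
    auto snapshot = c.by_copy();

    c.set(2);  // modify the object after both closures exist

    assert(live() == 2);      // follows the original object
    assert(snapshot() == 1);  // still sees the copied state

    return live() * 10 + snapshot();
}
```

The copy made by [*this] costs a copy construction of the object, so it is a trade-off between safety and overhead.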

One of the most common use cases for lambdas is as predicates passed to other functions. For example, many algorithms in the Standard Library accept a callable parameter that is invoked while the algorithm runs. While this could be achieved by other means as well (e.g. with function objects), doing it with lambdas is certainly cleaner and more streamlined. A few examples are shown in the code below

// Old C++ (using only language constructs and a pure OOP approach)

class print_f
{
   public: void operator()(const std::string& name) {
       std::cout << name << " is " << name.length() << " characters long.\n";
   }
};

class start_with_C_f
{
   public: bool operator()(const std::string& name) {
       return name.at(0) == 'C';
   }
};

int main(int argc, char** argv)
{
    std::vector<std::string> names;
    names.push_back("Albert");
    names.push_back("Christian");
    names.push_back("Jocelyn");

    std::for_each(names.begin(), names.end(), print_f());

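    // 'auto' type deduction is C++11, so in old C++ the iterator type is spelled out
    std::vector<std::string>::iterator name = std::find_if(names.begin(), names.end(), start_with_C_f());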

    std::cout << "Found " << (*name) << std::endl;

    return 0;
}

// Modern C++

int main(int argc, char** argv)
{
    std::vector<std::string> names {
        "Albert", "Christian", "Jocelyn"
    };

    std::for_each(names.begin(), names.end(), [](const std::string& name)
    {
        std::cout << name << " is " << name.length() << " characters long.\n";
    });

    auto name = std::find_if(names.begin(), names.end(), [](const std::string& name)
    {
        return name.at(0) == 'C';
    });

    std::cout << "Found " << (*name) << std::endl;

    return 0;
}
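The predicates above capture nothing, but a capture is what makes lambdas really convenient with algorithms. In the following sketch (count_longer_than is a hypothetical helper), the threshold is captured by value, something that would require a constructor and a data member with a hand-written function object.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts the names longer than 'min_length'.
std::size_t count_longer_than(const std::vector<std::string>& names,
                              std::size_t min_length)
{
    // 'min_length' becomes part of the predicate's environment.
    auto is_longer = [min_length](const std::string& name) {
        return name.length() > min_length;
    };

    return static_cast<std::size_t>(
        std::count_if(names.begin(), names.end(), is_longer));
}

void predicate_demo()
{
    const std::vector<std::string> names { "Albert", "Christian", "Jocelyn" };

    assert(count_longer_than(names, 6) == 2);  // "Christian" and "Jocelyn"
    assert(count_longer_than(names, 9) == 0);
}
```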

But the power of lambdas goes far beyond these simple examples and a more thorough presentation would require a series of articles. Their versatility is probably not immediately apparent from the few cases illustrated here. They provide an easy mechanism to capture and transmit scoped state bound to a block of code in an immediate, clean and intuitive manner without cluttering the code with verbose workarounds (or external libraries).

Regarding their use cases, it seems that this feature is quite controversial among C++ programmers. It’s either underestimated or overused. And honestly, it’s very easy to misuse lambdas and get the opposite of what they were designed for. If you find yourself writing lots of lambdas that are quite complex or that do the same things, then you don’t need lambdas in the first place but plain old free functions. However, for short local computations, writing in-line functions can make the code more readable. For example, with STL algorithms that require predicates, writing lambda functions makes their use more straightforward. They are also very useful in the development of asynchronous systems, as callbacks, to perform non-trivial initialization of variables and to create on-the-fly validation functions, among other things.
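As a closing illustration of the callback use case, here is a minimal sketch of an event source. The Button class, its on_click() and click() members and the callback_demo() function are all hypothetical and only meant to show the shape of the technique: handlers are stored as std::function objects and each one carries its own captured state.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event source, introduced only for this sketch.
class Button
{
    std::vector<std::function<void(const std::string&)>> handlers;

public:
    // Registers a callback to be invoked on every click.
    void on_click(std::function<void(const std::string&)> handler)
    {
        handlers.push_back(std::move(handler));
    }

    // Fires the event, invoking the handlers in registration order.
    void click(const std::string& who) const
    {
        for (const auto& handler : handlers) handler(who);
    }
};

std::vector<std::string> callback_demo()
{
    std::vector<std::string> log;
    Button button;

    // 'log' is captured by reference; this is safe here because 'log'
    // outlives every call made through 'button' in this function.
    button.on_click([&log](const std::string& who) {
        log.push_back("clicked by " + who);
    });

    button.click("Albert");
    button.click("Jocelyn");

    return log;
}
```

If the handlers could be invoked after callback_demo() returns, the by-reference capture would have to be replaced with a by-value capture or shared ownership, for the lifetime reasons discussed earlier.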