Users' Guide

Getting Started
Front Ends: Defining Terminals and Non-Terminals of Your EDSL
Intermediate Form: Understanding and Introspecting Expressions
Back Ends: Making Expression Templates Do Useful Work
Examples
Background and Resources
Glossary

Compilers, Compiler Construction Toolkits, and Proto

Most compilers have front ends and back ends. The front end parses the text of an input program into some intermediate form like an abstract syntax tree, and the back end takes the intermediate form and generates an executable from it.

A library built with Proto is essentially a compiler for an embedded domain-specific language (EDSL). It also has a front end, an intermediate form, and a back end. The front end consists of the symbols (a.k.a. terminals), members, operators and functions that make up the user-visible aspects of the EDSL. The back end is made of evaluation contexts and transforms that give meaning and behavior to the expression templates generated by the front end. In between is the intermediate form: the expression template itself, which is an abstract syntax tree in a very real sense.

To build a library with Proto, you will first decide what your interface will be; that is, you'll design a programming language for your domain and build the front end with tools provided by Proto. Then you'll design the back end by writing evaluation contexts and/or transforms that accept expression templates and do interesting things with them.

This users' guide is organized as follows. After a Getting Started guide, we'll cover the tools Proto provides for defining and manipulating the three major parts of a compiler:

Front Ends

How to define the aspects of your EDSL with which your users will interact directly.

Intermediate Form

What Proto expression templates look like, how to discover their structure and access their constituents.

Back Ends

How to define evaluation contexts and transforms that make expression templates do interesting things.

After that, you may be interested in seeing some Examples to get a better idea of how the pieces all fit together.

Getting Proto

You can get Proto by downloading Boost (Proto is in version 1.37 and later), or by accessing Boost's SVN repository on SourceForge.net. Just go to http://svn.boost.org/trac/boost/wiki/BoostSubversion and follow the instructions there for anonymous SVN access.

Building with Proto

Proto is a header-only template library, which means you don't need to alter your build scripts or link to any separate lib file to use it. All you need to do is #include <boost/proto/proto.hpp>. Or, you might decide to just include the core of Proto (#include <boost/proto/core.hpp>) and whichever contexts and transforms you happen to use.

Requirements

Proto depends on Boost. You must use either Boost version 1.34.1 or higher, or the version in SVN trunk.

Supported Compilers

Currently, Boost.Proto is known to work on the following compilers:

  • Visual C++ 8 and higher
  • GNU C++ 3.4 and higher
  • Intel on Linux 8.1 and higher
  • Intel on Windows 9.1 and higher
Note

Please send any questions, comments and bug reports to eric <at> boostpro <dot> com.

Proto is a large library and probably quite unlike any library you've used before. Proto uses some consistent naming conventions to make it easier to navigate, and they're described below.

Functions

All of Proto's functions are defined in the boost::proto namespace. For example, there is a function called value() defined in boost::proto that accepts a terminal expression and returns the terminal's value.

Metafunctions

Proto defines metafunctions that correspond to each of Proto's free functions. The metafunctions are used to compute the functions' return types. All of Proto's metafunctions live in the boost::proto::result_of namespace and have the same name as the functions to which they correspond. For instance, there is a class template boost::proto::result_of::value<> that you can use to compute the return type of the boost::proto::value() function.

Function Objects

Proto defines function object equivalents of all of its free functions. (A function object is an instance of a class type that defines an operator() member function.) All of Proto's function object types are defined in the boost::proto::functional namespace and have the same name as their corresponding free functions. For example, boost::proto::functional::value is a class that defines a function object that does the same thing as the boost::proto::value() free function.

Primitive Transforms

Proto also defines primitive transforms -- class types that can be used to compose larger transforms for manipulating expression trees. Many of Proto's free functions have corresponding primitive transforms. These live in the boost::proto namespace and their names have a leading underscore. For instance, the transform corresponding to the value() function is called boost::proto::_value.

The following table summarizes the discussion above:

Table 31.1. Proto Naming Conventions

Entity             Example
Free Function      boost::proto::value()
Metafunction       boost::proto::result_of::value<>
Function Object    boost::proto::functional::value
Transform          boost::proto::_value
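
To make the relationship between the first three rows concrete, here is a minimal hand-rolled sketch of the same convention. The mylib namespace and its terminal<> type are hypothetical stand-ins invented for illustration. This is not Proto's implementation, and the transform row is left out.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical library that mimics the naming convention. Not Proto itself.
namespace mylib
{
    // A stand-in for a terminal expression that stores a value.
    template<typename T>
    struct terminal { T val; };

    namespace result_of
    {
        // Metafunction: computes the return type of mylib::value().
        template<typename Expr>
        struct value;

        template<typename T>
        struct value< terminal<T> > { typedef T const & type; };
    }

    // Free function: returns the value stored in a terminal.
    template<typename T>
    typename result_of::value< terminal<T> >::type
    value(terminal<T> const & t)
    {
        return t.val;
    }

    namespace functional
    {
        // Function object: same behavior as the free function.
        struct value
        {
            template<typename T>
            typename result_of::value< terminal<T> >::type
            operator()(terminal<T> const & t) const
            {
                return mylib::value(t);
            }
        };
    }
}
```

All three entities share the name value and differ only in the namespace they live in, which is the pattern the table describes.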


Below is a very simple program that uses Proto to build an expression template and then execute it.

#include <iostream>
#include <boost/proto/proto.hpp>
#include <boost/typeof/std/ostream.hpp>
using namespace boost;

proto::terminal< std::ostream & >::type cout_ = { std::cout };

template< typename Expr >
void evaluate( Expr const & expr )
{
    proto::default_context ctx;
    proto::eval(expr, ctx);
}

int main()
{
    evaluate( cout_ << "hello" << ',' << " world" );
    return 0;
}

This program outputs the following:

hello, world

This program builds an object representing the output operation and passes it to an evaluate() function, which then executes it.

The basic idea of expression templates is to overload all the operators so that, rather than evaluating the expression immediately, they build a tree-like representation of the expression so that it can be evaluated later. For each operator in an expression, at least one operand must be Protofied in order for Proto's operator overloads to be found. In the expression ...

cout_ << "hello" << ',' << " world"

... the Protofied sub-expression is cout_, which is the Proto-ification of std::cout. The presence of cout_ "infects" the expression, and brings Proto's tree-building operator overloads into consideration. Any literals in the expression are then Protofied by wrapping them in a Proto terminal before they are combined into larger Proto expressions.

Once Proto's operator overloads have built the expression tree, the expression can be lazily evaluated later by walking the tree. That is what proto::eval() does. It is a general tree-walking expression evaluator, whose behavior is customizable via a context parameter. The use of proto::default_context assigns the standard meanings to the operators in the expression. (By using a different context, you could give the operators in your expressions different semantics. By default, Proto makes no assumptions about what operators actually mean.)
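
The build-now, evaluate-later idea can be sketched without Proto at all. The following is a hand-rolled illustration, assuming a C++11 compiler. The sketch namespace, its node types, as_node() and eval() are invented names and are far simpler than what Proto does internally.

```cpp
#include <cassert>

// Hand-rolled sketch of the idea. None of these names are Proto's.
namespace sketch
{
    // Leaf node: stores a value.
    template<typename T>
    struct terminal { T value; };

    // Interior nodes: record an operation and its operands, compute nothing.
    template<typename L, typename R>
    struct plus_node { L left; R right; };

    template<typename L, typename R>
    struct multiplies_node { L left; R right; };

    // Wrap a plain value in a terminal ...
    template<typename T>
    terminal<T> as_node(T v) { return terminal<T>{v}; }

    // ... but pass existing nodes through unchanged.
    template<typename T>
    terminal<T> as_node(terminal<T> t) { return t; }

    template<typename L, typename R>
    plus_node<L, R> as_node(plus_node<L, R> n) { return n; }

    template<typename L, typename R>
    multiplies_node<L, R> as_node(multiplies_node<L, R> n) { return n; }

    // Tree-building overloads. They are found by argument-dependent lookup
    // when at least one operand is a type from this namespace.
    template<typename L, typename R>
    auto operator+(L l, R r)
      -> plus_node<decltype(as_node(l)), decltype(as_node(r))>
    {
        return {as_node(l), as_node(r)};
    }

    template<typename L, typename R>
    auto operator*(L l, R r)
      -> multiplies_node<decltype(as_node(l)), decltype(as_node(r))>
    {
        return {as_node(l), as_node(r)};
    }

    // A separate tree walk does the evaluation later.
    template<typename T>
    double eval(terminal<T> t) { return t.value; }

    template<typename L, typename R>
    double eval(plus_node<L, R> n) { return eval(n.left) + eval(n.right); }

    template<typename L, typename R>
    double eval(multiplies_node<L, R> n) { return eval(n.left) * eval(n.right); }
}

// Builds the tree for x + 3.0 * x, then evaluates it.
inline double build_then_evaluate()
{
    sketch::terminal<double> x = {2.0};
    auto tree = x + 3.0 * x;     // no arithmetic happens here
    return sketch::eval(tree);   // 2 + 3 * 2 is computed here
}
```

Note how the literal 3.0 is wrapped in a terminal by as_node() before it becomes part of the tree, which loosely mirrors what the text calls Protofying a literal.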

Proto Design Philosophy

Before we continue, let's use the above example to illustrate an important design principle of Proto's. The expression template created in the hello world example is totally general and abstract. It is not tied in any way to any particular domain or application, nor does it have any particular meaning or behavior on its own, until it is evaluated in a context. Expression templates are really just heterogeneous trees, which might mean something in one domain, and something else entirely in a different one.

As we'll see later, there is a way to create Proto expression trees that are not purely abstract, and that have meaning and behaviors independent of any context. There is also a way to control which operators are overloaded for your particular domain. But that is not the default behavior. We'll see later why the default is often a good thing.
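
To see what "no meaning until evaluated in a context" looks like in practice, consider this hand-rolled sketch (again, not Proto code; all names are made up for illustration, and C++11 is assumed). The same tree is handed to two different contexts, and each gives the plus operator a different meaning.

```cpp
#include <cassert>
#include <string>

// Hand-rolled sketch; these node and context types are illustrative only.
namespace sketch
{
    struct terminal { int value; };

    template<typename L, typename R>
    struct plus_node { L left; R right; };

    // Building the tree attaches no meaning to "+".
    template<typename L, typename R>
    plus_node<L, R> operator+(L l, R r) { return {l, r}; }
}

// Context 1: "+" means integer addition.
struct arithmetic_context
{
    int operator()(sketch::terminal t) const { return t.value; }

    template<typename L, typename R>
    int operator()(sketch::plus_node<L, R> n) const
    {
        return (*this)(n.left) + (*this)(n.right);
    }
};

// Context 2: "+" means "render as text".
struct printing_context
{
    std::string operator()(sketch::terminal t) const
    {
        return std::to_string(t.value);
    }

    template<typename L, typename R>
    std::string operator()(sketch::plus_node<L, R> n) const
    {
        return "(" + (*this)(n.left) + " + " + (*this)(n.right) + ")";
    }
};

// One evaluator; the behavior is supplied by the context.
template<typename Expr, typename Context>
auto evaluate(Expr const & expr, Context const & ctx) -> decltype(ctx(expr))
{
    return ctx(expr);
}
```

Given terminals a, b and c holding 1, 2 and 3, evaluating a + b + c with arithmetic_context yields 6, while printing_context yields the string "((1 + 2) + 3)".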

"Hello, world" is nice, but it doesn't get you very far. Let's use Proto to build an EDSL (embedded domain-specific language) for a lazily-evaluated calculator. We'll see how to define the terminals in your mini-language, how to compose them into larger expressions, and how to define an evaluation context so that your expressions can do useful work. When we're done, we'll have a mini-language that will allow us to declare a lazily-evaluated arithmetic expression, such as (_2 - _1) / _2 * 100, where _1 and _2 are placeholders for values to be passed in when the expression is evaluated.

Defining Terminals

The first order of business is to define the placeholders _1 and _2. For that, we'll use the proto::terminal<> metafunction.

// Define a placeholder type
template<int I>
struct placeholder
{};

// Define the Protofied placeholder terminals
proto::terminal<placeholder<0> >::type const _1 = {{}};
proto::terminal<placeholder<1> >::type const _2 = {{}};

The initialization may look a little odd at first, but there is a good reason for doing things this way. The objects _1 and _2 above do not require run-time construction -- they are statically initialized, which means they are essentially initialized at compile time. See the Static Initialization section in the Rationale appendix for more information.
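
If you have a C++11 compiler (an assumption; the compilers listed earlier predate it), you can ask the compiler to confirm that this style of initialization needs no run-time construction. The sketch below uses a hypothetical stand-in for a terminal type rather than proto::terminal<> itself, so it shows the principle only.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch
{
    template<int I>
    struct placeholder {};

    // Stand-in for a terminal type: an aggregate with no constructors.
    template<typename T>
    struct terminal { T value; };

    // constexpr requires constant initialization, so these definitions
    // would fail to compile if a run-time constructor were needed.
    constexpr terminal< placeholder<0> > arg1 = {{}};
    constexpr terminal< placeholder<1> > arg2 = {{}};

    static_assert(std::is_trivial< terminal< placeholder<0> > >::value,
                  "aggregate terminals are trivial types");
}
```

The outer braces initialize the terminal aggregate and the inner braces initialize the placeholder it holds, which is why the initializer is written {{}}.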

Constructing Expression Trees

Now that we have terminals, we can use Proto's operator overloads to combine these terminals into larger expressions. So, for instance, we can immediately say things like:

// This builds an expression template
(_2 - _1) / _2 * 100;

This creates an expression tree with a node for each operator. The type of the resulting object is large and complex, but we are not terribly interested in it right now.

So far, the object is just a tree representing the expression. It has no behavior. In particular, it is not yet a calculator. Below we'll see how to make it a calculator by defining an evaluation context.

Evaluating Expression Trees

No doubt you want your expression templates to actually do something. One approach is to define an evaluation context. The context is like a function object that associates behaviors with the node types in your expression tree. The following example should make it clear. It is explained below.

struct calculator_context
  : proto::callable_context< calculator_context const >
{
    // Values to replace the placeholders
    std::vector<double> args;

    // Define the result type of the calculator.
    // (This makes the calculator_context "callable".)
    typedef double result_type;

    // Handle the placeholders:
    template<int I>
    double operator()(proto::tag::terminal, placeholder<I>) const
    {
        return this->args[I];
    }
};

In calculator_context, we specify how Proto should evaluate the placeholder terminals by defining the appropriate overloads of the function call operator. For any other nodes in the expression tree (e.g., arithmetic operations or non-placeholder terminals), Proto will evaluate the expression in the "default" way. For example, a binary plus node is evaluated by first evaluating the left and right operands and adding the results. Proto's default evaluator uses the Boost.Typeof library to compute return types.

Now that we have an evaluation context for our calculator, we can use it to evaluate our arithmetic expressions, as below:

calculator_context ctx;
ctx.args.push_back(45); // the value of _1 is 45
ctx.args.push_back(50); // the value of _2 is 50

// Create an arithmetic expression and immediately evaluate it
double d = proto::eval( (_2 - _1) / _2 * 100, ctx );

// This prints "10"
std::cout << d << std::endl;

Later, we'll see how to define more interesting evaluation contexts and expression transforms that give you total control over how your expressions are evaluated.

Customizing Expression Trees

Our calculator EDSL is already pretty useful, and for many EDSL scenarios, no more would be needed. But let's keep going. Imagine how much nicer it would be if all calculator expressions overloaded operator() so that they could be used as function objects. We can do that by creating a calculator domain and telling Proto that all expressions in the calculator domain have extra members. Here is how to define a calculator domain:

// Forward-declare an expression wrapper
template<typename Expr>
struct calculator;

// Define a calculator domain. Expressions within
// the calculator domain will be wrapped in the
// calculator<> expression wrapper.
struct calculator_domain
  : proto::domain< proto::generator<calculator> >
{};

The calculator<> type will be an expression wrapper. It will behave just like the expression that it wraps, but it will have extra member functions that we will define. The calculator_domain is what informs Proto about our wrapper. It is used below in the definition of calculator<>. Read on for a description.

// Define a calculator expression wrapper. It behaves just like
// the expression it wraps, but with an extra operator() member
// function that evaluates the expression.    
template<typename Expr>
struct calculator
  : proto::extends<Expr, calculator<Expr>, calculator_domain>
{
    typedef
        proto::extends<Expr, calculator<Expr>, calculator_domain>
    base_type;

    calculator(Expr const &expr = Expr())
      : base_type(expr)
    {}

    typedef double result_type;

    // Overload operator() to invoke proto::eval() with
    // our calculator_context.
    double operator()(double a1 = 0, double a2 = 0) const
    {
        calculator_context ctx;
        ctx.args.push_back(a1);
        ctx.args.push_back(a2);

        return proto::eval(*this, ctx);
    }
};

The calculator<> struct is an expression extension. It uses proto::extends<> to effectively add additional members to an expression type. When composing larger expressions from smaller ones, Proto notes what domain the smaller expressions are in. The larger expression is in the same domain and is automatically wrapped in the domain's extension wrapper.

All that remains to be done is to put our placeholders in the calculator domain. We do that by wrapping them in our calculator<> wrapper, as below:

// Define the Protofied placeholder terminals, in the
// calculator domain.
calculator<proto::terminal<placeholder<0> >::type> const _1;
calculator<proto::terminal<placeholder<1> >::type> const _2;

Any larger expressions that contain these placeholders will automatically be wrapped in the calculator<> wrapper and have our operator() overload. That means we can use them as function objects as follows.

double result = ((_2 - _1) / _2 * 100)(45.0, 50.0);
assert(result == (50.0 - 45.0) / 50.0 * 100);

Since calculator expressions are now valid function objects, we can use them with standard algorithms, as shown below:

double a1[4] = { 56, 84, 37, 69 };
double a2[4] = { 65, 120, 60, 70 };
double a3[4] = { 0 };

// Use std::transform() and a calculator expression
// to calculate percentages given two input sequences:
std::transform(a1, a1+4, a2, a3, (_2 - _1) / _2 * 100);
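
To check the arithmetic by hand, here is the same computation with an ordinary function object standing in for the calculator expression. The percent_change type and compute_percentages() function are hypothetical names used only for this check. They are not part of the calculator EDSL.

```cpp
#include <algorithm>
#include <cassert>

// Plain function object standing in for (_2 - _1) / _2 * 100.
struct percent_change
{
    double operator()(double a1, double a2) const
    {
        return (a2 - a1) / a2 * 100;
    }
};

// Fills out[0..3] with the percentages for the two input arrays above.
inline void compute_percentages(double (&out)[4])
{
    double a1[4] = { 56, 84, 37, 69 };
    double a2[4] = { 65, 120, 60, 70 };

    // Binary std::transform passes one element of a1 and one of a2 per call,
    // so a1's element plays the role of _1 and a2's element the role of _2.
    std::transform(a1, a1 + 4, a2, out, percent_change());
}
```

For the inputs above, the four results are approximately 13.85, 30, 38.33 and 1.43.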

Now, let's use the calculator example to explore some other useful features of Proto.

Detecting Invalid Expressions

You may have noticed that you didn't have to define an overloaded operator-() or operator/() -- Proto defined them for you. In fact, Proto overloads all the operators for you, even though they may not mean anything in your domain-specific language. That means it may be possible to create expressions that are invalid in your domain. You can detect invalid expressions with Proto by defining the grammar of your domain-specific language.

For simplicity, assume that our calculator EDSL should only allow addition, subtraction, multiplication and division. Any expression involving any other operator is invalid. Using Proto, we can state this requirement by defining the grammar of the calculator EDSL. It looks as follows:

// Define the grammar of calculator expressions
struct calculator_grammar
  : proto::or_<
        proto::plus< calculator_grammar, calculator_grammar >
      , proto::minus< calculator_grammar, calculator_grammar >
      , proto::multiplies< calculator_grammar, calculator_grammar >
      , proto::divides< calculator_grammar, calculator_grammar >
      , proto::terminal< proto::_ >
    >
{};

You can read the above grammar as follows: an expression tree conforms to the calculator grammar if it is a binary plus, minus, multiplies or divides node, where both child nodes also conform to the calculator grammar; or if it is a terminal. In a Proto grammar, proto::_ is a wildcard that matches any type, so proto::terminal< proto::_ > matches any terminal, whether it is a placeholder or a literal.

Note

This grammar is actually a little looser than we would like. Only placeholders and literals that are convertible to doubles are valid terminals. Later on we'll see how to express things like that in Proto grammars.

Once you have defined the grammar of your EDSL, you can use the proto::matches<> metafunction to check whether a given expression type conforms to the grammar. For instance, we might add the following to our calculator::operator() overload:

template<typename Expr>
struct calculator
  : proto::extends< /* ... as before ... */ >
{
    /* ... */
    double operator()(double a1 = 0, double a2 = 0) const
    {
        // Check here that the expression we are about to
        // evaluate actually conforms to the calculator grammar.
        BOOST_MPL_ASSERT((proto::matches<Expr, calculator_grammar>));
        /* ... */
    }
};

The addition of the BOOST_MPL_ASSERT() line enforces at compile time that we only evaluate expressions that conform to the calculator EDSL's grammar. With Proto grammars, proto::matches<>, and BOOST_MPL_ASSERT(), it is very easy to give the users of your EDSL short and readable compile-time errors when they accidentally misuse your EDSL.

Note

BOOST_MPL_ASSERT() is part of the Boost Metaprogramming Library. To use it, just #include <boost/mpl/assert.hpp>.
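
If you are curious how a compile-time grammar check of this kind can work in principle, the following hand-rolled sketch uses partial specialization to do a similar job for a cut-down grammar with only plus and divides. It assumes C++11 (for static_assert and <type_traits>). The matches_calculator trait and the node types are invented for illustration and are much less general than proto::matches<>.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch
{
    template<typename T> struct terminal { T value; };
    template<typename L, typename R> struct plus_node { L left; R right; };
    template<typename L, typename R> struct divides_node { L left; R right; };
    // A node type the cut-down grammar does not list.
    template<typename L, typename R> struct shift_left_node { L left; R right; };

    // By default, an expression type does not match.
    template<typename Expr>
    struct matches_calculator : std::false_type {};

    // Any terminal matches.
    template<typename T>
    struct matches_calculator< terminal<T> > : std::true_type {};

    // A binary node matches if both of its children match.
    template<typename L, typename R>
    struct matches_calculator< plus_node<L, R> >
      : std::integral_constant<bool,
            matches_calculator<L>::value && matches_calculator<R>::value>
    {};

    template<typename L, typename R>
    struct matches_calculator< divides_node<L, R> >
      : std::integral_constant<bool,
            matches_calculator<L>::value && matches_calculator<R>::value>
    {};

    typedef terminal<double> num;
    typedef divides_node< plus_node<num, num>, num > valid_expr;
    typedef plus_node< shift_left_node<num, num>, num > invalid_expr;

    static_assert(matches_calculator<valid_expr>::value,
                  "expression conforms to the grammar");
    static_assert(!matches_calculator<invalid_expr>::value,
                  "shift-left is not part of the grammar");
}
```

The recursion over child types plays the role that the self-referential calculator_grammar plays in the Proto version.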

Controlling Operator Overloads

Grammars and proto::matches<> make it possible to detect when a user has created an invalid expression and issue a compile-time error. But what if you want to prevent users from creating invalid expressions in the first place? By using grammars and domains together, you can disable any of Proto's operator overloads that would create an invalid expression. It is as simple as specifying the EDSL's grammar when you define the domain, as shown below:

// Define a calculator domain. Expressions within
// the calculator domain will be wrapped in the
// calculator<> expression wrapper.
// NEW: Any operator overload that would create an
//      expression that does not conform to the
//      calculator grammar is automatically disabled.
struct calculator_domain
  : proto::domain< proto::generator<calculator>, calculator_grammar >
{};

The only change is the addition of calculator_grammar as the second template argument to proto::domain<> in the definition of calculator_domain. With this simple addition, we disable any of Proto's operator overloads that would create an invalid calculator expression.

... And Much More

Hopefully, this gives you an idea of what sorts of things Proto can do for you. But this only scratches the surface. The rest of this users' guide will describe all these features and others in more detail.

Happy metaprogramming!

Here is the fun part: designing your own mini-programming language. In this section we'll talk about the nuts and bolts of designing an EDSL interface using Proto. We'll cover the definition of terminals and lazy functions that the users of your EDSL will get to program with. We'll also talk about Proto's expression template-building operator overloads, and about ways to add additional members to expressions within your domain.

As we saw with the Calculator example from the Introduction, the simplest way to get an EDSL up and running is simply to define some terminals, as follows.

// Define a literal integer Proto expression.
proto::terminal<int>::type i = {0};

// This creates an expression template.
i + 1;

With some terminals and Proto's operator overloads, you can immediately start creating expression templates.

Defining terminals -- with aggregate initialization -- can be a little awkward at times. Proto provides an easier-to-use wrapper for literals that can be used to construct Protofied terminal expressions. It's called proto::literal<>.

// Define a literal integer Proto expression.
proto::literal<int> i = 0;

// Proto literals are really just Proto terminal expressions.
// For example, this builds a Proto expression template:
i + 1;

There is also a proto::lit() function for constructing a proto::literal<> in-place. The above expression can simply be written as:

// proto::lit(0) creates an integer terminal expression
proto::lit(0) + 1;

Once we have some Proto terminals, expressions involving those terminals build expression trees for us. Proto defines overloads for each of C++'s overloadable operators in the boost::proto namespace. As long as one operand is a Proto expression, the result of the operation is a tree node representing that operation.

Note

Proto's operator overloads live in the boost::proto namespace and are found via ADL (argument-dependent lookup). That is why expressions must be "tainted" with Proto-ness for Proto to be able to build trees out of expressions.
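
Argument-dependent lookup is a core language feature, so it can be demonstrated with a few lines of plain C++ (C++11 assumed for the braced returns). The lib namespace and its types below are hypothetical; they only show why one operand must come from the library's namespace for its overloads to be considered.

```cpp
#include <cassert>

namespace lib
{
    struct terminal { int value; };
    struct plus_node { int left; int right; };

    // These overloads live in lib, not in the global namespace.
    inline plus_node operator+(terminal l, int r) { return {l.value, r}; }
    inline plus_node operator+(int l, terminal r) { return {l, r.value}; }
}

// No using-declaration or using-directive for lib appears here.
inline lib::plus_node tainted_sum()
{
    lib::terminal t = {1};
    return t + 42;      // lib::operator+ is found through the type of t
}

inline int plain_sum()
{
    return 1 + 42;      // no lib operand, so the built-in + is used
}
```

In tainted_sum(), the expression builds a node; in plain_sum(), the same operator with two ints simply adds them.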

As a result of Proto's operator overloads, we can say:

-_1;        // OK, build a unary-negate tree node
_1 + 42;    // OK, build a binary-plus tree node

For the most part, this Just Works and you don't need to think about it, but a few operators are special and it can be helpful to know how Proto handles them.

Assignment, Subscript, and Function Call Operators

Proto also overloads operator=, operator[], and operator(), but these operators are member functions of the expression template rather than free functions in Proto's namespace. The following are valid Proto expressions:

_1 = 5;     // OK, builds a binary assign tree node
_1[6];      // OK, builds a binary subscript tree node
_1();       // OK, builds a unary function tree node
_1(7);      // OK, builds a binary function tree node
_1(8,9);    // OK, builds a ternary function tree node
// ... etc.

For the first two lines, assignment and subscript, it should be fairly unsurprising that the resulting expression node should be binary. After all, there are two operands in each expression. It may be surprising at first that what appears to be a function call with no arguments, _1(), actually creates an expression node with one child. The child is _1 itself. Likewise, the expression _1(7) has two children: _1 and 7.
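
The "callee counts as a child" rule is easy to see in a small hand-rolled sketch. The following assumes C++11 variadic templates; function_node and placeholder_terminal are invented names, and Proto's real node types look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

namespace sketch
{
    // A function-call node. Child 0 is the callee; the rest are arguments.
    template<typename... Children>
    struct function_node
    {
        std::tuple<Children...> children;

        static constexpr std::size_t arity() { return sizeof...(Children); }
    };

    struct placeholder_terminal
    {
        // operator() must be a member. It builds a node rather than
        // calling anything, and stores *this as the first child.
        template<typename... Args>
        function_node<placeholder_terminal, Args...>
        operator()(Args... args) const
        {
            return { std::tuple<placeholder_terminal, Args...>(*this, args...) };
        }
    };
}
```

With a placeholder_terminal object named arg1, the expressions arg1(), arg1(7) and arg1(8, 9) produce nodes with one, two and three children respectively.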

Because these operators can only be defined as member functions, the following expressions are invalid:

int i;
i = _1;         // ERROR: cannot assign _1 to an int

int *p;
p[_1];          // ERROR: cannot use _1 as an index

std::sin(_1);   // ERROR: cannot call std::sin() with _1

Also, C++ has special rules for overloads of operator-> that make it useless for building expression templates, so Proto does not overload it.

The Address-Of Operator

Proto overloads the address-of operator for expression types, so that the following code creates a new unary address-of tree node:

&_1;    // OK, creates a unary address-of tree node

It does not return the address of the _1 object. However, there is special code in Proto such that a unary address-of node is implicitly convertible to a pointer to its child. In other words, the following code works and does what you might expect, but not in the obvious way:

typedef
    proto::terminal< placeholder<0> >::type
_1_type;

_1_type const _1 = {{}};
_1_type const * p = &_1; // OK, &_1 implicitly converted

If we limited ourselves to nothing but terminals and operator overloads, our embedded domain-specific languages wouldn't be very expressive. Imagine that we wanted to extend our calculator EDSL with a full suite of math functions like sin() and pow() that we could invoke lazily as follows.

// A calculator expression that takes one argument
// and takes the sine of it.
sin(_1);

We would like the above to create an expression template representing a function invocation. When that expression is evaluated, it should cause the function to be invoked. (At least, that's the meaning of function invocation we'd like the calculator EDSL to have.) You can define sin quite simply as follows.

// "sin" is a Proto terminal containing a function pointer
proto::terminal< double(*)(double) >::type const sin = {&std::sin};

In the above, we define sin as a Proto terminal containing a pointer to the std::sin() function. Now we can use sin as a lazy function. The default_context that we saw in the Introduction knows how to evaluate lazy functions. Consider the following:

double pi = 3.1415926535;
proto::default_context ctx;
// Create a lazy "sin" invocation and immediately evaluate it
std::cout << proto::eval( sin(pi/2), ctx ) << std::endl;

The above code prints out:

1

I'm no expert at trigonometry, but that looks right to me.

We can write sin(pi/2) because the sin object, which is a Proto terminal, has an overloaded operator()() that builds a node representing a function call. The type of sin(pi/2) is something like this:

// The type of the expression sin(pi/2):
proto::function<
    proto::terminal< double(*)(double) >::type const &
  , proto::result_of::as_child< double const >::type
>::type

This type further expands to an unsightly node type with a tag type of proto::tag::function and two children: the first representing the function to be invoked, and the second representing the argument to the function. (Node tag types describe the operation that created the node. The difference between a + b and a - b is that the former has tag type proto::tag::plus and the latter has tag type proto::tag::minus. Tag types are pure compile-time information.)

Note

In the type computation above, proto::result_of::as_child<> is a metafunction that ensures its argument is a Proto expression type. If it isn't one already, it becomes a Proto terminal. We'll learn more about this metafunction, along with proto::as_child(), its runtime counterpart, later. For now, you can forget about it.

It is important to note that there is nothing special about terminals that contain function pointers. Any Proto expression has an overloaded function call operator. Consider:

// This compiles!
proto::lit(1)(2)(3,4)(5,6,7,8);

That may look strange at first. It creates an integer terminal with proto::lit(), and then invokes it like a function again and again. What does it mean? Who knows?! You get to decide when you define an evaluation context or a transform. But more on that later.

Making Lazy Functions, Continued

Now, what if we wanted to add a pow() function to our calculator EDSL that users could invoke as follows?

// A calculator expression that takes one argument
// and raises it to the 2nd power
pow< 2 >(_1);

The simple technique described above of making pow a terminal containing a function pointer doesn't work here. If pow is an object, then the expression pow< 2 >(_1) is not valid C++. (Well, technically it is; it means, pow less than 2, greater than (_1), which is nothing at all like what we want.) pow should be a real function template. But it must be an unusual function: one that returns an expression template.

With sin, we relied on Proto to provide an overloaded operator()() to build an expression node with tag type proto::tag::function for us. Now we'll need to do so ourselves. As before, the node will have two children: the function to invoke and the function's argument.

With sin, the function to invoke was a raw function pointer wrapped in a Proto terminal. In the case of pow, we want it to be a terminal containing a TR1-style function object. This will allow us to parameterize the function on the exponent. Below is the implementation of a simple TR1-style wrapper for the std::pow function:

// Define a pow_fun function object
template< int Exp >
struct pow_fun
{
    typedef double result_type;

    double operator()(double d) const
    {
        return std::pow(d, Exp);
    }
};

Following the sin example, we want pow< 1 >( pi/2 ) to have a type like this:

// The type of the expression pow<1>(pi/2):
proto::function<
    proto::terminal< pow_fun<1> >::type
  , proto::result_of::as_child< double const >::type
>::type

We could write a pow() function using code like this, but it's verbose and error prone; it's too easy to introduce subtle bugs by forgetting to call proto::as_child() where necessary, resulting in code that seems to work but sometimes doesn't. Proto provides a better way to construct expression nodes: proto::make_expr().

Lazy Functions Made Simple With make_expr()

Proto provides a helper for building expression templates called proto::make_expr(). We can concisely define the pow() function with it as below.

// Define a lazy pow() function for the calculator EDSL.
// Can be used as: pow< 2 >(_1)
template< int Exp, typename Arg >
typename proto::result_of::make_expr<
    proto::tag::function  // Tag type
  , pow_fun< Exp >        // First child (by value)
  , Arg const &           // Second child (by reference)
>::type const
pow(Arg const &arg)
{
    return proto::make_expr<proto::tag::function>(
        pow_fun<Exp>()    // First child (by value)
      , boost::ref(arg)   // Second child (by reference)
    );
}

There are some things to notice about the above code. We use proto::result_of::make_expr<> to calculate the return type. The first template parameter is the tag type for the expression node we're building -- in this case, proto::tag::function.

Subsequent template parameters to proto::result_of::make_expr<> represent child nodes. If a child type is not already a Proto expression, it is automatically made into a terminal with proto::as_child(). A type such as pow_fun<Exp> results in a terminal that is held by value, whereas a type like Arg const & (note the reference) indicates that the result should be held by reference.

In the function body is the runtime invocation of proto::make_expr(). It closely mirrors the return type calculation. proto::make_expr() requires you to specify the node's tag type as a template parameter. The arguments to the function become the node's children. When a child should be stored by value, nothing special needs to be done. When a child should be stored by reference, you must use the boost::ref() function to wrap the argument.

And that's it! proto::make_expr() is the lazy person's way to make a lazy function.
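
The difference between holding a child by value and by reference is worth seeing in isolation. The sketch below is hand-rolled and does not use proto::make_expr(). The function_node type and eval() are hypothetical, std::cref() stands in for boost::ref(), and C++11 is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

namespace sketch
{
    // Same function object as in the text.
    template<int Exp>
    struct pow_fun
    {
        typedef double result_type;

        double operator()(double d) const
        {
            return std::pow(d, Exp);
        }
    };

    // A function-call node with two children.
    template<typename Fun, typename Arg>
    struct function_node
    {
        Fun fun;                                 // held by value
        std::reference_wrapper<Arg const> arg;   // held by reference
    };

    // Lazy pow: builds a node, computes nothing.
    template<int Exp, typename Arg>
    function_node<pow_fun<Exp>, Arg> pow(Arg const & arg)
    {
        return { pow_fun<Exp>(), std::cref(arg) };
    }

    // Evaluation invokes the stored function on the referenced argument.
    template<typename Fun, typename Arg>
    double eval(function_node<Fun, Arg> const & node)
    {
        return node.fun(node.arg.get());
    }
}

inline double square_after_update()
{
    double x = 3.0;
    auto expr = sketch::pow<2>(x);   // nothing is computed yet
    x = 4.0;                         // seen by expr, which refers to x
    return sketch::eval(expr);       // squares the current value of x
}
```

Because the argument is held by reference, square_after_update() squares 4.0 rather than 3.0. The flip side is that a node holding references must not outlive the objects it refers to.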

In this section, we'll learn all about domains. In particular, we'll learn:

  • How to associate Proto expressions with a domain,
  • How to add members to expressions within a domain,
  • How to use a generator to post-process all new expressions created in your domain,
  • How to control which operators are overloaded in a domain,
  • How to specify capturing policies for child expressions and non-Proto objects, and
  • How to make expressions from separate domains interoperate.

In the Hello Calculator section, we looked into making calculator expressions directly usable as lambda expressions in calls to STL algorithms, as below:

double data[] = {1., 2., 3., 4.};

// Use the calculator EDSL to square each element ... HOW?
std::transform( data, data + 4, data, _1 * _1 );

The difficulty, if you recall, was that by default Proto expressions don't have interesting behaviors of their own. They're just trees. In particular, the expression _1 * _1 won't have an operator() that takes a double and returns a double like std::transform() expects -- unless we give it one. To make this work, we needed to define an expression wrapper type that defined the operator() member function, and we needed to associate the wrapper with the calculator domain.

In Proto, the term domain refers to a type that associates expressions in that domain to an expression generator. The generator is just a function object that accepts an expression and does something to it, like wrapping it in an expression wrapper.

You can also use a domain to associate expressions with a grammar. When you specify a domain's grammar, Proto ensures that all the expressions it generates in that domain conform to the domain's grammar. It does that by disabling any operator overloads that would create invalid expressions.

The first step to giving your calculator expressions extra behaviors is to define a calculator domain. All expressions within the calculator domain will be imbued with calculator-ness, as we'll see.

// A type to be used as a domain tag (to be defined below)
struct calculator_domain;

We use this domain type when extending the proto::expr<> type, which we do with the proto::extends<> class template. Here is our expression wrapper, which imbues an expression with calculator-ness. It is described below.

// The calculator<> expression wrapper makes expressions
// function objects.
template< typename Expr >
struct calculator
  : proto::extends< Expr, calculator< Expr >, calculator_domain >
{
    typedef
        proto::extends< Expr, calculator< Expr >, calculator_domain >
    base_type;

    calculator( Expr const &expr = Expr() )
      : base_type( expr )
    {}

    // This is usually needed because by default, the compiler-
    // generated assignment operator hides extends<>::operator=
    BOOST_PROTO_EXTENDS_USING_ASSIGN(calculator)

    typedef double result_type;

    // Hide base_type::operator() by defining our own which
    // evaluates the calculator expression with a calculator context.
    result_type operator()( double d1 = 0.0, double d2 = 0.0 ) const
    {
        // As defined in the Hello Calculator section.
        calculator_context ctx;

        // ctx.args is a vector<double> that holds the values
        // with which we replace the placeholders (e.g., _1 and _2)
        // in the expression.
        ctx.args.push_back( d1 ); // _1 gets the value of d1
        ctx.args.push_back( d2 ); // _2 gets the value of d2

        return proto::eval(*this, ctx ); // evaluate the expression
    }
};

We want calculator expressions to be function objects, so we have to define an operator() that takes and returns doubles. The calculator<> wrapper above does that with the help of the proto::extends<> template. The first template parameter to proto::extends<> is the expression type we are extending. The second is the type of the wrapper itself, calculator< Expr >. The third parameter is the domain that this wrapper is associated with. A wrapper type like calculator<> that inherits from proto::extends<> behaves just like the expression type it has extended, with any additional behaviors you choose to give it.

Note

Why not just inherit from proto::expr<>?

You might be thinking that this expression extension business is unnecessarily complicated. After all, isn't this why C++ supports inheritance? Why can't calculator<Expr> just inherit from Expr directly? The reason is that Expr, which presumably is an instantiation of proto::expr<>, has expression template-building operator overloads that will be incorrect for derived types. They will store *this by reference to proto::expr<>, effectively slicing off any derived parts. proto::extends<> gives your derived types operator overloads that don't slice off your additional members.
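The slicing problem can be illustrated without Proto at all. The following is a minimal sketch built from toy stand-ins (plus_node, plain_expr, and extends_like are hypothetical names, not Proto types): an operator defined in a plain base class can only record the base type, whereas a base that is told the derived type as a template parameter can record the full derived type.

```cpp
#include <type_traits>
#include <utility>

// Toy expression node that holds its children by reference.
template<typename Left, typename Right>
struct plus_node { Left const &left; Right const &right; };

// A plain base class: its operator+ only knows about plain_expr.
struct plain_expr
{
    template<typename Right>
    plus_node<plain_expr, Right> operator+(Right const &r) const
    { return {*this, r}; }
};

// Deriving directly: the extra member is invisible to the new node.
struct derived_from_plain : plain_expr { int extra; };

// A base that receives the derived type as a template parameter,
// which is the same idea proto::extends<> relies on.
template<typename Derived>
struct extends_like
{
    template<typename Right>
    plus_node<Derived, Right> operator+(Right const &r) const
    { return {static_cast<Derived const &>(*this), r}; }
};

struct derived_from_crtp : extends_like<derived_from_crtp> { int extra; };

// The node types each approach produces for "expr + 1":
typedef decltype(std::declval<derived_from_plain const &>() + 1) sliced_result;
typedef decltype(std::declval<derived_from_crtp const &>() + 1) intact_result;
```

In this sketch, sliced_result refers to plain_expr (the derived part is lost), while intact_result refers to derived_from_crtp.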

Although not strictly necessary in this case, we bring extends<>::operator= into scope with the BOOST_PROTO_EXTENDS_USING_ASSIGN() macro. This is really only necessary if you want expressions like _1 = 3 to create a lazily evaluated assignment. proto::extends<> defines the appropriate operator= for you, but the compiler-generated calculator<>::operator= will hide it unless you make it available with the macro.

Note that in the implementation of calculator<>::operator(), we evaluate the expression with the calculator_context we defined earlier. As we saw before, the context is what gives the operators their meaning. In the case of the calculator, the context is also what defines the meaning of the placeholder terminals.

Now that we have defined the calculator<> expression wrapper, we need to wrap the placeholders to imbue them with calculator-ness:

calculator< proto::terminal< placeholder<0> >::type > const _1;
calculator< proto::terminal< placeholder<1> >::type > const _2;

Retaining POD-ness with BOOST_PROTO_EXTENDS()

To use proto::extends<>, your extension type must derive from proto::extends<>. Unfortunately, that means that your extension type is no longer POD and its instances cannot be statically initialized. (See the Static Initialization section in the Rationale appendix for why this matters.) In particular, as defined above, the global placeholder objects _1 and _2 will need to be initialized at runtime, which could lead to subtle order of initialization bugs.

There is another way to make an expression extension that doesn't sacrifice POD-ness: the BOOST_PROTO_EXTENDS() macro. You can use it much like you use proto::extends<>. We can use BOOST_PROTO_EXTENDS() to keep calculator<> a POD and our placeholders statically initialized.

// The calculator<> expression wrapper makes expressions
// function objects.
template< typename Expr >
struct calculator
{
    // Use BOOST_PROTO_EXTENDS() instead of proto::extends<> to
    // make this type a Proto expression extension.
    BOOST_PROTO_EXTENDS(Expr, calculator<Expr>, calculator_domain)

    typedef double result_type;

    result_type operator()( double d1 = 0.0, double d2 = 0.0 ) const
    {
        /* ... as before ... */
    }
};

With the new calculator<> type, we can redefine our placeholders to be statically initialized:

calculator< proto::terminal< placeholder<0> >::type > const _1 = {{{}}};
calculator< proto::terminal< placeholder<1> >::type > const _2 = {{{}}};

We need to make one additional small change to accommodate the POD-ness of our expression extension, which we'll describe below in the section on expression generators.
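To see why holding the expression as a data member preserves POD-ness while inheriting with a constructor does not, consider this minimal sketch. It uses toy stand-ins only (toy::terminal, toy::inheriting_wrapper, and toy::member_wrapper are hypothetical and are not what the Proto macros actually expand to); it merely mirrors the shape of the two approaches.

```cpp
#include <type_traits>

namespace toy
{
    template<int I> struct placeholder {};

    // Stand-in for a terminal expression: an aggregate with one child.
    template<typename T> struct terminal { T child0; };

    // Inheritance plus a user-provided constructor, similar in shape
    // to a wrapper derived from proto::extends<>. Not trivial, so it
    // cannot be statically initialized with braces.
    template<typename Expr>
    struct inheriting_wrapper : Expr
    {
        inheriting_wrapper(Expr const &e = Expr()) : Expr(e) {}
    };

    // The expression is a plain data member, similar in shape to a
    // wrapper built with BOOST_PROTO_EXTENDS(). Still an aggregate.
    template<typename Expr>
    struct member_wrapper { Expr expr; };

    typedef terminal<placeholder<0> > term0;

    // One brace level per aggregate: wrapper, terminal, placeholder.
    constexpr member_wrapper<term0> ph1 = {{{}}};
}
```

The triple braces match the {{{}}} initializers used for _1 and _2 above: each level initializes one nested aggregate.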

What does BOOST_PROTO_EXTENDS() do? It defines a data member of the expression type being extended; some nested typedefs that Proto requires; operator=, operator[] and operator() overloads for building expression templates; and a nested result<> template for calculating the return type of operator(). In this case, however, the operator() overloads and the result<> template are not needed because we are defining our own operator() in the calculator<> type. Proto provides additional macros for finer control over which member functions are defined. We could improve our calculator<> type as follows:

// The calculator<> expression wrapper makes expressions
// function objects.
template< typename Expr >
struct calculator
{
    // Use BOOST_PROTO_BASIC_EXTENDS() instead of proto::extends<> to
    // make this type a Proto expression extension:
    BOOST_PROTO_BASIC_EXTENDS(Expr, calculator<Expr>, calculator_domain)

    // Define operator[] to build expression templates:
    BOOST_PROTO_EXTENDS_SUBSCRIPT()

    // Define operator= to build expression templates:
    BOOST_PROTO_EXTENDS_ASSIGN()

    typedef double result_type;

    result_type operator()( double d1 = 0.0, double d2 = 0.0 ) const
    {
        /* ... as before ... */
    }
};

Notice that we are now using BOOST_PROTO_BASIC_EXTENDS() instead of BOOST_PROTO_EXTENDS(). This just adds the data member and the nested typedefs but not any of the overloaded operators. Those are added separately with BOOST_PROTO_EXTENDS_ASSIGN() and BOOST_PROTO_EXTENDS_SUBSCRIPT(). We are leaving out the function call operator and the nested result<> template that could have been defined with Proto's BOOST_PROTO_EXTENDS_FUNCTION() macro.

In summary, here are the macros you can use to define expression extensions, and a brief description of each.

Table 31.2. Expression Extension Macros

Macro                                                     | Purpose
BOOST_PROTO_BASIC_EXTENDS(expression, extension, domain)  | Defines a data member of type expression and some nested typedefs that Proto requires.
BOOST_PROTO_EXTENDS_ASSIGN()                              | Defines operator=. Only valid when preceded by BOOST_PROTO_BASIC_EXTENDS().
BOOST_PROTO_EXTENDS_SUBSCRIPT()                           | Defines operator[]. Only valid when preceded by BOOST_PROTO_BASIC_EXTENDS().
BOOST_PROTO_EXTENDS_FUNCTION()                            | Defines operator() and a nested result<> template for return type calculation. Only valid when preceded by BOOST_PROTO_BASIC_EXTENDS().
BOOST_PROTO_EXTENDS(expression, extension, domain)        | Equivalent to BOOST_PROTO_BASIC_EXTENDS(expression, extension, domain) followed by BOOST_PROTO_EXTENDS_ASSIGN(), BOOST_PROTO_EXTENDS_SUBSCRIPT(), and BOOST_PROTO_EXTENDS_FUNCTION().


Warning

Argument-Dependent Lookup and BOOST_PROTO_EXTENDS()

Proto's operator overloads are defined in the boost::proto namespace and are found by argument-dependent lookup (ADL). This usually just works because expressions are made up of types that live in the boost::proto namespace. However, sometimes when you use BOOST_PROTO_EXTENDS() that is not the case. Consider:

template<class T>
struct my_complex
{
    BOOST_PROTO_EXTENDS(
        typename proto::terminal<std::complex<T> >::type
      , my_complex<T>
      , proto::default_domain
    )
};

int main()
{
    my_complex<int> c0, c1;

    c0 + c1; // ERROR: operator+ not found
}

The problem has to do with how argument-dependent lookup works. The type my_complex<int> is not associated in any way with the boost::proto namespace, so the operators defined there are not considered. (Had we inherited from proto::extends<> instead of using BOOST_PROTO_EXTENDS(), we would have avoided the problem because inheriting from a type in the boost::proto namespace is enough to get ADL to kick in.)

So what can we do? By adding an extra dummy template parameter that defaults to a type in the boost::proto namespace, we can trick ADL into finding the right operator overloads. The solution looks like this:

template<class T, class Dummy = proto::is_proto_expr>
struct my_complex
{
    BOOST_PROTO_EXTENDS(
        typename proto::terminal<std::complex<T> >::type
      , my_complex<T>
      , proto::default_domain
    )
};

int main()
{
    my_complex<int> c0, c1;

    c0 + c1; // OK, operator+ found now!
}

The type proto::is_proto_expr is nothing but an empty struct, but by making it a template parameter we make boost::proto an associated namespace of my_complex<int>. Now ADL can successfully find Proto's operator overloads.
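The ADL rule at work here is a general C++ rule: the namespaces of a class template's type arguments become associated namespaces of the specialization. The sketch below shows the effect in isolation. All names are hypothetical stand-ins (namespace lib plays the role of boost::proto, lib::adl_tag the role of proto::is_proto_expr), and can_add is just a detection helper written for this example.

```cpp
#include <type_traits>
#include <utility>

namespace lib // stands in for boost::proto
{
    struct adl_tag {}; // plays the role of proto::is_proto_expr

    struct sum {};

    // An operator that can only be found through ADL from outside lib.
    template<typename L, typename R>
    sum operator+(L const &, R const &) { return sum(); }
}

namespace user
{
    // No connection to namespace lib at all.
    template<typename T>
    struct plain {};

    // The defaulted dummy parameter makes lib an associated namespace.
    template<typename T, typename Dummy = lib::adl_tag>
    struct hooked {};
}

// can_add<T>::value is true when "a + b" compiles for two T objects.
template<typename T, typename = void>
struct can_add : std::false_type {};

template<typename T>
struct can_add<
    T
  , decltype(void(std::declval<T const &>() + std::declval<T const &>()))
> : std::true_type {};
```

With these definitions, can_add<user::plain<int> > is false because lib::operator+ is never considered, while can_add<user::hooked<int> > is true.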

The last thing that remains to be done is to tell Proto that it needs to wrap all of our calculator expressions in our calculator<> wrapper. We have already wrapped the placeholders, but we want all expressions that involve the calculator placeholders to be calculators. We can do that by specifying an expression generator when we define our calculator_domain, as follows:

// Define the calculator_domain we forward-declared above.
// Specify that all expressions in this domain should be wrapped
// in the calculator<> expression wrapper.
struct calculator_domain
  : proto::domain< proto::generator< calculator > >
{};

The first template parameter to proto::domain<> is the generator. "Generator" is just a fancy name for a function object that accepts an expression and does something to it. proto::generator<> is a very simple one --- it wraps an expression in the wrapper you specify. proto::domain<> inherits from its generator parameter, so all domains are themselves function objects.

If we used BOOST_PROTO_EXTENDS() to keep our expression extension type POD, then we need to use proto::pod_generator<> instead of proto::generator<>, as follows:

// If calculator<> uses BOOST_PROTO_EXTENDS() instead of
// proto::extends<>, use proto::pod_generator<> instead
// of proto::generator<>.
struct calculator_domain
  : proto::domain< proto::pod_generator< calculator > >
{};

After Proto has calculated a new expression type, it checks the domains of the child expressions. They must match. Assuming they do, Proto creates the new expression and passes it to Domain::operator() for any additional processing. If we don't specify a generator, the new expression gets passed through unchanged. But since we've specified a generator above, calculator_domain::operator() returns calculator<> objects.
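The relationship between a domain and its generator can be modeled in a few lines. This is a simplified sketch with toy types, not Proto's implementation: toy::generator mimics the role of proto::generator<>, toy::default_generator is the pass-through case, and toy::raw_expr and toy::wrapper are hypothetical.

```cpp
#include <type_traits>

namespace toy
{
    // A bare expression node.
    struct raw_expr { int payload; };

    // A wrapper that adds a behavior (here, a call operator).
    template<typename Expr>
    struct wrapper
    {
        Expr expr;
        constexpr int operator()() const { return expr.payload; }
    };

    // Pass-through generator: returns the expression unchanged.
    struct default_generator
    {
        template<typename Expr>
        constexpr Expr operator()(Expr const &e) const { return e; }
    };

    // Wrapping generator: wraps every expression in Wrapper<>.
    template<template<typename> class Wrapper>
    struct generator
    {
        template<typename Expr>
        constexpr Wrapper<Expr> operator()(Expr const &e) const
        { return Wrapper<Expr>{e}; }
    };

    // A domain inherits from its generator, so it is itself callable.
    template<typename Generator = default_generator>
    struct domain : Generator {};

    struct plain_domain : domain<> {};
    struct wrapped_domain : domain<generator<wrapper> > {};
}
```

Calling toy::plain_domain() on a raw_expr hands it back unchanged, while toy::wrapped_domain() returns a wrapper<raw_expr> that carries the extra call operator.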

Now we can use calculator expressions as function objects to STL algorithms, as follows:

double data[] = {1., 2., 3., 4.};

// Use the calculator EDSL to square each element ... WORKS! :-)
std::transform( data, data + 4, data, _1 * _1 );

By default, Proto defines every possible operator overload for Protofied expressions. This makes it simple to bang together an EDSL. In some cases, however, the presence of Proto's promiscuous overloads can lead to confusion or worse. When that happens, you'll have to disable some of Proto's overloaded operators. That is done by defining the grammar for your domain and specifying it as the second parameter of the proto::domain<> template.

In the Hello Calculator section, we saw an example of a Proto grammar, which is repeated here:

// Define the grammar of calculator expressions
struct calculator_grammar
  : proto::or_<
        proto::plus< calculator_grammar, calculator_grammar >
      , proto::minus< calculator_grammar, calculator_grammar >
      , proto::multiplies< calculator_grammar, calculator_grammar >
      , proto::divides< calculator_grammar, calculator_grammar >
      , proto::terminal< proto::_ >
    >
{};

We'll have much more to say about grammars in subsequent sections, but for now, we'll just say that the calculator_grammar struct describes a subset of all expression types -- the subset that comprises valid calculator expressions. We would like to prohibit Proto from creating a calculator expression that does not conform to this grammar. We do that by changing the definition of the calculator_domain struct.

// Define the calculator_domain. Expressions in the calculator
// domain are wrapped in the calculator<> wrapper, and they must
// conform to the calculator_grammar:
struct calculator_domain
  : proto::domain< proto::generator< calculator >, calculator_grammar  >
{};

The only new addition is calculator_grammar as the second template parameter to the proto::domain<> template. That has the effect of disabling any of Proto's operator overloads that would create an invalid calculator expression.
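One way to picture how a grammar can switch operators off is with SFINAE: an overload is only viable when the node it would create matches the grammar. The sketch below is an assumption-laden toy model, not Proto's actual mechanism. toy::matches is a hand-written trait that accepts terminals and plus nodes only, and can_plus and can_minus are detection helpers written for this example.

```cpp
#include <type_traits>
#include <utility>

namespace toy
{
    struct plus_tag {};
    struct minus_tag {};
    struct terminal {};

    template<typename Tag, typename L, typename R>
    struct binary {};

    // Toy grammar: terminals are valid, and so are plus nodes whose
    // children are valid. Everything else is rejected.
    template<typename E> struct matches : std::false_type {};
    template<> struct matches<terminal> : std::true_type {};
    template<typename L, typename R>
    struct matches<binary<plus_tag, L, R> >
      : std::integral_constant<bool, matches<L>::value && matches<R>::value>
    {};

    // Each operator exists only if its result matches the grammar.
    template<typename L, typename R>
    typename std::enable_if<
        matches<binary<plus_tag, L, R> >::value, binary<plus_tag, L, R>
    >::type
    operator+(L const &, R const &) { return {}; }

    template<typename L, typename R>
    typename std::enable_if<
        matches<binary<minus_tag, L, R> >::value, binary<minus_tag, L, R>
    >::type
    operator-(L const &, R const &) { return {}; }
}

template<typename T, typename = void>
struct can_plus : std::false_type {};
template<typename T>
struct can_plus<T, decltype(void(std::declval<T const &>() + std::declval<T const &>()))>
  : std::true_type {};

template<typename T, typename = void>
struct can_minus : std::false_type {};
template<typename T>
struct can_minus<T, decltype(void(std::declval<T const &>() - std::declval<T const &>()))>
  : std::true_type {};
```

Because the toy grammar has no rule for minus nodes, operator- drops out of overload resolution and t - t fails to compile, while t + t still works.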

Another common use for this feature would be to disable Proto's unary operator& overload. It may be surprising for users of your EDSL that they cannot take the address of their expressions! You can very easily disable Proto's unary operator& overload for your domain with a very simple grammar, as below:

// For expressions in my_domain, disable Proto's
// unary address-of operator.
struct my_domain
  : proto::domain<
        proto::generator< my_wrapper >
        // A simple grammar that matches any expression that
        // is not a unary address-of expression.
      , proto::not_< proto::address_of< proto::_ > >
    >
{};

The type proto::not_< proto::address_of< proto::_ > > is a very simple grammar that matches all expressions except unary address-of expressions. In the section describing Proto's intermediate form, we'll have much more to say about grammars.

Note

This is an advanced topic. Feel free to skip this if you're just getting started with Proto.

Proto's operator overloads build expressions from sub-expressions. The sub-expressions become children of the new expression. By default, the children are stored in the parent by reference. This section describes how to change that default.

Primer: as_child vs. as_expr

Proto lets you independently customize the behavior of proto::as_child() and proto::as_expr(). Both accept an object x and return a Proto expression by turning x into a Proto terminal if necessary. Although similar, the two functions are used in different situations and have subtly different behavior by default. It's important to understand the difference so that you know which to customize to achieve the behavior you want.

To wit: proto::as_expr() is typically used by you to turn an object into a Proto expression that is to be held in a local variable, like so:

auto l = proto::as_expr(x); // Turn x into a Proto expression, hold the result in a local

The above works regardless of whether x is already a Proto expression or not. The object l is guaranteed to be a valid Proto expression. If x is a non-Proto object, it is turned into a terminal expression that holds x by value.[30] If x is a Proto object already, proto::as_expr() returns it by value unmodified.

In contrast, proto::as_child() is used internally by Proto to pre-process objects before making them children of another expression. Since it's internal to Proto, you don't see it explicitly, but it's there behind the scenes in expressions like this:

x + y; // Consider that y is a Proto expression, but x may or may not be.

In this case, Proto builds a plus node from the two children. Both are pre-processed by passing them to proto::as_child() before making them children of the new node. If x is not a Proto expression, it becomes one by being wrapped in a Proto terminal that holds it by reference. If x is already a Proto expression, proto::as_child() returns it by reference unmodified. Contrast this with the above description for proto::as_expr().

The table below summarizes the above description.

Table 31.3. proto::as_expr() vs. proto::as_child()

Function            | When t is not a Proto expr...                                   | When t is a Proto expr...
proto::as_expr(t)   | Return (by value) a new Proto terminal holding t by value.      | Return t by value unmodified.
proto::as_child(t)  | Return (by value) a new Proto terminal holding t by reference.  | Return t by reference unmodified.
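The default behaviors in the table can be mimicked with a pair of toy functions. This is only a sketch of the return types involved: toy::terminal, toy::is_expr, as_expr_like, and as_child_like are hypothetical names, and Proto's real functions handle many more cases.

```cpp
#include <type_traits>
#include <utility>

namespace toy
{
    template<typename T> struct terminal { T value; };

    template<typename T> struct is_expr : std::false_type {};
    template<typename T> struct is_expr<terminal<T> > : std::true_type {};

    // as_expr-like: non-expressions are copied into a new terminal,
    // expressions are returned by value.
    template<typename T>
    typename std::enable_if<!is_expr<T>::value, terminal<T> >::type
    as_expr_like(T const &t) { return terminal<T>{t}; }

    template<typename T>
    typename std::enable_if<is_expr<T>::value, T>::type
    as_expr_like(T const &t) { return t; }

    // as_child-like: non-expressions are referenced from a new
    // terminal, expressions are returned by reference.
    template<typename T>
    typename std::enable_if<!is_expr<T>::value, terminal<T const &> >::type
    as_child_like(T const &t) { return terminal<T const &>{t}; }

    template<typename T>
    typename std::enable_if<is_expr<T>::value, T const &>::type
    as_child_like(T const &t) { return t; }

    // The four cells of the table, as types:
    typedef decltype(as_expr_like(42)) expr_of_int;
    typedef decltype(as_child_like(42)) child_of_int;
    typedef decltype(as_expr_like(std::declval<terminal<int> const &>())) expr_of_term;
    typedef decltype(as_child_like(std::declval<terminal<int> const &>())) child_of_term;
}
```

Note that child_of_int holds an int const &, which is exactly the kind of reference that can dangle if the terminal outlives the object it refers to.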


Note

There is one important place where Proto uses both as_expr and as_child: proto::make_expr(). The proto::make_expr() function requires you to specify for each child whether it should be held by value or by reference. Proto uses proto::as_expr() to pre-process the children to be held by value, and proto::as_child() for the ones to be held by reference.

Now that you know what proto::as_child() and proto::as_expr() are, where they are used, and what they do by default, you may decide that one or both of these functions should have different behavior for your domain. For instance, given the above description of proto::as_child(), the following code is always wrong:

proto::literal<int> i(0);
auto l = i + 42; // This is WRONG! Don't do this.

Why is this wrong? Because proto::as_child() will turn the integer literal 42 into a Proto terminal that holds a reference to a temporary integer initialized with 42. The lifetime of that temporary ends at the semicolon, guaranteeing that the local l is left holding a dangling reference to a deceased integer. What to do? One answer is to use proto::deep_copy(). Another is to customize the behavior of proto::as_child() for your domain. Read on for the details.

Per-Domain as_child

To control how Proto builds expressions out of sub-expressions in your domain, define your domain as usual, and then define a nested as_child<> class template within it, as follows:

struct my_domain
  : proto::domain< my_generator, my_grammar >
{
    // Here is where you define how Proto should handle
    // sub-expressions that are about to be glommed into
    // a larger expression.
    template< typename T >
    struct as_child
    {
        typedef unspecified-Proto-expr-type result_type;

        result_type operator()( T & t ) const
        {
            return unspecified-Proto-expr-object;
        }
    };
};

There's one important thing to note: in the above code, the template parameter T may or may not be a Proto expression type, but the result must be a Proto expression type, or a reference to one. That means that most user-defined as_child<> templates will need to check whether T is an expression or not (using proto::is_expr<>), and then turn non-expressions into Proto terminals by wrapping them as proto::terminal< /* ... */ >::type or equivalent.

Per-Domain as_expr

Although less common, Proto also lets you customize the behavior of proto::as_expr() on a per-domain basis. The technique is identical to that for as_child. See below:

struct my_domain
  : proto::domain< my_generator, my_grammar >
{
    // Here is where you define how Proto should handle
    // objects that are to be turned into expressions
    // fit for storage in local variables.
    template< typename T >
    struct as_expr
    {
        typedef unspecified-Proto-expr-type result_type;

        result_type operator()( T & t ) const
        {
            return unspecified-Proto-expr-object;
        }
    };
};

Making Proto Expressions auto-safe

Let's look again at the problem described above involving the C++11 auto keyword and the default behavior of proto::as_child().

proto::literal<int> i(0);
auto l = i + 42; // This is WRONG! Don't do this.

Recall that the problem is the lifetime of the temporary integer created to hold the value 42. The local l will be left holding a dangling reference to it after its lifetime is over. What if we want Proto to make expressions safe to store this way in local variables? We can do so very easily by making proto::as_child() behave just like proto::as_expr(). The following code achieves this:

template< typename E >
struct my_expr;

struct my_generator
  : proto::pod_generator< my_expr >
{};

struct my_domain
  : proto::domain< my_generator >
{
     // Make as_child() behave like as_expr() in my_domain.
     // (proto_base_domain is a typedef for proto::domain< my_generator >
     // that is defined in proto::domain<>.)
     template< typename T >
     struct as_child
       : proto_base_domain::as_expr< T >
     {};
};

template< typename E >
struct my_expr
{
    BOOST_PROTO_EXTENDS( E, my_expr< E >, my_domain )
};

/* ... */

proto::literal< int, my_domain > i(0);
auto l = i + 42; // OK! Everything is stored by value here.

Notice that my_domain::as_child<> simply defers to the default implementation of as_expr<> found in proto::domain<>. By simply cross-wiring our domain's as_child<> to as_expr<>, we guarantee that all terminals that can be held by value are, and that all child expressions are also held by value. This increases copying and may incur a runtime performance cost, but it eliminates any specter of lifetime management issues.
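The effect of a per-domain capture policy can be sketched with toy types. This is a simplified model under stated assumptions, not Proto code: the two toy domains expose a nested as_child<> metafunction, toy::literal consults it in operator+, and only the right-hand, non-expression operand is treated (the left operand is always copied to keep the sketch short).

```cpp
#include <type_traits>

namespace toy
{
    template<typename T> struct terminal { T value; };

    // Two capture policies, selected per "domain".
    struct by_reference_domain
    {
        template<typename T>
        struct as_child { typedef terminal<T const &> type; };
    };

    struct by_value_domain
    {
        template<typename T>
        struct as_child { typedef terminal<T> type; };
    };

    template<typename Left, typename Right>
    struct plus_node { Left left; Right right; };

    template<typename Domain>
    struct literal
    {
        int value;

        // The domain decides how the right operand is stored.
        template<typename R>
        constexpr plus_node<literal, typename Domain::template as_child<R>::type>
        operator+(R const &r) const
        {
            return plus_node<literal, typename Domain::template as_child<R>::type>{
                *this, typename Domain::template as_child<R>::type{r}};
        }
    };
}
```

With toy::by_value_domain, the node produced by literal{0} + 42 owns a copy of 42 and can safely outlive the full expression. With toy::by_reference_domain, it would hold a reference to a temporary.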

For another example, see the definition of lldomain in libs/proto/example/lambda.hpp. That example is a complete reimplementation of the Boost Lambda Library (BLL) on top of Boost.Proto. The function objects the BLL generates are safe to be stored in local variables. To emulate this with Proto, the lldomain cross-wires as_child<> to as_expr<> as above, but with one extra twist: objects with array type are also stored by reference. Check it out.

Note

This is an advanced topic. Feel free to skip this if you're just getting started with Proto.

The ability to compose different EDSLs is one of their most exciting features. Consider how you build a parser using yacc. You write your grammar rules in yacc's domain-specific language. Then you embed semantic actions written in C within your grammar. Boost's Spirit parser generator gives you the same ability. You write grammar rules using Spirit.Qi and embed semantic actions using the Phoenix library. Phoenix and Spirit are both Proto-based domain-specific languages with their own distinct syntax and semantics. But you can freely embed Phoenix expressions within Spirit expressions. This section describes Proto's sub-domain feature that lets you define families of interoperable domains.

Dueling Domains

When you try to create an expression from two sub-expressions in different domains, what is the domain of the resulting expression? This is the fundamental problem that is addressed by sub-domains. Consider the following code:

#include <boost/proto/proto.hpp>
namespace proto = boost::proto;

// Forward-declare two expression wrappers
template<typename E> struct spirit_expr;
template<typename E> struct phoenix_expr;

// Define two domains
struct spirit_domain  : proto::domain<proto::generator<spirit_expr> > {};
struct phoenix_domain : proto::domain<proto::generator<phoenix_expr> > {};

// Implement the two expression wrappers
template<typename E>
struct spirit_expr
  : proto::extends<E, spirit_expr<E>, spirit_domain>
{
    spirit_expr(E const &e = E()) : spirit_expr::proto_extends(e) {}
};

template<typename E>
struct phoenix_expr
  : proto::extends<E, phoenix_expr<E>, phoenix_domain>
{
    phoenix_expr(E const &e = E()) : phoenix_expr::proto_extends(e) {}
};

int main()
{
    proto::literal<int, spirit_domain> sp(0);
    proto::literal<int, phoenix_domain> phx(0);

    // Whoops! What does it mean to add two expressions in different domains?
    sp + phx; // ERROR
}

Above, we define two domains called spirit_domain and phoenix_domain and declare an int literal in each. Then we try to compose them into a larger expression using Proto's binary plus operator, and it fails. Proto can't figure out whether the resulting expression should be in the Spirit domain or the Phoenix domain, and thus whether it should be an instance of spirit_expr<> or phoenix_expr<>. We have to tell Proto how to resolve the conflict. We can do that by declaring that Phoenix is a sub-domain of Spirit as in the following definition of phoenix_domain:

// Declare that phoenix_domain is a sub-domain of spirit_domain
struct phoenix_domain
  : proto::domain<proto::generator<phoenix_expr>, proto::_, spirit_domain>
{};

The third template parameter to proto::domain<> is the super-domain. By defining phoenix_domain as above, we are saying that Phoenix expressions can be combined with Spirit expressions, and that when that happens, the resulting expression should be a Spirit expression.

Note

If you are wondering what the purpose of proto::_ is in the definition of phoenix_domain above, recall that the second template parameter to proto::domain<> is the domain's grammar. proto::_ is the default and signifies that the domain places no restrictions on the expressions that are valid within it.

Domain Resolution

When there are multiple domains in play within a given expression, Proto uses some rules to figure out which domain "wins". The rules are loosely modeled on the rules for C++ inheritance. phoenix_domain is a sub-domain of spirit_domain. You can liken that to a derived/base relationship that gives Phoenix expressions a kind of implicit conversion to Spirit expressions. And since Phoenix expressions can be "converted" to Spirit expressions, they can be freely combined with Spirit expressions and the result is a Spirit expression.

Note

Super- and sub-domains are not actually implemented using inheritance. This is only a helpful mental model.

The analogy with inheritance holds even in the case of three domains when two are sub-domains of the third. Imagine another domain called foobar_domain that was also a sub-domain of spirit_domain. Expressions in the foobar_domain could be combined with expressions in the phoenix_domain and the resulting expression would be in the spirit_domain. That's because expressions in the two sub-domains both have "conversions" to the super-domain, so the operation is allowed and the super-domain wins.
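The "nearest common super-domain" idea can be written down as a small metafunction. The sketch below is a toy model of the rule as described here, not Proto's implementation, and it deliberately ignores the special role of the default domain. All names (toy::domain, toy::reaches, toy::common_domain, toy::none) are hypothetical.

```cpp
#include <type_traits>

namespace toy
{
    // Marks the end of a super-domain chain, and "no common domain".
    struct none {};

    template<typename Super = none>
    struct domain { typedef Super super; };

    struct spirit_like  : domain<> {};
    struct phoenix_like : domain<spirit_like> {};
    struct foobar_like  : domain<spirit_like> {};
    struct unrelated    : domain<> {};

    // reaches<D, Target>: Target is D itself or one of its super-domains.
    template<typename D, typename Target>
    struct reaches
      : std::conditional<
            std::is_same<D, Target>::value
          , std::true_type
          , reaches<typename D::super, Target>
        >::type
    {};

    template<typename Target>
    struct reaches<none, Target> : std::false_type {};

    // Walk up from D1 until a domain is found that D2 also reaches.
    template<typename D1, typename D2, bool = reaches<D2, D1>::value>
    struct common_domain : common_domain<typename D1::super, D2> {};

    template<typename D1, typename D2>
    struct common_domain<D1, D2, true> { typedef D1 type; };

    template<typename D2>
    struct common_domain<none, D2, false> { typedef none type; };
}
```

In this model, combining phoenix_like with foobar_like resolves to spirit_like, and combining two unrelated domains resolves to none, which corresponds to the error case.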

The Default Domain

When you don't assign a Proto expression to a particular domain, Proto considers it a member of the so-called default domain, proto::default_domain. Even non-Proto objects are treated as terminals in the default domain. Consider:

int main()
{
    proto::literal<int, spirit_domain> sp(0);

    // Add 1 to a spirit expression. Result is a spirit expression.
    sp + 1;
}

Expressions in the default domain (or non-expressions like 1) have a kind of implicit conversion to expressions in every other domain. What's more, you can define your domain to be a sub-domain of the default domain. In so doing, you give expressions in your domain conversions to expressions in every other domain. This is like a free love domain, because it will freely mix with all other domains.

Let's think again about the Phoenix EDSL. Since it provides generally useful lambda functionality, it's reasonable to assume that lots of other EDSLs besides Spirit might want the ability to embed Phoenix expressions. In other words, phoenix_domain should be a sub-domain of proto::default_domain, not spirit_domain:

// Declare that phoenix_domain is a sub-domain of proto::default_domain
struct phoenix_domain
  : proto::domain<proto::generator<phoenix_expr>, proto::_, proto::default_domain>
{};

That's much better. Phoenix expressions can now be put anywhere.

Sub-Domain Summary

Use Proto sub-domains to make it possible to mix expressions from multiple domains. And when you want expressions in your domain to freely combine with all expressions, make it a sub-domain of proto::default_domain.

The preceding discussions of defining Proto front ends have all made a big assumption: that you have the luxury of defining everything from scratch. What happens if you have existing types, say a matrix type and a vector type, that you would like to treat as if they were Proto terminals? Proto usually trades only in its own expression types, but with BOOST_PROTO_DEFINE_OPERATORS(), it can accommodate your custom terminal types, too.

Let's say, for instance, that you have the following types and that you can't modify them to make them native Proto terminal types.

namespace math
{
    // A matrix type ...
    struct matrix { /*...*/ };

    // A vector type ...
    struct vector { /*...*/ };
}

You can non-intrusively make objects of these types Proto terminals by defining the proper operator overloads using BOOST_PROTO_DEFINE_OPERATORS(). The basic procedure is as follows:

  1. Define a trait that returns true for your types and false for all others.
  2. Reopen the namespace of your types and use BOOST_PROTO_DEFINE_OPERATORS() to define a set of operator overloads, passing the name of the trait as the first macro parameter, and the name of a Proto domain (e.g., proto::default_domain) as the second.

The following code demonstrates how it works.

namespace math
{
    template<typename T>
    struct is_terminal
      : mpl::false_
    {};

    // OK, "matrix" is a custom terminal type
    template<>
    struct is_terminal<matrix>
      : mpl::true_
    {};

    // OK, "vector" is a custom terminal type
    template<>
    struct is_terminal<vector>
      : mpl::true_
    {};

    // Define all the operator overloads to construct Proto
    // expression templates, treating "matrix" and "vector"
    // objects as if they were Proto terminals.
    BOOST_PROTO_DEFINE_OPERATORS(is_terminal, proto::default_domain)
}

The invocation of the BOOST_PROTO_DEFINE_OPERATORS() macro defines a complete set of operator overloads that treat matrix and vector objects as if they were Proto terminals. And since the operators are defined in the same namespace as the matrix and vector types, the operators will be found by argument-dependent lookup. With the code above, we can now construct expression templates with matrices and vectors, as shown below.

math::matrix m1;
math::vector v1;
proto::literal<int> i(0);

m1 * 1;  // custom terminal and literals are OK
m1 * i;  // custom terminal and Proto expressions are OK
m1 * v1; // two custom terminals are OK, too.
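The two-step recipe (a trait, then trait-gated operators in the types' own namespace) can be reproduced in miniature without Proto. The following is a hand-written sketch of the idea for a single operator. It uses std::true_type and std::false_type in place of the MPL types, and multiplies_node is a hypothetical stand-in for the expression Proto would build. It does not show what the macro actually expands to.

```cpp
#include <type_traits>

namespace math
{
    struct matrix {};
    struct vector {};

    // Step 1: a trait that is true only for the custom terminal types.
    template<typename T> struct is_terminal : std::false_type {};
    template<> struct is_terminal<matrix> : std::true_type {};
    template<> struct is_terminal<vector> : std::true_type {};

    // Toy node standing in for a Proto multiplies expression.
    template<typename L, typename R>
    struct multiplies_node {};

    // Step 2: an operator in the same namespace as the types, so ADL
    // finds it, enabled only when an operand is a custom terminal.
    template<typename L, typename R>
    typename std::enable_if<
        is_terminal<L>::value || is_terminal<R>::value
      , multiplies_node<L, R>
    >::type
    operator*(L const &, R const &) { return {}; }
}
```

Because the overload is constrained by the trait, expressions such as 2 * 3 are unaffected, while math::matrix() * 1 and math::matrix() * math::vector() both build a node.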

Sometimes as an EDSL designer, to make the lives of your users easy, you have to make your own life hard. Giving your users natural and flexible syntax often involves writing large numbers of repetitive function overloads. It can be enough to give you repetitive stress injury! Before you hurt yourself, check out the macros Proto provides for automating many repetitive code-generation chores.

Imagine that we are writing a lambda EDSL, and we would like to enable syntax for constructing temporary objects of any type using the following syntax:

// A lambda expression that takes two arguments and
// uses them to construct a temporary std::complex<>
construct< std::complex<int> >( _1, _2 )

For the sake of the discussion, imagine that we already have a function object template construct_impl<> that accepts arguments and constructs new objects from them. We would want the above lambda expression to be equivalent to the following:

// The above lambda expression should be roughly equivalent
// to the following:
proto::make_expr<proto::tag::function>(
    construct_impl<std::complex<int> >() // The function to invoke lazily
  , boost::ref(_1)                       // The first argument to the function
  , boost::ref(_2)                       // The second argument to the function
);

We can define our construct() function template as follows:

template<typename T, typename A0, typename A1>
typename proto::result_of::make_expr<
    proto::tag::function
  , construct_impl<T>
  , A0 const &
  , A1 const &
>::type const
construct(A0 const &a0, A1 const &a1)
{
    return proto::make_expr<proto::tag::function>(
        construct_impl<T>()
      , boost::ref(a0)
      , boost::ref(a1)
    );
}

This works for two arguments, but we would like it to work for any number of arguments, up to (BOOST_PROTO_MAX_ARITY - 1). (Why "- 1"? Because one child is taken up by the construct_impl<T>() terminal, leaving room for only (BOOST_PROTO_MAX_ARITY - 1) other children.)

For cases like this, Proto provides the BOOST_PROTO_REPEAT() and BOOST_PROTO_REPEAT_FROM_TO() macros. To use them, we turn the function definition above into a macro as follows:

#define M0(N, typename_A, A_const_ref, A_const_ref_a, ref_a)  \
template<typename T, typename_A(N)>                           \
typename proto::result_of::make_expr<                         \
    proto::tag::function                                      \
  , construct_impl<T>                                         \
  , A_const_ref(N)                                            \
>::type const                                                 \
construct(A_const_ref_a(N))                                   \
{                                                             \
    return proto::make_expr<proto::tag::function>(            \
        construct_impl<T>()                                   \
      , ref_a(N)                                              \
    );                                                        \
}

Notice that we turned the function into a macro that takes 5 arguments. The first is the current iteration number. The rest are the names of other macros that generate different sequences. For instance, Proto passes as the second parameter the name of a macro that will expand to typename A0, typename A1, ....

Now that we have turned our function into a macro, we can pass the macro to BOOST_PROTO_REPEAT_FROM_TO(). Proto will invoke it iteratively, generating all the function overloads for us.

// Generate overloads of construct() that accept from
// 1 to BOOST_PROTO_MAX_ARITY-1 arguments:
BOOST_PROTO_REPEAT_FROM_TO(1, BOOST_PROTO_MAX_ARITY, M0)
#undef M0

Non-Default Sequences

As mentioned above, Proto passes as the last 4 arguments to your macro the names of other macros that generate various sequences. The macros BOOST_PROTO_REPEAT() and BOOST_PROTO_REPEAT_FROM_TO() select defaults for these parameters. If the defaults do not meet your needs, you can use BOOST_PROTO_REPEAT_EX() and BOOST_PROTO_REPEAT_FROM_TO_EX() and pass different macros that generate different sequences. Proto defines a number of such macros for use as parameters to BOOST_PROTO_REPEAT_EX() and BOOST_PROTO_REPEAT_FROM_TO_EX(). Check the reference section for boost/proto/repeat.hpp for all the details.

Also, check out BOOST_PROTO_LOCAL_ITERATE(). It works similarly to BOOST_PROTO_REPEAT() and friends, but it can be easier to use when you want to change one macro argument and accept defaults for the others.

By now, you know a bit about how to build a front end for your EDSL "compiler" -- you can define terminals and functions that generate expression templates. But we haven't said anything about the expression templates themselves. What do they look like? What can you do with them? In this section we'll see.

The expr<> Type

Every Proto expression is an instantiation of a template called proto::expr<> (or a wrapper around such an instantiation). When we define a terminal as below, we are really initializing an instance of the proto::expr<> template.

// Define a placeholder type
template<int I>
struct placeholder
{};

// Define the Protofied placeholder terminal
proto::terminal< placeholder<0> >::type const _1 = {{}};

The actual type of _1 looks like this:

proto::expr< proto::tag::terminal, proto::term< placeholder<0> >, 0 >

The proto::expr<> template is the most important type in Proto. Although you will rarely need to deal with it directly, it's always there behind the scenes holding your expression trees together. In fact, proto::expr<> is the expression tree -- branches, leaves and all.

The proto::expr<> template makes up the nodes in expression trees. The first template parameter is the node type; in this case, proto::tag::terminal. That means that _1 is a leaf-node in the expression tree. The second template parameter is a list of child types, or in the case of terminals, the terminal's value type. Terminals will always have only one type in the type list. The last parameter is the arity of the expression. Terminals have arity 0, unary expressions have arity 1, etc.

The proto::expr<> struct is defined as follows:

template< typename Tag, typename Args, long Arity = Args::arity >
struct expr;

template< typename Tag, typename Args >
struct expr< Tag, Args, 1 >
{
    typedef typename Args::child0 proto_child0;
    proto_child0 child0;
    // ...
};

The proto::expr<> struct does not define a constructor, or anything else that would prevent static initialization. All proto::expr<> objects are initialized using aggregate initialization, with curly braces. In our example, _1 is initialized with the initializer {{}}. The outer braces are the initializer for the proto::expr<> struct, and the inner braces are for the member _1.child0 which is of type placeholder<0>. Note that we use braces to initialize _1.child0 because placeholder<0> is also an aggregate.

Building Expression Trees

The _1 node is an instantiation of proto::expr<>, and expressions containing _1 are also instantiations of proto::expr<>. To use Proto effectively, you don't need to concern yourself with the actual types that Proto generates. They are implementation details, but you are likely to encounter them in compiler error messages, so it helps to be familiar with them. The types look like this:

// The type of the expression -_1
typedef
    proto::expr<
        proto::tag::negate
      , proto::list1<
            proto::expr<
                proto::tag::terminal
              , proto::term< placeholder<0> >
              , 0
            > const &
        >
      , 1
    >
negate_placeholder_type;

negate_placeholder_type x = -_1;

// The type of the expression _1 + 42
typedef
    proto::expr<
        proto::tag::plus
      , proto::list2<
            proto::expr<
                proto::tag::terminal
              , proto::term< placeholder<0> >
              , 0
            > const &
          , proto::expr<
                proto::tag::terminal
              , proto::term< int const & >
              , 0
            >
        >
      , 2
    >
placeholder_plus_int_type;

placeholder_plus_int_type y = _1 + 42;

There are a few things to note about these types:

  • Terminals have arity zero, unary expressions have arity one and binary expressions have arity two.
  • When one Proto expression is made a child node of another Proto expression, it is held by reference, even if it is a temporary object. This last point becomes important later.
  • Non-Proto expressions, such as the integer literal, are turned into Proto expressions by wrapping them in new expr<> terminal objects. These new wrappers are not themselves held by reference, but the object wrapped is. Notice that the type of the Protofied 42 literal is int const & -- held by reference.

The types make it clear: child expressions and wrapped terminal values in a Proto expression tree are held by reference. That means that building an expression tree is exceptionally cheap. It involves no copying at all.

Note

An astute reader will notice that the object y defined above will be left holding a dangling reference to a temporary int. In the sorts of high-performance applications Proto addresses, it is typical to build and evaluate an expression tree before any temporary objects go out of scope, so this dangling reference situation often doesn't arise, but it is certainly something to be aware of. Proto provides utilities for deep-copying expression trees so they can be passed around as value types without concern for dangling references.

After assembling an expression into a tree, you'll naturally want to be able to do the reverse, and access a node's children. You may even want to be able to iterate over the children with algorithms from the Boost.Fusion library. This section shows how.

Getting Expression Tags and Arities

Every node in an expression tree has both a tag type that describes the node, and an arity corresponding to the number of child nodes it has. You can use the proto::tag_of<> and proto::arity_of<> metafunctions to fetch them. Consider the following:

template<typename Expr>
void check_plus_node(Expr const &)
{
    // Assert that the tag type is proto::tag::plus
    BOOST_STATIC_ASSERT((
        boost::is_same<
            typename proto::tag_of<Expr>::type
          , proto::tag::plus
        >::value
    ));

    // Assert that the arity is 2
    BOOST_STATIC_ASSERT( proto::arity_of<Expr>::value == 2 );
}

// Create a binary plus node and use check_plus_node()
// to verify its tag type and arity:
check_plus_node( proto::lit(1) + 2 );

For a given type Expr, you could access the tag and arity directly as Expr::proto_tag and Expr::proto_arity, where Expr::proto_arity is an MPL Integral Constant.

Getting Terminal Values

There is no simpler expression than a terminal, and no more basic operation than extracting its value. As we've already seen, that is what proto::value() is for.

proto::terminal< std::ostream & >::type cout_ = {std::cout};

// Get the value of the cout_ terminal:
std::ostream & sout = proto::value( cout_ );

// Assert that we got back what we put in:
assert( &sout == &std::cout );

To compute the return type of the proto::value() function, you can use proto::result_of::value<>. When the parameter to proto::result_of::value<> is a non-reference type, the result type of the metafunction is the type of the value as suitable for storage by value; that is, top-level reference and qualifiers are stripped from it. But when instantiated with a reference type, the result type has a reference added to it, yielding a type suitable for storage by reference. If you want to know the actual type of the terminal's value including whether it is stored by value or reference, you can use fusion::result_of::value_at<Expr, 0>::type.

The following table summarizes the above paragraph.

Table 31.4. Accessing Value Types

Metafunction Invocation                      When the Value Type Is ...  The Result Is ...
-------------------------------------------  --------------------------  -----------------
proto::result_of::value<Expr>::type          T                           typename boost::remove_const<typename boost::remove_reference<T>::type>::type [a]
proto::result_of::value<Expr &>::type        T                           typename boost::add_reference<T>::type
proto::result_of::value<Expr const &>::type  T                           typename boost::add_reference<typename boost::add_const<T>::type>::type
fusion::result_of::value_at<Expr, 0>::type   T                           T

[a] If T is a reference-to-function type, then the result type is simply T.


Getting Child Expressions

Each non-terminal node in an expression tree corresponds to an operator in an expression, and the children correspond to the operands, or arguments of the operator. To access them, you can use the proto::child_c() function template, as demonstrated below:

proto::terminal<int>::type i = {42};

// Get the 0-th operand of an addition operation:
proto::terminal<int>::type &ri = proto::child_c<0>( i + 2 );

// Assert that we got back what we put in:
assert( &i == &ri );

You can use the proto::result_of::child_c<> metafunction to get the type of the Nth child of an expression node. Usually you don't care to know whether a child is stored by value or by reference, so when you ask for the type of the Nth child of an expression Expr (where Expr is not a reference type), you get the child's type after references and cv-qualifiers have been stripped from it.

template<typename Expr>
void test_result_of_child_c(Expr const &expr)
{
    typedef typename proto::result_of::child_c<Expr, 0>::type type;

    // Since Expr is not a reference type,
    // result_of::child_c<Expr, 0>::type is a
    // non-cv qualified, non-reference type:
    BOOST_MPL_ASSERT((
        boost::is_same< type, proto::terminal<int>::type >
    ));
}

// ...
proto::terminal<int>::type i = {42};
test_result_of_child_c( i + 2 );

However, if you ask for the type of the Nth child of Expr & or Expr const & (note the reference), the result type will be a reference, regardless of whether the child is actually stored by reference or not. If you need to know exactly how the child is stored in the node, whether by reference or by value, you can use fusion::result_of::value_at<Expr, N>::type. The following table summarizes the behavior of the proto::result_of::child_c<> metafunction.

Table 31.5. Accessing Child Types

Metafunction Invocation                           When the Child Is ...  The Result Is ...
------------------------------------------------  ---------------------  -----------------
proto::result_of::child_c<Expr, N>::type          T                      typename boost::remove_const<typename boost::remove_reference<T>::type>::type
proto::result_of::child_c<Expr &, N>::type        T                      typename boost::add_reference<T>::type
proto::result_of::child_c<Expr const &, N>::type  T                      typename boost::add_reference<typename boost::add_const<T>::type>::type
fusion::result_of::value_at<Expr, N>::type        T                      T


Common Shortcuts

Most operators in C++ are unary or binary, so accessing the only operand, or the left and right operands, are very common operations. For this reason, Proto provides the proto::child(), proto::left(), and proto::right() functions. proto::child() and proto::left() are synonymous with proto::child_c<0>(), and proto::right() is synonymous with proto::child_c<1>().

There are also proto::result_of::child<>, proto::result_of::left<>, and proto::result_of::right<> metafunctions that merely forward to their proto::result_of::child_c<> counterparts.

When you build an expression template with Proto, all the intermediate child nodes are held by reference. This avoids needless copies, which is crucial if you want your EDSL to perform well at runtime. Naturally, there is a danger if the temporary objects go out of scope before you try to evaluate your expression template. This is especially a problem in C++0x with the new decltype and auto keywords. Consider:

// OOPS: "ex" is left holding dangling references
auto ex = proto::lit(1) + 2;

The problem can happen in today's C++ also if you use BOOST_TYPEOF() or BOOST_AUTO(), or if you try to pass an expression template outside the scope of its constituents.

In these cases, you want to deep-copy your expression template so that all intermediate nodes and the terminals are held by value. That way, you can safely assign the expression template to a local variable or return it from a function without worrying about dangling references. You can do this with proto::deep_copy() as follows:

// OK, "ex" has no dangling references
auto ex = proto::deep_copy( proto::lit(1) + 2 );

If you are using Boost.Typeof, it would look like this:

// OK, use BOOST_AUTO() and proto::deep_copy() to
// store an expression template in a local variable 
BOOST_AUTO( ex, proto::deep_copy( proto::lit(1) + 2 ) );

For the above code to work, you must include the boost/proto/proto_typeof.hpp header. That header also defines the BOOST_PROTO_AUTO() macro, which automatically deep-copies its argument. With BOOST_PROTO_AUTO(), the above code can be written as:

// OK, BOOST_PROTO_AUTO() automatically deep-copies
// its argument: 
BOOST_PROTO_AUTO( ex, proto::lit(1) + 2 );

When deep-copying an expression tree, all intermediate nodes and all terminals are stored by value. The only exception is terminals that are function references, which are left alone.

Note

proto::deep_copy() makes no exception for arrays, which it stores by value. That can potentially cause a large amount of data to be copied.

Proto provides a utility for pretty-printing expression trees that comes in very handy when you're trying to debug your EDSL. It's called proto::display_expr(), and you pass it the expression to print and, optionally, a std::ostream to which to send the output. Consider:

// Use display_expr() to pretty-print an expression tree
proto::display_expr(
    proto::lit("hello") + 42
);

The above code writes this to std::cout:

plus(
    terminal(hello)
  , terminal(42)
)

In order to call proto::display_expr(), all the terminals in the expression must be Streamable (that is, they can be written to a std::ostream). In addition, the tag types must all be Streamable as well. Here is an example that includes a custom terminal type and a custom tag:

// A custom tag type that is Streamable
struct MyTag
{
    friend std::ostream &operator<<(std::ostream &s, MyTag)
    {
        return s << "MyTag";
    }
};

// Some other Streamable type
struct MyTerminal
{
    friend std::ostream &operator<<(std::ostream &s, MyTerminal)
    {
        return s << "MyTerminal";
    }
};

int main()
{
    // Display an expression tree that contains a custom
    // tag and a user-defined type in a terminal
    proto::display_expr(
        proto::make_expr<MyTag>(MyTerminal()) + 42
    );
}

The above code prints the following:

plus(
    MyTag(
        terminal(MyTerminal)
    )
  , terminal(42)
)

The following table lists the overloadable C++ operators, the Proto tag type for each, and the names of the metafunctions for generating the corresponding Proto expression types. And as we'll see later, the metafunctions are also usable as grammars for matching such nodes, as well as pass-through transforms.

Table 31.6. Operators, Tags and Metafunctions

Operator              Proto Tag                       Proto Metafunction
--------------------  ------------------------------  ---------------------------
unary +               proto::tag::unary_plus          proto::unary_plus<>
unary -               proto::tag::negate              proto::negate<>
unary *               proto::tag::dereference         proto::dereference<>
unary ~               proto::tag::complement          proto::complement<>
unary &               proto::tag::address_of          proto::address_of<>
unary !               proto::tag::logical_not         proto::logical_not<>
unary prefix ++       proto::tag::pre_inc             proto::pre_inc<>
unary prefix --       proto::tag::pre_dec             proto::pre_dec<>
unary postfix ++      proto::tag::post_inc            proto::post_inc<>
unary postfix --      proto::tag::post_dec            proto::post_dec<>
binary <<             proto::tag::shift_left          proto::shift_left<>
binary >>             proto::tag::shift_right         proto::shift_right<>
binary *              proto::tag::multiplies          proto::multiplies<>
binary /              proto::tag::divides             proto::divides<>
binary %              proto::tag::modulus             proto::modulus<>
binary +              proto::tag::plus                proto::plus<>
binary -              proto::tag::minus               proto::minus<>
binary <              proto::tag::less                proto::less<>
binary >              proto::tag::greater             proto::greater<>
binary <=             proto::tag::less_equal          proto::less_equal<>
binary >=             proto::tag::greater_equal       proto::greater_equal<>
binary ==             proto::tag::equal_to            proto::equal_to<>
binary !=             proto::tag::not_equal_to        proto::not_equal_to<>
binary ||             proto::tag::logical_or          proto::logical_or<>
binary &&             proto::tag::logical_and         proto::logical_and<>
binary &              proto::tag::bitwise_and         proto::bitwise_and<>
binary |              proto::tag::bitwise_or          proto::bitwise_or<>
binary ^              proto::tag::bitwise_xor         proto::bitwise_xor<>
binary ,              proto::tag::comma               proto::comma<>
binary ->*            proto::tag::mem_ptr             proto::mem_ptr<>
binary =              proto::tag::assign              proto::assign<>
binary <<=            proto::tag::shift_left_assign   proto::shift_left_assign<>
binary >>=            proto::tag::shift_right_assign  proto::shift_right_assign<>
binary *=             proto::tag::multiplies_assign   proto::multiplies_assign<>
binary /=             proto::tag::divides_assign      proto::divides_assign<>
binary %=             proto::tag::modulus_assign      proto::modulus_assign<>
binary +=             proto::tag::plus_assign         proto::plus_assign<>
binary -=             proto::tag::minus_assign        proto::minus_assign<>
binary &=             proto::tag::bitwise_and_assign  proto::bitwise_and_assign<>
binary |=             proto::tag::bitwise_or_assign   proto::bitwise_or_assign<>
binary ^=             proto::tag::bitwise_xor_assign  proto::bitwise_xor_assign<>
binary subscript      proto::tag::subscript           proto::subscript<>
ternary ?:            proto::tag::if_else_            proto::if_else_<>
n-ary function call   proto::tag::function            proto::function<>


Boost.Fusion is a library of iterators, algorithms, containers and adaptors for manipulating heterogeneous sequences. In essence, a Proto expression is just a heterogeneous sequence of its child expressions, and so Proto expressions are valid Fusion random-access sequences. That means you can apply Fusion algorithms to them, transform them, apply Fusion filters and views to them, and access their elements using fusion::at(). The things Fusion can do to heterogeneous sequences are beyond the scope of this users' guide, but below is a simple example. It takes a lazy function invocation like fun(1,2,3,4) and uses Fusion to print the function arguments in order.

struct display
{
    template<typename T>
    void operator()(T const &t) const
    {
        std::cout << t << std::endl;
    }
};

struct fun_t {};
proto::terminal<fun_t>::type const fun = {{}};

// ...
fusion::for_each(
    fusion::transform(
        // pop_front() removes the "fun" child
        fusion::pop_front(fun(1,2,3,4))
        // Extract the ints from the terminal nodes
      , proto::functional::value()
    )
  , display()
);

Recall from the Introduction that types in the proto::functional namespace define function objects that correspond to Proto's free functions. So proto::functional::value() creates a function object that is equivalent to the proto::value() function. The above invocation of fusion::for_each() displays the following:

1
2
3
4

Terminals are also valid Fusion sequences. They contain exactly one element: their value.

Flattening Proto Expression Trees

Imagine a slight variation of the above example where, instead of iterating over the arguments of a lazy function invocation, we would like to iterate over the terminals in an addition expression:

proto::terminal<int>::type const _1 = {1};

// ERROR: this doesn't work! Why?
fusion::for_each(
    fusion::transform(
        _1 + 2 + 3 + 4
      , proto::functional::value()
    )
  , display()
);

This doesn't work because the expression _1 + 2 + 3 + 4 does not describe a flat sequence of terminals; it describes a binary tree. We can treat it as a flat sequence of terminals, however, using Proto's proto::flatten() function. proto::flatten() returns a view which makes a tree appear as a flat Fusion sequence. If the top-most node has a tag type T, then the elements of the flattened sequence are the child nodes that do not have tag type T. This process is applied recursively. So the above can correctly be written as:

proto::terminal<int>::type const _1 = {1};

// OK, iterate over a flattened view
fusion::for_each(
    fusion::transform(
        proto::flatten(_1 + 2 + 3 + 4)
      , proto::functional::value()
    )
  , display()
);

The above invocation of fusion::for_each() displays the following:

1
2
3
4

Expression trees can have a very rich and complicated structure. Often, you need to know some things about an expression's structure before you can process it. This section describes the tools Proto provides for peering inside an expression tree and discovering its structure. And as you'll see in later sections, all the really interesting things you can do with Proto begin right here.

Imagine your EDSL is a miniature I/O facility, with iostream operations that execute lazily. You might want expressions representing input operations to be processed by one function, and output operations to be processed by a different function. How would you do that?

The answer is to write patterns (a.k.a. grammars) that match the structure of input and output expressions. Proto provides utilities for defining the grammars, and the proto::matches<> template for checking whether a given expression type matches the grammar.

First, let's define some terminals we can use in our lazy I/O expressions:

proto::terminal< std::istream & >::type cin_ = { std::cin };
proto::terminal< std::ostream & >::type cout_ = { std::cout };

Now, we can use cout_ instead of std::cout, and get I/O expression trees that we can execute later. To define grammars that match input and output expressions of the form cin_ >> i and cout_ << 1 we do this:

struct Input
  : proto::shift_right< proto::terminal< std::istream & >, proto::_ >
{};

struct Output
  : proto::shift_left< proto::terminal< std::ostream & >, proto::_ >
{};

We've seen the template proto::terminal<> before, but here we're using it without accessing the nested ::type. When used like this, it is a very simple grammar, as are proto::shift_right<> and proto::shift_left<>. The newcomer here is _ in the proto namespace. It is a wildcard that matches anything. The Input struct is a grammar that matches any right-shift expression that has a std::istream terminal as its left operand.

We can use these grammars together with the proto::matches<> template to query at compile time whether a given I/O expression type is an input or output operation. Consider the following:

template< typename Expr >
void input_output( Expr const & expr )
{
    if( proto::matches< Expr, Input >::value )
    {
        std::cout << "Input!\n";
    }

    if( proto::matches< Expr, Output >::value )
    {
        std::cout << "Output!\n";
    }
}

int main()
{
    int i = 0;
    input_output( cout_ << 1 );
    input_output( cin_ >> i );

    return 0;
}

This program prints the following:

Output!
Input!

If we wanted to break the input_output() function into two functions, one that handles input expressions and one that handles output expressions, we could use boost::enable_if<>, as follows:

template< typename Expr >
typename boost::enable_if< proto::matches< Expr, Input > >::type
input_output( Expr const & expr )
{
    std::cout << "Input!\n";
}

template< typename Expr >
typename boost::enable_if< proto::matches< Expr, Output > >::type
input_output( Expr const & expr )
{
    std::cout << "Output!\n";
}

This works as the previous version did. However, the following does not compile at all:

input_output( cout_ << 1 << 2 ); // oops!

What's wrong? The problem is that this expression does not match our grammar. The expression groups as if it were written like (cout_ << 1) << 2. It will not match the Output grammar, which expects the left operand to be a terminal, not another left-shift operation. We need to fix the grammar.

We notice that in order to verify an expression as input or output, we'll need to recurse down to the bottom-left-most leaf and check that it is a std::istream or std::ostream. When we get to the terminal, we must stop recursing. We can express this in our grammar using proto::or_<>. Here are the correct Input and Output grammars:

struct Input
  : proto::or_<
        proto::shift_right< proto::terminal< std::istream & >, proto::_ >
      , proto::shift_right< Input, proto::_ >
    >
{};

struct Output
  : proto::or_<
        proto::shift_left< proto::terminal< std::ostream & >, proto::_ >
      , proto::shift_left< Output, proto::_ >
    >
{};

This may look a little odd at first. We seem to be defining the Input and Output types in terms of themselves. This is perfectly OK, actually. At the point in the grammar that the Input and Output types are being used, they are incomplete, but by the time we actually evaluate the grammar with proto::matches<>, the types will be complete. These are recursive grammars, and rightly so because they must match a recursive data structure!

Matching an expression such as cout_ << 1 << 2 against the Output grammar proceeds as follows:

  1. The first alternate of the proto::or_<> is tried. It fails, because the expression cout_ << 1 << 2 does not match the grammar proto::shift_left< proto::terminal< std::ostream & >, proto::_ >.
  2. The second alternate is tried next. We match the expression against proto::shift_left< Output, proto::_ >. The expression is a left-shift, so we next try to match the operands.
  3. The right operand 2 matches proto::_ trivially.
  4. To see if the left operand cout_ << 1 matches Output, we must recursively evaluate the Output grammar. This time we succeed, because cout_ << 1 will match the first alternate of the proto::or_<>.

We're done -- the grammar matches successfully.

The terminals in an expression tree could be const or non-const references, or they might not be references at all. When writing grammars, you usually don't have to worry about it because proto::matches<> gives you a little wiggle room when matching terminals. A grammar such as proto::terminal<int> will match a terminal of type int, int &, or int const &.

You can explicitly specify that you want to match a reference type. If you do, the type must match exactly. For instance, a grammar such as proto::terminal<int &> will only match an int &. It will not match an int or an int const &.

The table below shows how Proto matches terminals. The simple rule is: if you want to match only reference types, you must specify the reference in your grammar. Otherwise, leave it off and Proto will ignore const and references.

Table 31.7. proto::matches<> and Reference / CV-Qualification of Terminals

Terminal    Grammar     Matches?
----------  ----------  --------
T           T           yes
T &         T           yes
T const &   T           yes
T           T &         no
T &         T &         yes
T const &   T &         no
T           T const &   no
T &         T const &   no
T const &   T const &   yes


This raises the question: What if you want to match an int, but not an int & or an int const &? For forcing exact matches, Proto provides the proto::exact<> template. For instance, proto::terminal< proto::exact<int> > would only match an int held by value.

Proto gives you extra wiggle room when matching array types. Array types match themselves or the pointer types they decay to. This is especially useful with character arrays. The type returned by proto::as_expr("hello") is proto::terminal<char const[6]>::type. That's a terminal containing a 6-element character array. Naturally, you can match this terminal with the grammar proto::terminal<char const[6]>, but the grammar proto::terminal<char const *> will match it as well, as the following code fragment illustrates.

struct CharString
  : proto::terminal< char const * >
{};

typedef proto::terminal< char const[6] >::type char_array;

BOOST_MPL_ASSERT(( proto::matches< char_array, CharString > ));

What if we only wanted CharString to match terminals of exactly the type char const *? You can use proto::exact<> here to turn off the fuzzy matching of terminals, as follows:

struct CharString
  : proto::terminal< proto::exact< char const * > >
{};

typedef proto::terminal<char const[6]>::type char_array;
typedef proto::terminal<char const *>::type  char_string;

BOOST_MPL_ASSERT(( proto::matches< char_string, CharString > ));
BOOST_MPL_ASSERT_NOT(( proto::matches< char_array, CharString > ));

Now, CharString does not match array types, only character string pointers.

The inverse problem is a little trickier: what if you wanted to match all character arrays, but not character pointers? As mentioned above, the expression as_expr("hello") has the type proto::terminal< char const[ 6 ] >::type. If you wanted to match character arrays of arbitrary size, you could use proto::N, which is an array-size wildcard. The following grammar would match any string literal: proto::terminal< char const[ proto::N ] >.

Sometimes you need even more wiggle room when matching terminals. For example, maybe you're building a calculator EDSL and you want to allow any terminals that are convertible to double. For that, Proto provides the proto::convertible_to<>