Traits: a new and useful template technique
by Nathan C. Myers
inventing some novel techniques, one of which is the unexpectedly useful traits -- it radically simplifies the interface to class templates instantiable on native C++ types.

Support for internationalization was one of the mandates to the ANSI/ISO C++ Standard Library working group at its inception. What this would mean wasn't clear at the time, and has only gradually become clear over the course of five years. One thing we have discovered it to mean is that any library facility which operates on characters must be parameterized on the character type, using templates.

Parameterizing existing iostream and string classes on the character type turned out to be unexpectedly difficult. It required inventing a new technique, which has since been found to be unexpectedly useful in a variety of applications.

The Problem

Let us begin with the problem: In the iostream library, the interface to streambuf (as in stdio before it) depends on a value of EOF which is distinct from all character values. In traditional libraries, therefore, the type of EOF was int, and the function that retrieves characters returned an int:

    class streambuf {
      ...
      int sgetc();              // return the next character, or EOF.
      int sgetn(char*, int N);  // get N characters.
    };

What happens when we parameterize streambuf on the character type? We need not only a type for the character, but also a type for the EOF value. Here's a start:

    template <class charT, class intT>
    class basic_streambuf {
      ...
      intT sgetc();
      int sgetn(charT*, int N);
    };

The extra template parameter clutters things up. Users of iostream don't care what the end-of-file mark is, or its type, and shouldn't need to care. Worse, what value should sgetc() return at end-of-file? Must this be another template parameter? The effort is getting out of hand.

The "Traits" Technique

This is where the new technique comes in. Instead of accreting parameters to our original template, we can define another template. Because the user never mentions it, its name can be long and descriptive.

    template <class charT>
    struct ios_char_traits { };

The default traits class template is empty; what can anyone say about an unknown character type? However, for real character types, we can specialize the template and provide useful semantics:

    struct ios_char_traits<char> {
      typedef char char_type;
      typedef int int_type;
      static inline int_type eof() { return EOF; }
    };

Notice that ios_char_traits<char> has no data members; it only provides public definitions. Now we can define our streambuf template:

    template <class charT>
    class basic_streambuf {
     public:
      typedef ios_char_traits<charT> traits_type;
      typedef traits_type::int_type int_type;
      int_type eof() { return traits_type::eof(); }
      ...
      int_type sgetc();
      int sgetn(charT*, int N);
    };

Except for the typedefs, this looks much like the previous declaration. But notice that it only has one template parameter, the one that interests users. The compiler looks up information about the character type in the character's traits class. Code that uses the new template looks the same as before, except that some variables are declared differently.

To put a new character type on a stream, we need only specialize ios_char_traits for the new type. For example, let's add support for wide characters:

    struct ios_char_traits<wchar_t> {
      typedef wchar_t char_type;
      typedef wint_t int_type;
      static inline int_type eof() { return WEOF; }
    };

Strings may be generalized exactly the same way.

This technique turns out to be useful anywhere that a template must be applied to native types, or to any type for which you cannot add members as required for the template's operations.
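
To make the pieces above concrete, here is a minimal sketch that should compile with a current compiler. It is an illustration only, not the library's implementation: buffer_reader is a hypothetical stand-in for basic_streambuf that reads from a fixed array, and next() is an invented name. Two details differ from the listings above because the language has since changed: each specialization is introduced with "template <>", and the dependent member type is prefixed with "typename".

```cpp
#include <stdio.h>   // EOF
#include <wchar.h>   // wint_t, WEOF

// Primary template: deliberately empty, as in the article.
template <class charT>
struct ios_char_traits {};

// Specialization for char.
template <>
struct ios_char_traits<char> {
    typedef char char_type;
    typedef int  int_type;
    static int_type eof() { return EOF; }
};

// Specialization for wchar_t.
template <>
struct ios_char_traits<wchar_t> {
    typedef wchar_t char_type;
    typedef wint_t  int_type;
    static int_type eof() { return WEOF; }
};

// Hypothetical stand-in for basic_streambuf: reads characters from a
// fixed array and reports end-of-input with the traits' eof() value.
// Assumes plain ASCII input; a production version would convert char
// through unsigned char so that no character value collides with EOF.
template <class charT>
class buffer_reader {
public:
    typedef ios_char_traits<charT> traits_type;
    typedef typename traits_type::int_type int_type;

    buffer_reader(const charT* data, int size)
        : data_(data), size_(size), pos_(0) {}

    static int_type eof() { return traits_type::eof(); }

    // Return the next character and advance, or eof() when exhausted.
    int_type next() {
        if (pos_ == size_) return eof();
        return static_cast<int_type>(data_[pos_++]);
    }

private:
    const charT* data_;
    int size_;
    int pos_;
};
```

The class has a single template parameter, yet buffer_reader<char>::next() returns int and buffer_reader<wchar_t>::next() returns wint_t; the caller never names either type.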

Another Example

Before elaborating on the technique, let us see how it might be applied elsewhere. This example is drawn from the ANSI/ISO C++ [Draft] Standard [although what ended up in the standard looks different].

First, imagine writing a numerical analysis library that should work on the float, double, and long double numeric types. Each type has a maximum exponent value, an "epsilon", a mantissa size, and so on. These parameters are all defined in the standard header file <float.h>, but a template parameterized on the numeric type doesn't know whether to refer to FLT_MAX_EXP or DBL_MAX_EXP. A traits template with specializations solves the problem cleanly:

    template <class numT>
    struct float_traits { };

    struct float_traits<float> {
      typedef float float_type;
      enum { max_exponent = FLT_MAX_EXP };
      static inline float_type epsilon() { return FLT_EPSILON; }
      ...
    };

    struct float_traits<double> {
      typedef double float_type;
      enum { max_exponent = DBL_MAX_EXP };
      static inline float_type epsilon() { return DBL_EPSILON; }
      ...
    };

Now we can refer to "max_exponent" without knowing whether it is for a float, a double, or your own class type. Here's a matrix template, for instance:

    template <class numT>
    class matrix {
     public:
      typedef numT num_type;
      typedef float_traits<num_type> traits_type;
      inline num_type epsilon() { return traits_type::epsilon(); }
      ...
    };

Notice that in all the examples thus far, each template provided public typedefs of its parameters, and also anything that depended on them. This is no accident: in a wide variety of situations, the parameters used to instantiate a template are not available, and can only be retrieved if provided as typedefs in the template declaration. The moral: always provide these typedefs.
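
The point about public typedefs can be shown with a short sketch. It repeats float_traits and matrix in a form current compilers accept (with "template <>" on the specializations) and adds a hypothetical helper, tolerance(), which is not part of the article. The helper receives only some matrix-like type M, so the typedefs are its only way to recover the numeric type and its traits.

```cpp
#include <float.h>   // FLT_EPSILON, DBL_EPSILON, FLT_MAX_EXP, DBL_MAX_EXP

template <class numT>
struct float_traits {};

template <>
struct float_traits<float> {
    typedef float float_type;
    enum { max_exponent = FLT_MAX_EXP };
    static float_type epsilon() { return FLT_EPSILON; }
};

template <>
struct float_traits<double> {
    typedef double float_type;
    enum { max_exponent = DBL_MAX_EXP };
    static float_type epsilon() { return DBL_EPSILON; }
};

template <class numT>
class matrix {
public:
    typedef numT num_type;                        // the template parameter
    typedef float_traits<num_type> traits_type;   // what depends on it
    static num_type epsilon() { return traits_type::epsilon(); }
};

// Hypothetical helper: knows nothing about M except its public typedefs.
// Returns the traits' epsilon multiplied by an integer scale factor.
template <class M>
typename M::num_type tolerance(const M&, int scale) {
    typedef typename M::num_type num_type;
    return M::traits_type::epsilon() * static_cast<num_type>(scale);
}
```

Had matrix kept numT private to its declaration, tolerance() would need a second template parameter for the numeric type, and every caller would have to supply a consistent one.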

Default Template Parameters

The examples above are about as far as we can go with 1993-vintage compilers. However, a minor extension approved at the meeting in November 1993, and already implemented in recent compiler releases from some vendors, allows us to go much further. The extension simply allows default parameters to templates. Some compilers have long supported numeric default template parameters.

The syntax is obvious; the power it provides may not be. Here is an example drawn from Stroustrup's Design and Evolution of C++ (page 359). First, we assume a traits-like template CMP:

    template <class T>
    class CMP {
     public:  // eq() and lt() must be accessible to users of CMP
      static bool eq(T a, T b) { return a == b; }
      static bool lt(T a, T b) { return a < b; }
    };

and an ordinary string template:

    template <class charT> class basic_string;

Now we can define a compare() function on such strings:

    template <class charT, class C = CMP<charT> >
    int compare(const basic_string<charT>&, const basic_string<charT>&);

I have omitted implementation details here, because I want to draw your attention to the parameters to compare<>(). First, notice that the second parameter, C, defaults not just to a class, but to an instantiated template class. Second, notice that the parameter to that template is the previous parameter! This would not be allowed in a function declaration, but it is explicitly legal for template parameters.

This allows us to call compare() on two strings using the default definitions of eq() and lt(), or to substitute our own definitions (such as a case-insensitive comparison). We can do the same thing with our streambuf template:

    template <class charT, class traits = ios_char_traits<charT> >
    class basic_streambuf {
     public:
      typedef traits traits_type;
      typedef traits_type::int_type int_type;
      int_type eof() { return traits_type::eof(); }
      ...
      int_type sgetc();
      int sgetn(charT*, int N);
    };

This allows us to substitute different traits for a particular character type -- which may be important, if (for instance) the end-of-file mark value must be different for a different character set mapping.

Note: the Standard does not actually provide the traits-taking basic_streambuf constructor shown in the next section.
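
The compare() declaration discussed above can be fleshed out as follows. This is a sketch under stated assumptions, not Stroustrup's or the Standard's code: it uses std::basic_string in place of the declared-only basic_string, the loop body is one plausible implementation of the omitted details, and NOCASE is a hypothetical case-insensitive policy for char. Default template arguments on a function template need C++11 or later.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Default comparison policy, as in the article.
template <class T>
struct CMP {
    static bool eq(T a, T b) { return a == b; }
    static bool lt(T a, T b) { return a < b; }
};

// Hypothetical replacement policy: compares chars ignoring case.
struct NOCASE {
    static char fold(char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
};

// Returns a negative value, zero, or a positive value, like strcmp().
// The second template parameter defaults to CMP<charT>.
template <class charT, class C = CMP<charT> >
int compare(const std::basic_string<charT>& a,
            const std::basic_string<charT>& b) {
    std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (!C::eq(a[i], b[i])) return C::lt(a[i], b[i]) ? -1 : 1;
    }
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;   // the shorter string sorts first
}
```

A call such as compare(a, b) deduces charT and picks up CMP<char> silently, while compare<char, NOCASE>(a, b) swaps in the other policy; the function body is the same in both cases.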

Runtime-variable Traits

We can generalize even further. We haven't seen the constructor for basic_streambuf yet:

    template <class charT, class traits = ios_char_traits<charT> >
    class basic_streambuf {
      traits traits_;  // member data
      ...
     public:
      basic_streambuf(const traits& b = traits()) : traits_(b) { ... }
      int_type eof() { return traits_.eof(); }
    };

By adding a default constructor parameter, we can use a traits template parameter that may vary not only at compile time, but at runtime. In this case, the call to "traits_.eof()" may call a static member function of traits, or a regular member function. A nonstatic member function can use values passed in from the constructor and saved. [This technique appears in the Draft in the use of allocator parameters to standard containers.]

Notice that nothing has become harder to use, because the defaults result in traditional behavior; but when you need greater flexibility, you can have it. In every case you get optimal code -- the extra flexibility costs nothing at runtime unless it's used.

Summary

The traits technique is useful immediately, on any compiler that supports templates. It provides a convenient way to associate related types, values, and functions with a template parameter type without requiring that they be defined as members of the type. A simple language extension dramatically (and upward-compatibly) extends the technique to allow greater flexibility, even at runtime, at no cost in convenience or efficiency.
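
As an illustration of the runtime flexibility mentioned above, the following sketch shows one traits class whose eof() is static and another whose eof() depends on a value stored at construction. The names basic_streambuf_sketch and configurable_eof_traits are hypothetical, chosen to avoid suggesting that the Standard's basic_streambuf has such a constructor (it does not).

```cpp
#include <stdio.h>   // EOF

template <class charT>
struct ios_char_traits {};

// Compile-time traits: eof() is a static member function.
template <>
struct ios_char_traits<char> {
    typedef int int_type;
    static int_type eof() { return EOF; }
};

// Hypothetical runtime traits: the end-of-file mark is chosen when the
// traits object is constructed, so eof() is a nonstatic member function.
class configurable_eof_traits {
public:
    typedef int int_type;
    explicit configurable_eof_traits(int_type mark) : mark_(mark) {}
    int_type eof() const { return mark_; }
private:
    int_type mark_;
};

template <class charT, class traits = ios_char_traits<charT> >
class basic_streambuf_sketch {
public:
    typedef traits traits_type;
    typedef typename traits_type::int_type int_type;

    // The default argument is only instantiated when a caller omits the
    // argument, so traits types without a default constructor still work.
    basic_streambuf_sketch(const traits& t = traits()) : traits_(t) {}

    // Calls a static or a nonstatic member, whichever traits provides.
    int_type eof() const { return traits_.eof(); }

private:
    traits traits_;
};
```

Declaring basic_streambuf_sketch<char> with no arguments gives the traditional behavior. Passing a configurable_eof_traits object selects the end-of-file mark at runtime through the same eof() call.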

Post-publication notes:

I have been asked for a short definition of a traits class.

I have also been asked about the origin of the name: The original name was baggage, which I still prefer.

Now that compilers support default template arguments, and people are using the technique, they are finding that the syntax used in the examples above often doesn't work. This is a result of a discovery by compiler implementers that parsing mentions of member typedefs, as above, is impossible to do efficiently; so the language was changed. Now, to mention member types of templates we must say "typename" in front of the type, for example:

    template <class charT, class traits = ios_char_traits<charT> >
    class basic_streambuf {
     public:
      typedef traits traits_type;
      typedef typename traits_type::int_type int_type;
      int_type eof() { return traits_type::eof(); }
      ...
      int_type sgetc();
      int sgetn(charT*, int N);
    };

If your compiler supports the typename keyword, you should have no more trouble with member type names in traits arguments.