Notes from reading C++ Primer: template programming examples in the string source code

Source: Internet. Editor: 程序博客网. Time: 2024/06/14 07:05

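The first hurdle when reading the string source code is the much-talked-about template programming. After looking into it for a while, though, I found that if the goal is simply to read the code, the template tricks involved are quite understandable. This post walks through a few techniques used in the string-related sources, and I hope that explaining what appears on the surface also sheds some light on the compile-time polymorphism behind it.
Every example here can be found, more or less verbatim, in the library source, so none of this is made up.
PS: one immediate benefit of knowing a little template programming is that the wall of incomprehensible errors g++ sometimes prints starts to make at least some sense.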

Let's start with examples from /usr/include/c++/4.8/bits/cpp_type_traits.h on my Ubuntu machine.

Example 1: __are_same, a comparison of two template parameters
template<typename, typename>
    struct __are_same
    {
      enum { __value = 0 };
      typedef __false_type __type;
    };

template<typename _Tp>
    struct __are_same<_Tp, _Tp>
    {
      enum { __value = 1 };
      typedef __true_type __type;
    };
Usage example:
    std::cout<<std::__are_same<int,int>::__value<<std::endl;
    std::cout<<std::__are_same<int,long long>::__value<<std::endl;
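This prints 1 and 0.
The second template is a partial specialization of the first; when both arguments are the same type it is the better match and is chosen over the primary template.
C++ also provides a public version, std::is_same (declared in <type_traits> since C++11):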
    std::cout<<std::is_same<int,int>::value<<std::endl;
    std::cout<<std::is_same<int,long long>::value<<std::endl;
The output is the same.
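To make the matching rule concrete, here is a minimal sketch that rebuilds the same pattern under a made-up name, my_are_same (hypothetical, not part of libstdc++), and checks it at compile time. It also shows that cv-qualifiers count as part of the type.

```cpp
#include <type_traits>

// Hypothetical re-creation of the pattern; not the libstdc++ source.
template<typename, typename>
struct my_are_same { enum { value = 0 }; };        // primary template: the types differ

template<typename T>
struct my_are_same<T, T> { enum { value = 1 }; };  // partial specialization: same type twice

static_assert(my_are_same<int, int>::value == 1,
              "identical types select the partial specialization");
static_assert(my_are_same<int, long long>::value == 0,
              "different types fall back to the primary template");

// cv-qualifiers are part of the type, so int and const int do not match.
static_assert(my_are_same<int, const int>::value == 0, "const int is not int");
static_assert(!std::is_same<int, const int>::value, "std::is_same agrees");
```

If any of these assertions were wrong, the file would simply fail to compile, which is the whole point of doing the comparison at compile time.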
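So what is this good for? Here is an example.
Suppose we need a mycopy function that copies a C array arr2 into arr1, where the element types of the two arrays may differ. It could be written like this (memcpy needs <cstring>):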
template<typename T1,typename T2>
void mycopy(T1 *arr1,T2 *arr2,size_t len){
    if(std::is_same<T1,T2>::value){
        std::cout<<"using memcpy"<<std::endl;
        memcpy(arr1,arr2,len * sizeof(T1));
    }else{
        std::cout<<"using index"<<std::endl;
        while(len-- > 0)
            arr1[len] = arr2[len];
    }
}
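When the types are identical (including low-level const and volatile qualifiers) it uses memcpy; otherwise it assigns element by element, counting the index down. Note that the memcpy shortcut is only safe for trivially copyable element types.
Let's verify: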
    int arr1[] = {1,2,3,4,5};
    int arr2[] = {2,3,4,5,6};
    short arr3[] = {2,3,4,5,6};
    std::for_each(arr1,arr1 + 5,[](int i){std::cout<<i<<std::endl;});
    mycopy(arr1,arr2,5);
    std::for_each(arr1,arr1 + 5,[](int i){std::cout<<i<<std::endl;});
    mycopy(arr1,arr3,5);
    std::for_each(arr1,arr1 + 5,[](int i){std::cout<<i<<std::endl;});
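The result is as expected.
A few points worth discussing further:
1. Normally mycopy should instead be written by providing a more specialized overload, so that no branch of dead code (the if or the else) is ever generated. Here is a one-sentence definition of "more specialized": “A template X is more specialized than a template Y if every argument list that matches the one specified by X also matches the one specified by Y, but not the other way around”.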
template<typename T1,typename T2>
void mycopy(T1 *arr1,T2 *arr2,size_t len){
    std::cout<<"using index"<<std::endl;
    while(len-- > 0)
        arr1[len] = arr2[len];
}
template<typename T>
void mycopy(T *arr1,T *arr2,size_t len){
    std::cout<<"using memcpy"<<std::endl;
    memcpy(arr1,arr2,len * sizeof(T));
}
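2. That does not make is_same useless here. If the two versions of mycopy share most of their logic, perhaps differing only in whether one special log line is printed, then a more specialized overload means a lot of duplicated code, which hurts maintainability and readability. In that case is_same is the better tool.
In short, __are_same and is_same make it possible for a function template to branch on whether its template arguments are the same type.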
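The traits above also carry a nested type (__type in libstdc++, std::true_type or std::false_type in the public traits), which the examples so far never use. One common use is tag dispatch: pass an object of that type as an extra argument and let overload resolution pick the branch. The sketch below assumes hypothetical names copy_impl, mycopy_tagged and mycopy_tagged_demo; the functions return a label only so the chosen path can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Same element type: a raw byte copy (only valid for trivially copyable types).
template<typename T>
const char* copy_impl(T* dst, const T* src, std::size_t len, std::true_type) {
    std::memcpy(dst, src, len * sizeof(T));
    return "memcpy";
}

// Different element types: convert element by element.
template<typename T1, typename T2>
const char* copy_impl(T1* dst, const T2* src, std::size_t len, std::false_type) {
    while (len-- > 0)
        dst[len] = src[len];
    return "index";
}

// The tag object std::is_same<T1, T2>::type() is either std::true_type or
// std::false_type, so exactly one copy_impl overload is viable.
template<typename T1, typename T2>
const char* mycopy_tagged(T1* dst, const T2* src, std::size_t len) {
    return copy_impl(dst, src, len, typename std::is_same<T1, T2>::type());
}

// Small usage example.
void mycopy_tagged_demo() {
    int   dst[3]    = {0, 0, 0};
    int   same[3]   = {1, 2, 3};
    short narrow[3] = {7, 8, 9};

    const char* first = mycopy_tagged(dst, same, 3);
    assert(std::strcmp(first, "memcpy") == 0 && dst[2] == 3);

    const char* second = mycopy_tagged(dst, narrow, 3);
    assert(std::strcmp(second, "index") == 0 && dst[0] == 7);
}
```

Compared with the if/else version, only the selected branch is instantiated for a given pair of types, while the shared logic can still live in one place in mycopy_tagged.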

Example 2: the single-parameter class template __is_integer
template<typename _Tp>
    struct __is_integer
    {
      enum { __value = 0 };
      typedef __false_type __type;
    };
template<>
    struct __is_integer<bool>
    {
      enum { __value = 1 };
      typedef __true_type __type;
    };
template<>
    struct __is_integer<short>
    {
      enum { __value = 1 };
      typedef __true_type __type;
    };
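   ........ (about twenty more specializations omitted here.) The idea is that __value is 0 for every type except the ones specialized.
   And one that detects floating-point types: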
template<typename _Tp>
    struct __is_floating
    {
      enum { __value = 0 };
      typedef __false_type __type;
    };
template<>
    struct __is_floating<float>
    {
      enum { __value = 1 };
      typedef __true_type __type;
    };

template<>
    struct __is_floating<double>
    {
      enum { __value = 1 };
      typedef __true_type __type;
    };
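   ...... (the remaining specialization, for long double, is omitted.) Again, __value is 0 for everything except the specialized floating-point types such as float and double.
   Suppose we want a function template myplus that adds two values: for integers it simply returns the sum, for floating-point numbers it first prints a warning because precision may be lost after the decimal point, and for any other type it must fail to compile. One possible implementation follows.
   First we need a helper defined in /usr/include/c++/4.8/ext/type_traits.h: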
template<bool, typename>
    struct __enable_if
    { };

template<typename _Tp>
    struct __enable_if<true, _Tp>
    { typedef _Tp __type; };
   The code:
template<typename T>
T myplus(T lhs,T rhs){
      typedef typename __gnu_cxx::__enable_if<std::__is_floating<T>::__value || std::__is_integer<T>::__value,T>::__type temp_type;
      if(std::__is_floating<T>::__value)
          std::cout<<"warning: adding two floating num may result in jingdusunshi\n";
      return lhs + rhs;
}
   Test code (i and j are int variables), with the output shown under each call:
   std::cout<<myplus(i,j)<<std::endl;
    2
   std::cout<<myplus(1.0,2.0)<<std::endl;
    warning: adding two floating num may result in jingdusunshi
    3
   std::cout<<myplus(std::string("haha"),std::string("xixi"))<<std::endl;
   test.cpp:49:117: error: no type named ‘__type’ in ‘struct __gnu_cxx::__enable_if<false, std::basic_string<char> >’
   typedef typename __gnu_cxx::__enable_if<std::__is_floating<T>::__value || std::__is_integer<T>::__value,T>::__type temp_type;
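   This is as expected. Let's look at what happened.
   First, deducing T is straightforward: an int, a double, and a std::string (aka std::basic_string<char, std::char_traits<char>, std::allocator<char>>).
   Next comes the logical OR std::__is_floating<T>::__value || std::__is_integer<T>::__value. By the definitions of __is_floating and __is_integer, the enumerator __value is nonzero only in the specializations for floating-point and integer types. So the expression is true for exactly those types and false for everything else.
   Then, by the definition of __enable_if, when the bool non-type template argument is false, __type is not defined; when it is true, __type is defined. (The name temp_type is arbitrary, since the typedef is never used.)
   So for std::string, which is neither an integer nor a floating-point type, the statement typedef typename __gnu_cxx::__enable_if<std::__is_floating<T>::__value || std::__is_integer<T>::__value,T>::__type temp_type is an error, because __enable_if<false, T> has no __type typedef. That is how the set of types the template can be instantiated with is controlled.
   In short, the STL source contains a long list of specialized __is_XXX class templates, each defining an enumerator __value of 0 or 1 together with a __type. Besides steering program branches the way __are_same does, they can restrict the types with which a user may instantiate a class or function template, which keeps users from unknowingly misusing it.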
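The same restriction can be expressed with the public counterparts std::enable_if, std::is_integral and std::is_floating_point from <type_traits>. The sketch below is an assumption-laden variant, not the code discussed above: the name myplus_checked is hypothetical, the warning for floating-point operands is left out, and constexpr is added only so the results can be checked at compile time. The constraint sits in the return type, so an unsupported type removes the template from overload resolution instead of failing inside the function body.

```cpp
#include <type_traits>

// Hypothetical variant of myplus: usable only for integer and floating-point T.
template<typename T>
constexpr
typename std::enable_if<std::is_integral<T>::value ||
                        std::is_floating_point<T>::value, T>::type
myplus_checked(T lhs, T rhs) {
    return lhs + rhs;
}

static_assert(myplus_checked(1, 1) == 2, "int is accepted");
static_assert(myplus_checked(1.0, 2.0) == 3.0, "double is accepted");

// myplus_checked(std::string("haha"), std::string("xixi")) is rejected at
// compile time: enable_if<false, T> has no nested type, so no overload matches.
```

With this form the compiler reports the failure at the call site rather than at a typedef inside the template.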

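Example 3: __iterator_traits
   It is defined in /usr/include/c++/4.8/bits/stl_iterator_base_types.h. This one is a bit convoluted, so bear with me.
   First comes __has_iterator_category, which is generated by the macro invocation _GLIBCXX_HAS_NESTED_TYPE(iterator_category). The definition of _GLIBCXX_HAS_NESTED_TYPE: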
#define _GLIBCXX_HAS_NESTED_TYPE(_NTYPE)                         \
template<typename _Tp>                                         \
    class __has_##_NTYPE##_helper                                \
    : __sfinae_types                                             \
    {                                                            \
      template<typename _Up>                                     \
        struct _Wrap_type                                        \
    { };                                                     \
                                                                 \
      template<typename _Up>                                     \
        static __one __test(_Wrap_type<typename _Up::_NTYPE>*); \
                                                                 \
      template<typename _Up>                                     \
        static __two __test(...);                                \
                                                                 \
    public:                                                      \
      static constexpr bool value = sizeof(__test<_Tp>(0)) == 1; \
    };                                                           \
                                                                 \
template<typename _Tp>                                         \
    struct __has_##_NTYPE                                        \
    : integral_constant<bool, __has_##_NTYPE##_helper            \
            <typename remove_cv<_Tp>::type>::value> \
    { };

_GLIBCXX_END_NAMESPACE_VERSION
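   At first sight this looks rather messy, so let's go through it step by step.
   Typical use of __has_iterator_category: __has_iterator_category<_Iterator>::value; // true if _Iterator is a class that defines a nested iterator_category type, false otherwise.
   After macro expansion and template instantiation: struct __has_iterator_category<_Iterator>:integral_constant<bool, __has_iterator_category_helper<typename remove_cv<_Iterator>::type>::value>{};
   remove_cv, without the __ prefix, is part of the public library interface; cv stands for const and volatile. To keep the discussion simple, assume our _Iterator type is unqualified.
   struct __has_iterator_category<_Iterator>:integral_constant<bool, __has_iterator_category_helper<_Iterator>::value>{}; // simplified, with remove_cv applied
   __has_iterator_category_helper<_Iterator> is the class generated by the macro above. It inherits from __sfinae_types: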
struct __sfinae_types
{
    typedef char __one;
    typedef struct { char __arr[2]; } __two;
};
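   Don't be put off by the odd name SFINAE ("substitution failure is not an error"). All you need to remember is that sizeof(__sfinae_types::__one) == 1 and sizeof(__sfinae_types::__two) != 1. (SFINAE has many other uses; explore further if you are interested.)
   __has_iterator_category_helper<_Iterator>::value equals (sizeof(__test<_Tp>(0)) == 1). __test<_Iterator>(0) has two candidates, which are function overloads rather than template specializations. In overload resolution a match through the ellipsis (...) ranks below every other conversion, so whenever the non-variadic version is viable the variadic one is never chosen. When _Iterator has no nested iterator_category, substituting into the first declaration fails; that failure is not an error, the overload is simply discarded, and only the variadic version remains.
   __one __test(_Wrap_type<typename _Iterator::iterator_category>*); As long as this function is valid, that is, as long as _Iterator defines iterator_category => the function is valid => the return type is __one => sizeof(__one) is 1 => value is true. Otherwise value is false.
   So __has_iterator_category_helper<_Iterator> reads as: value is true if _Iterator defines a nested iterator_category, and false otherwise.
   The original struct then simplifies once more to struct __has_iterator_category<_Iterator>:integral_constant<bool, true or false>{}. integral_constant does very little: bool becomes value_type, and true or false becomes value, which is the member that is ultimately used.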
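The detection idiom is easier to follow once the macro is written out by hand. The following is a minimal sketch in the same style, using hypothetical names (has_iterator_category_sketch, fake_iter, no_category); it is not the libstdc++ code itself.

```cpp
#include <iterator>

// Hypothetical detector in the style of _GLIBCXX_HAS_NESTED_TYPE.
template<typename T>
class has_iterator_category_sketch {
    typedef char one;                      // sizeof(one) == 1
    typedef struct { char arr[2]; } two;   // sizeof(two) == 2

    template<typename U>
    struct wrap { };

    // Viable only if U::iterator_category names a type. Otherwise the
    // substitution fails and this overload is silently discarded (SFINAE).
    template<typename U>
    static one test(wrap<typename U::iterator_category>*);

    // Fallback: the ellipsis accepts anything but ranks last in overload resolution.
    template<typename U>
    static two test(...);

public:
    // sizeof is an unevaluated context, so test never needs a definition.
    static constexpr bool value = sizeof(test<T>(0)) == sizeof(one);
};

// Two made-up types for the check.
struct fake_iter { typedef std::random_access_iterator_tag iterator_category; };
struct no_category { };

static_assert(has_iterator_category_sketch<fake_iter>::value, "nested typedef found");
static_assert(!has_iterator_category_sketch<no_category>::value, "no nested typedef");
static_assert(!has_iterator_category_sketch<int*>::value, "pointers have no members");
```

Note that a raw pointer yields false here, which is exactly why iterator_traits needs separate specializations for pointer types.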
   Now look at the definition of iterator_traits:
template<typename _Iterator,
       bool = __has_iterator_category<_Iterator>::value>
    struct __iterator_traits { };

template<typename _Iterator>
    struct __iterator_traits<_Iterator, true>
    {
      typedef typename _Iterator::iterator_category iterator_category;
      typedef typename _Iterator::value_type        value_type;
      typedef typename _Iterator::difference_type   difference_type;
      typedef typename _Iterator::pointer           pointer;
      typedef typename _Iterator::reference         reference;
    };
template<typename _Tp>
    struct iterator_traits<_Tp*>
    {
      typedef random_access_iterator_tag iterator_category;
      typedef _Tp                         value_type;
      typedef ptrdiff_t                   difference_type;
      typedef _Tp*                        pointer;
      typedef _Tp&                        reference;
    };
template<typename _Tp>
    struct iterator_traits<const _Tp*>
    {
      typedef random_access_iterator_tag iterator_category;
      typedef _Tp                         value_type;
      typedef ptrdiff_t                   difference_type;
      typedef const _Tp*                  pointer;
      typedef const _Tp&                  reference;
    };
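   The primary template iterator_traits<_Iterator>, not shown above, simply derives from __iterator_traits<_Iterator>. So when we write iterator_traits<char *>, iterator_traits<const char *>, iterator_traits<std::iterator<std::random_access_iterator_tag,char>>, iterator_traits<std::vector<int>::const_iterator> and iterator_traits<char>, they resolve to the third, fourth, second, second and first templates above, respectively.
   With char * as the template argument, the most specialized match is the third template, with _Tp being char. iterator_category is random_access_iterator_tag, meaning random access that supports operations such as operator+(int); this tag can be used for function overloading. value_type is naturally char, and the const _Tp* version shows that value_type is the "plain" type with const stripped. Together with difference_type, pointer and reference, these five nested typedefs give the bare char * and const char * the associated types they otherwise lack.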
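These claims about the pointer specializations can be checked directly against the public std::iterator_traits. The sketch below only uses standard names; the two typedef aliases are introduced here for brevity.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

typedef std::iterator_traits<char*>       mutable_ptr_traits;  // alias for brevity
typedef std::iterator_traits<const char*> const_ptr_traits;    // alias for brevity

// Pointers are treated as random-access iterators.
static_assert(std::is_same<mutable_ptr_traits::iterator_category,
                           std::random_access_iterator_tag>::value, "category");
static_assert(std::is_same<mutable_ptr_traits::difference_type,
                           std::ptrdiff_t>::value, "difference_type");

// value_type drops the const; pointer and reference keep it.
static_assert(std::is_same<mutable_ptr_traits::value_type, char>::value, "value_type");
static_assert(std::is_same<const_ptr_traits::value_type, char>::value, "const stripped");
static_assert(std::is_same<const_ptr_traits::pointer, const char*>::value, "pointer");
static_assert(std::is_same<const_ptr_traits::reference, const char&>::value, "reference");
```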
   For std::iterator and std::vector<int>::const_iterator, however, look at the definition first:
template<typename _Category, typename _Tp, typename _Distance = ptrdiff_t,
           typename _Pointer = _Tp*, typename _Reference = _Tp&>
    struct iterator
    {
      /// One of the @link iterator_tags tag types@endlink.
      typedef _Category iterator_category;
      /// The type "pointed to" by the iterator.
      typedef _Tp        value_type;
      /// Distance between iterators is represented as this type.
      typedef _Distance difference_type;
      /// This type represents a pointer-to-value_type.
      typedef _Pointer   pointer;
      /// This type represents a reference-to-value_type.
      typedef _Reference reference;
    };
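   Hmm, std::iterator<std::random_access_iterator_tag,char> already has these five typedefs, so isn't repeating them redundant?
   It is not redundant; in fact this is the key to the generic algorithms. Take the most basic one, std::advance: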
template<typename _InputIterator, typename _Distance>
    inline void
    advance(_InputIterator& __i, _Distance __n)
    {
      typename iterator_traits<_InputIterator>::difference_type __d = __n;
      std::__advance(__i, __d, std::__iterator_category(__i));
    }
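   The first step converts _Distance to difference_type. What difference does that make? It lets each iterator type decide how distances are represented. For a forward-only iterator such as that of a singly linked list, a negative distance passed as an int clearly makes no sense, so one could imagine making difference_type unsigned, although that is not good error handling (and the standard expects difference_type to be a signed integer type).
   After that the call is effectively std::__advance(__i,__d,iterator_traits<_InputIterator>::iterator_category()); as mentioned earlier, iterator_category is used for function overloading: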
template<typename _InputIterator, typename _Distance>
    inline void
    __advance(_InputIterator& __i, _Distance __n, input_iterator_tag)
    {
      _GLIBCXX_DEBUG_ASSERT(__n >= 0);
      while (__n--)
        ++__i;
    }
template<typename _BidirectionalIterator, typename _Distance>
    inline void
    __advance(_BidirectionalIterator& __i, _Distance __n,
          bidirectional_iterator_tag)
    {
      if (__n > 0)
        while (__n--)
          ++__i;
      else
        while (__n++)
          --__i;
    }
template<typename _RandomAccessIterator, typename _Distance>
    inline void
    __advance(_RandomAccessIterator& __i, _Distance __n,
              random_access_iterator_tag)
    {
      __i += __n;
    }
   Iterators with different capabilities overload different operators, so the call resolves to a different __advance.
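The dispatch can be reproduced outside the library in a few lines. This is a minimal sketch under assumed names (my_advance, my_advance_impl, my_advance_demo are hypothetical); each overload returns a label purely so that the selected version can be observed without a debugger.

```cpp
#include <cassert>
#include <cstring>
#include <forward_list>
#include <iterator>
#include <list>

template<typename It, typename Dist>
const char* my_advance_impl(It& it, Dist n, std::input_iterator_tag) {
    while (n--)            // forward only; n is assumed to be non-negative
        ++it;
    return "input";
}

template<typename It, typename Dist>
const char* my_advance_impl(It& it, Dist n, std::bidirectional_iterator_tag) {
    if (n > 0)
        while (n--) ++it;
    else
        while (n++) --it;
    return "bidirectional";
}

template<typename It, typename Dist>
const char* my_advance_impl(It& it, Dist n, std::random_access_iterator_tag) {
    it += n;               // one jump instead of n steps
    return "random_access";
}

// Entry point: looks up the category through iterator_traits and dispatches on it.
template<typename It, typename Dist>
const char* my_advance(It& it, Dist n) {
    typename std::iterator_traits<It>::difference_type d = n;
    return my_advance_impl(it, d,
        typename std::iterator_traits<It>::iterator_category());
}

// Small usage example.
void my_advance_demo() {
    int arr[5] = {0, 1, 2, 3, 4};
    int* p = arr;
    const char* kind = my_advance(p, 3);          // pointer: random access
    assert(std::strcmp(kind, "random_access") == 0 && *p == 3);

    std::forward_list<int> fl(5, 1);
    std::forward_list<int>::iterator fi = fl.begin();
    // forward_iterator_tag derives from input_iterator_tag, so with no
    // forward-specific overload the input version is selected.
    kind = my_advance(fi, 3);
    assert(std::strcmp(kind, "input") == 0);

    std::list<int> l(5, 1);
    std::list<int>::iterator li = l.end();
    kind = my_advance(li, -2);                    // exact match on the tag
    assert(std::strcmp(kind, "bidirectional") == 0);
}
```

Because the tag types form an inheritance chain, an iterator without a dedicated overload falls back to the nearest base tag through an ordinary derived-to-base conversion.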
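   Back to the original question. The purpose of iterator_traits is to give pointers the five iterator typedefs, to re-declare the same five typedefs for iterators, and to define nothing at all for types that are neither iterators nor pointers. That leads to three outcomes:
   1. Once pointers have the iterator typedefs, generic algorithms written for iterators can work with them;
   2. Iterators behave as they always did; the generic algorithms were written for them in the first place;
   3. Any other type fails to compile when the template is instantiated, for example at std::__iterator_category(__i).
   Let's test it: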
    int iArr[10] = {0},*pi = iArr;
   std::advance(pi,3);
   I could not find a g++ option that dumps the code generated by template instantiation, so let's look with gdb first:
   std::__advance<int*, int> (__i=@0xbffff0d4: 0xbffff0d8, __n=3) at /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:156
   156          __i += __n;
   The call resolved to the random-access version.
    std::forward_list<int> fli(10,1);
   auto iter = fli.begin();
   std::advance(iter,3);
   In gdb:
   std::__advance<std::_Fwd_list_iterator<int>, int> (__i=..., __n=3) at /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:128
   128          while (__n--)
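   The call resolved to the input-iterator version: forward_list iterators carry forward_iterator_tag, which derives from input_iterator_tag, and there is no overload specific to forward iterators.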
   std::string haha = "haha";
   std::advance(haha,3);
   This fails to compile:
   /usr/include/c++/4.8/bits/stl_iterator_base_funcs.h:176:65: error: no type named ‘difference_type’ in ‘struct std::iterator_traits<std::basic_string<char> >’
   typename iterator_traits<_InputIterator>::difference_type __d = __n;
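   In short, __iterator_traits unifies the typedef "interface" that pointers and iterators present to generic algorithms, and it produces a compile-time error for types that are neither, even if the message is not very friendly.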

0 0