Search code examples
c++templatesgenericshistogramconditional-compilation

Generic STL-Compatible Histogram in C++11


Here is my first try at a generic histogram template function in C++ tested with GCC 4.6. However, I would like to merge dense_histogram() and sparse_histogram() into one common generic function template. The problem is that the dense-specific constructor H h(n, 0) is neither defined nor relevant in the sparse version H h. Is there a way to solve this in some clever C++ regular way or statically typically using conditional compilation through Boost Type.Traits (#include <boost/type_traits.hpp>)?

#include <algorithm>
#include <limits>
#include <algorithm>
#include <vector>
#include <unordered_map>

namespace std
{

/*!
 * \em Dense Histogram of \p a.
 *
 * \tparam V is Value Type.
 * \tparam C is Count (Bin) Type.
 * \tparam H is Histogram Storage Type, typically a vector.
 *
 * \param[in] x is a set of the input data set
 */
template <class V, class C = size_t, class H = vector<C> >
inline
H dense_histogram(const V & x)
{
    typedef typename V::value_type E; // element type
    size_t n = (static_cast<C>(1)) << (8*sizeof(E)); // maximum number of possible elements for dense variant
    H h(n, 0);                       // histogram
    C bmax = 0;                      // bin max
    for_each(begin(x), end(x),  // C++11
             [&h, &bmax] (const E & e) { // value element
                 h[e]++;
                 bmax = std::max(bmax, h[e]);
             });
    return h;
}
template <class V, class H = vector<size_t> > H make_dense_histogram(const V & x) { return dense_histogram<V, size_t, H>(x); }

/*!
 * \em Sparse Histogram of \p a.
 *
 * \tparam V is Value Type.
 * \tparam C is Count (Bin) Type.
 * \tparam H is Histogram Structure Type, typically a unordered_map.
 *
 * \param[in] x is a set of the input data set
 */
template <class V, class C = size_t, class H = unordered_map<typename V::value_type, C> >
inline
H sparse_histogram(const V & x)
{
    typedef typename V::value_type E; // element type
    H h;                        // histogram
    C bmax = 0;                 // bin max
    for_each(begin(x), end(x), // C++11
             [&h,&bmax] (const E & e) { // value element
                 h[e]++;
                 bmax = std::max(bmax, h[e]);
             });
    return h;
}
template <class V, class H = unordered_map<typename V::value_type, size_t> > H make_sparse_histogram(const V & x) { return sparse_histogram<V, size_t, H>(x); }

}

run using


Solution

  • I think you should simply put only the common parts in a third function leaving dense_histogram and sparse_histogram to create h and call that implementation function:

    template <class V, class C = size_t, class H>
    inline void histogram_impl(const V & x, H& h) {
        typedef typename V::value_type E; // element type
        C bmax = 0;                      // bin max
        for_each(begin(x), end(x),  // C++11
                 [&h, &bmax] (const E & e) { // value element
                     h[e]++;
                     bmax = std::max(bmax, h[e]);
                 });
        return h;
    }
    template <class V, class C = size_t, class H = vector<C> >
    inline H dense_histogram(const V & x) {
        typedef typename V::value_type E; // element type
        size_t n = (static_cast<C>(1)) << (8*sizeof(E)); // maximum number of possible elements for dense variant
        H h(n, 0);                       // histogram
        histogram_impl(x, h);
        return h;
    }
    template <class V, class C = size_t, class H = unordered_map<typename V::value_type, C> >
    inline H sparse_histogram(const V & x) {
        H h;                        // histogram
        histogram_impl(x, h);
        return h;
    }
    

    However since you asked for it: As you are working on containers I would assume they have a cheap move, so you could define a creation trait to generate your container and move that into your local variable. Then you can write your own detection of an appropriate constructor like this:

    template<typename T> struct has_explicit_length_constructor{
    private:
       template<typename U>
       decltype(U(0, 0), void(), std::true_type()) test(int x);
       template<typename>
       std::false_type test(...);
      typedef decltype(test<T>(0)) constant_type;
    public:
       constexpr bool value = constant_type::value;
    };
    
    template<class H, bool B = has_explicit_length_constructor<H>::value> struct histogram_creation_trait;
    template<class H> struct histogram_creation_trait<H, true> {
      static H create()  {
        size_t n = (static_cast<C>(1)) << (8*sizeof(typename V::value_type));
        return H(n, 0);  
      }
    };
    template<class H> struct histogram_creation_trait<H, false>
    { static H create()  { return H(); } };
    
    template <class V, class C = size_t, class Ht>
    inline void histogram_impl(const V & x, H& h, Trait) {
        typedef typename V::value_type E; // element type
        C bmax = 0;                      // bin max
        H h = histogram_creation_trait<H>::create();
        for_each(begin(x), end(x),  // C++11
                 [&h, &bmax] (const E & e) { // value element
                     h[e]++;
                     bmax = std::max(bmax, h[e]);
                 });
        return h;
    }
    template <class V, class H = vector<size_t> > H make_dense_histogram(const V & x) { return histogram_impl<V, size_t, H>(x); }
    template <class V, class H = unordered_map<typename V::value_type, size_t> > H make_sparse_histogram(const V & x) { return histogram_impl<V, size_t, H>(x); }
    

    As a side not: Adding your own methods to std is UB by the standard ([namespace.std] $17.6.4.2.1 p1):

    The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within namespace std unless otherwise specified. A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.