Search code examples
c++c++11stlstl-algorithm

How can I use std::set_intersection for 2 distinct but related types and output into another type


I need to do something a little strange with std::set_intersection and I cannot quite figure it out. I asked a similar question about a month ago, and thanks to the excellent responses to the question, I solved the issue of making the std::set_intersection work using a common link field between 2 vectors, each containing a different type of object.

The problem that I am now facing is that I am trying to get the code below to work, I basically need to write std::set_intersection's output to a new type which is effectively a union between some fields from StructA and other fields from StructB. I used a slightly modified sample written by user tclamb but it doesn't compile and I am a bit lost in the compiler errors. I am pretty sure that some of the problems I am facing are to do with the restriction that

According to the section Requirements on types in std::set_intersection InputIterator1 and InputIterator2 have the same value type. In my case this is not true, also in the case of the solution by tclamb it was also not the case, however it seemed to work.

I just edited the code below and incorporated @ivar's suggestions for some redundant code - this makes the problem easier to read - it is now compiling and running - but still producing results that are not quite what I want. The live code is also posted at coliru

#include<vector>
#include<algorithm>
#include<string>
#include <iostream>
#include <iterator>

// I wish to return a vector of these as the result
struct StructC {
    std::string mCommonField;
    std::string mNameFromA; // cherry picked from StructA
    std::string mNameFromB; // cherry picked from StructB
    float mFloatFromA;      // cherry picked from StructA
    int mIntFromB;          // cherry picked from StructB
};

struct StructA {
    // conversion operator from StructA to StructC
    operator StructC() { return { mCommonField, mNameAString, "[]", mFloatValueA, 0 }; }
    std::string mCommonField;
    std::string mNameAString;
    float mFloatValueA;
};

struct StructB {
    // conversion operator from StructB to StructC
    operator StructC() { return { mCommonField, "[]", mNameBString, 0.0f, mIntValueB }; }
    std::string mCommonField;
    std::string mNameBString;
    int mIntValueB;
};

// Comparator thanks to @ivar
struct Comparator {
    template<typename A, typename B>
    bool operator()(const A& a, const B& b) const {
        return a.mCommonField < b.mCommonField;
    }
};

template<typename CharT, typename Traits>
std::basic_ostream<CharT, Traits>& operator<<(std::basic_ostream<CharT, Traits>& os, StructC const& sc) {
    return os << sc.mCommonField << " - " << sc.mNameFromA << " - " 
       << sc.mNameFromB << " - " << std::fixed << sc.mFloatFromA << " - " << sc.mIntFromB << std::endl;
}

int main() {
    Comparator comparator;

    // initially unsorted list of StructA
    std::vector<StructA> aStructs = {
        {"hello", "nameA1", 1.0f}, 
        {"goodbye", "nameA2", 2.0f}, 
        {"foo", "nameA3", 3.0f}
    };

    // initially unsorted list of StructB
    std::vector<StructB> bStructs = {
        {"hello", "nameB1", 10},     // <-- intersection as mCommonField("hello") also in aStructs
        {"goodbye", "nameB2", 20},   // <-- intersection as mCommonField("goodbye") also in aStructs
        {"bar", "nameB3", 30}
    };

    // in the above list, "hello" & "goodbye" are the common in both aStructs & bStructs

    // pre-sort both sets before calling std::intersection
    std::sort(aStructs.begin(), aStructs.end(), comparator);
    std::sort(bStructs.begin(), bStructs.end(), comparator);

    std::vector<StructC> intersection;
    std::set_intersection(aStructs.begin(), aStructs.end(),
                          bStructs.begin(), bStructs.end(),
                          std::back_inserter(intersection),
                          comparator);

    std::copy(intersection.begin(), intersection.end(),
              std::ostream_iterator<StructC>(std::cout, ""));
    return 0;
}

Solution

  • I think I now have a better idea what you were originally looking for:

    Given two input ranges A, B, for every two instances a in A, b in B found to be equivalent, you want to merge a, b into a new object c and copy that to the output range C.

    If I got this right now, I am afraid I can think of no combination of standard algorithms that can accomplish that. std::set_intersection is simply not that generic.

    So my second attempt is to generalize std::set_intersection itself by adding a Merger function object:

    template<
        class InputIt1, class InputIt2,
        class OutputIt, class Compare, class Merge
    >
    OutputIt set_intersection
    (
        InputIt1 first1, InputIt1 last1,
        InputIt2 first2, InputIt2 last2,
        OutputIt d_first, Compare comp, Merge merge
    )
    {
        while (first1 != last1 && first2 != last2)
        {
            if (comp(*first1, *first2))
                ++first1;
            else
            {
                if (!comp(*first2, *first1))
                    *d_first++ = merge(*first1++, *first2);
                ++first2;
            }
        }
        return d_first;
    }
    

    Here's what your Merger class can look like:

    struct Merger
    {
        template<typename A, typename B>
        StructC operator()(const A& a, const B& b) const { return {a, b}; }
    };
    

    and here's how StructC actually handles merging:

    struct StructC {
        std::string mCommonField;
        std::string mNameFromA;
        std::string mNameFromB;
        float mFloatFromA;
        int mIntFromB;
    
        StructC(const StructA& a, const StructB& b) : mCommonField{a.mCommonField},
            mNameFromA{a.mNameAString}, mNameFromB{b.mNameBString},
            mFloatFromA{a.mFloatValueA}, mIntFromB{b.mIntValueB} {}
    };
    

    As you require, the live example now gives

    goodbye - nameA2 - nameB2 - 2.000000 - 20
    hello - nameA1 - nameB1 - 1.000000 - 10
    

    Watch out: the code of std::set_intesection is just what I copied from cppreference.com, which is a simplified functional equivalent. In production code, I would check actual implementations for issues related to forwarding, exceptions, special cases, optimizations and anything I haven't thought of.