Search code examples
c++boostboost-graphrdkitboost-property-map

How to create a boost property map between rdkit molecular bond graph edges and BGL biconnected component identifiers?


The rdkit library provides a molecule class ROMol that provides a member function getTopology that returns a BGL graph of type adjacency_list<vecS, vecS, undirectedS, Atom *, Bond *>. I understand that the rdkit type Bond defines the edge properties of the graph. I know that Bond provides a member function getIdx that returns a unique integer identifying the bond, so the graph must have a notion of edge numbering.

To make use of the BGL algorithm for biconnected_components one requires a component property map that maps objects of the edge descriptor type of the graph (I understand this type is Bond?) to component identifying integers. As rdkit does not deal in biconnected components, I conclude this property map must be exterior to the graph. In this case, the BGL documentation suggests to use the iterator_property_map adaptor class (see section "Exterior Properties"). I struggle to determine the correct template arguments for iterator_property_map to get the required property map. If this is the correct approach what are the missing template arguments?

Unfortunately I did not get very far with my code before getting lost in the BGL documentation:

void my_function(const RDKit::ROMol& molecule) {
 const RDKit::MolGraph& molecule_graph{molecule.getTopology()};

  using EdgesSize =
      boost::graph_traits<RDKit::MolGraph>::edges_size_type; // map value type
  using Edge =
      boost::graph_traits<RDKit::MolGraph>::edge_descriptor; // map key type

  std::vector<EdgesSize> component_map(boost::num_edges(molecule_graph)
  ); // range, begin() yields random access iterator

  // boost::iterator_property_map<?>;
  // boost::biconnected_components(molecule_graph, ?);
}

Solution

  • one requires a component property map that maps objects of the edge descriptor type of the graph (I understand this type is Bond?)

    Edge descriptors are internal descriptors, sort of like stable iterators. E.g.

     edge_descriptor e = *edges(g).begin();
    

    Edge properties are entirely separate notions. You get the edge properties from an edge descriptor like so:

     Bond* e_props = g[e]; // equivalent to:
     e_props = get(boost::edge_bundle, g, e);
    

    Unsurprisingly the same goes for vertex descriptors vs. properties:

      Atom* first_a_prop = g[vertex(0, g)];
    

    The Complication

    A notable difference between the two descriptor types is that - only because the graph type is using vecS as the vertex container selector - the vertex descriptor is guaranteed to be integral, where edge descriptor is opaque (similar to a void*).

    Hence, the edge-decriptor cannot be the key type to a vector-based property map (because it would require an integral indexer).

    Instead make an associative property-map:

    // OUT: ComponentMap c
    // must be a model of Writable Property Map. The value type should be
    // an integer type, preferably the same as the edges_size_type of the
    // graph. The key type must be the graph's edge descriptor type.
    std::map<edge_descriptor, int> component_map;
    auto c = boost::make_assoc_property_map(component_map);
    

    Sample

    Live On Coliru

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/biconnected_components.hpp>
    #include <iostream>
    
    struct Atom {
    };
    
    struct Bond {
        int idx_ = 42;
        int getIdx() const { return idx_; }
    };
    
    using G = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, Atom*, Bond*>;
    using vertex_descriptor = G::vertex_descriptor;
    using edge_descriptor   = G::edge_descriptor;
    
    int main() {
        std::array<Atom, 4> atoms;
        std::array<Bond, 2> bonds{Bond{111}, Bond{222}};
    
        G g;
        add_vertex(atoms.data()+0, g);
        add_vertex(atoms.data()+1, g);
        add_vertex(atoms.data()+2, g);
        add_vertex(atoms.data()+3, g);
    
        // using the fact that vertex_descriptor is 0..3 here:
        add_edge(0, 2, bonds.data()+0, g);
        add_edge(2, 3, bonds.data()+1, g);
    
        {
            // OUT: ComponentMap c
            // must be a model of Writable Property Map. The value type should be
            // an integer type, preferably the same as the edges_size_type of the
            // graph. The key type must be the graph's edge descriptor type.
            std::map<edge_descriptor, int> component_map;
            auto c = boost::make_assoc_property_map(component_map);
    
            // Now use it:
            size_t n = boost::biconnected_components(g, c);
            for (auto [e, component_id] : component_map) {
                std::cout << "Edge " << e << " (idx:" << g[e]->getIdx()
                          << ") mapped to component " << component_id << " out of "
                          << n << "\n";
            }
        }
    
        {
            std::map<edge_descriptor, int> component_map;
            auto c = boost::make_assoc_property_map(component_map);
    
            // also writing articulation points:
            [[maybe_unused]] auto [n, out] = boost::biconnected_components(
                g, c,
                std::ostream_iterator<vertex_descriptor>(
                    std::cout << "articulation points: ", " "));
    
            std::cout << "\n";
        }
    
    }
    

    Prints

    Edge (0,2) (idx:111) mapped to component 1 out of 2
    Edge (2,3) (idx:222) mapped to component 0 out of 2
    articulation points: 2 
    

    Advanced Example

    You can force a vector as mapping storage, but that requires you to map the edges (bonds) to a contiguous integral range [0..num_edges(g)).

    I couldn't assume that getIdx() satisfies the criterion, but if it did:

    // first map edges to 0..num_edges using getIdx
    auto edge_index = boost::make_function_property_map<edge_descriptor>(
        [&g](edge_descriptor e) { return g[e]->getIdx(); });
    
    // provide num_edges storage for component-ids:
    std::vector<int> component_ids(num_edges(g));
    
    // project the vector through edge_index to make a Writable Property
    // Map indexed by edge_descriptor;
    auto c = boost::make_safe_iterator_property_map(component_ids.begin(),
            component_ids.size(), edge_index);
    

    Let's apply it to the graph from the documentation:

    enter image description here

    Live On Coliru

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/biconnected_components.hpp>
    #include <boost/property_map/function_property_map.hpp>
    #include <iostream>
    
    enum letter : char { A, B, C, D, E, F, G, H, I };
    struct Atom {
        Atom(letter) {}
    };
    
    struct Bond {
        int idx_;
        int getIdx() const { return idx_; }
    };
    
    using Graph = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, Atom*, Bond*>;
    using vertex_descriptor = Graph::vertex_descriptor;
    using edge_descriptor   = Graph::edge_descriptor;
    
    int main() {
        std::array<Atom, 9>  atoms{A, B, C, D, E, F, G, H, I};
        std::array<Bond, 11> bonds{
            {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}}};
    
        Graph g;
        for (auto& atom : atoms)
            add_vertex(&atom, g);
    
        // using the fact that vertex_descriptor is vertex index:
        add_edge(A, B, &bonds.at(0), g);
        add_edge(A, F, &bonds.at(1), g);
        add_edge(A, G, &bonds.at(2), g);
        add_edge(B, C, &bonds.at(3), g);
        add_edge(B, D, &bonds.at(4), g);
        add_edge(B, E, &bonds.at(5), g);
        add_edge(C, D, &bonds.at(6), g);
        add_edge(E, F, &bonds.at(7), g);
        add_edge(G, H, &bonds.at(8), g);
        add_edge(G, I, &bonds.at(9), g);
        add_edge(H, I, &bonds.at(10), g);
    
        // OUT: ComponentMap c
        // must be a model of Writable Property Map. The value type should be
        // an integer type, preferably the same as the edges_size_type of the
        // graph. The key type must be the graph's edge descriptor type.
    
        // first map edges to 0..10 using getIdx
        auto edge_index = boost::make_function_property_map<edge_descriptor>(
                [&g](edge_descriptor e) { return g[e]->getIdx(); });
    
        // provide num_edges storage for component-ids:
        std::vector<int> component_ids(num_edges(g));
    
        // project the vector through edge_index to make a Writable Property
        // Map indexed by edge_descriptor;
        auto c = boost::make_safe_iterator_property_map(component_ids.begin(),
                component_ids.size(), edge_index);
    
        // Now use it:
        size_t n = boost::biconnected_components(g, c);
    
        for (auto e : boost::make_iterator_range(edges(g))) {
            // edge_index or getIdx, equivalent here:
            assert(edge_index[e] == g[e]->getIdx());
    
            auto idx =edge_index[e];
            auto cid = component_ids.at(idx);
    
            std::cout << "Edge " << e << " (idx:" << idx << ") mapped to component "
                      << cid << " out of " << n << "\n";
        }
    }
    

    Which prints prints the expected mapping

    Edge (0,1) (idx:0) mapped to component 1 out of 4
    Edge (0,5) (idx:1) mapped to component 1 out of 4
    Edge (0,6) (idx:2) mapped to component 3 out of 4
    Edge (1,2) (idx:3) mapped to component 0 out of 4
    Edge (1,3) (idx:4) mapped to component 0 out of 4
    Edge (1,4) (idx:5) mapped to component 1 out of 4
    Edge (2,3) (idx:6) mapped to component 0 out of 4
    Edge (4,5) (idx:7) mapped to component 1 out of 4
    Edge (6,7) (idx:8) mapped to component 2 out of 4
    Edge (6,8) (idx:9) mapped to component 2 out of 4
    Edge (7,8) (idx:10) mapped to component 2 out of 4
    

    In fact, if we add a little bit of bonus wizardry, we can render that:

    enter image description here