Search code examples
c++graphboostboost-graph

How to set custom vertex_descriptor to own value using add_vertex() - Boost Graph Library


I am in the process of learning how the Boost Graph Library (BGL) works and spun up the following example to try create a graph from some data in a text file. The data is read in from the text file in which the first line defines the number of nodes, the second line the number of edges and a succession of lines describing which vertex connects to which and the cost/weight of that edge:

8
16
4 5 0.35
4 7 0.37
5 7 0.28
0 7 0.16
1 5 0.32
0 4 0.38
2 3 0.17
1 7 0.19
0 2 0.26
1 2 0.36
1 3 0.29
2 7 0.34
6 2 0.40
3 6 0.52
6 0 0.58
6 4 0.93

So far I have come up with this code to read in the filedata and create the graph from it:

#include <unordered_map>
#include <fstream>
#include <boost/graph/adjacency_list.hpp>
#include <boost/config.hpp>
#include <boost/algorithm/string/split.hpp>

int main() {
  // read in the graph from 'tiny-ewg.txt'
  std::ifstream datafile("tiny-ewg.txt");

  if (!datafile) {
    std::cerr << "tiny-ewg.txt was not found" << std::endl;
    return EXIT_FAILURE;
  };

  typedef boost::adjacency_list
  <boost::vecS,
   boost::vecS, 
   boost::undirectedS,
   boost::property<boost::vertex_index_t, size_t>,
   boost::property<boost::edge_weight_t, double>
  > Graph;

  typedef std::pair<int, int> Edge;

  // read in number of vertices
  std::string line;
  std::getline(datafile, line);
  const int num_vertices = std::stoi(line);

  // read in number of edges
  std::getline(datafile, line);
  const int num_edges = std::stoi(line);

  Graph g(num_vertices);

  // unordered map tiny_ewg_vertex to vertex_descriptor
  typedef boost::graph_traits<Graph>::vertex_descriptor Vertex;
  typedef std::unordered_map<int, Vertex> VertexMap;
  VertexMap vertex_map;

  // property map for the edge weight
  boost::property_map<Graph, boost::edge_weight_t>::type weight_map = 
  boost::get(boost::edge_weight, g);

  for (std::string line; std::getline(datafile, line);) {
    std::cout << line << std::endl;
    typedef std::vector<std::string> Tokens;
    Tokens tokens;
    boost::split(tokens, line, boost::is_any_of(" "));
  
    auto tok_it = tokens.begin();
    bool inserted;
    Vertex u, v;
    VertexMap::iterator pos;
    boost::tie(pos, inserted) = vertex_map.insert(std::make_pair(stoi(*tok_it), Vertex()));
    if (inserted) {
      u = boost::add_vertex(g);
      pos->second = u;
    } else {
      u = pos->second;
    }

    tok_it++;
 
    boost::tie(pos, inserted) = vertex_map.insert(std::make_pair(stoi(*tok_it), Vertex()));
    if (inserted) {
      v = boost::add_vertex(g);
      pos->second = v;
    } else {
      v = pos->second;
    }

    // add edge between u and v
    boost::graph_traits<Graph>::edge_descriptor e;
    boost::tie(e, inserted) = boost::add_edge(u, v, g);

    // add the weight to the edge using a weight property map
    if (inserted) {
      tok_it++;
      weight_map[e] = stod(*tok_it);
    }
  }
}

I defined a small function to print out the edges as well:

template<typename Graph, typename WeightMap>
void printEdges(const Graph& g, WeightMap w) {
  typename typedef boost::graph_traits<Graph>::edge_iterator EdgeIterator;
  EdgeIterator it, end;
  for (boost::tie(it, end) = boost::edges(g); it != end; ++it) {
    std::cout << boost::source(*it, g) << "  " << " --( " << w[*it] << " )--> " << "  " << 
    boost::target(*it, g) << std::endl; 
  }
}

When passing the newly created graph to the function above I would expect to see as console output a formatted version of the data in the text file, for instance 4 --(0.35)--> 5 instead of 4 5 0.35 however I end up getting the following:

8   --( 0.35 )-->   9
8   --( 0.37 )-->   10
9   --( 0.28 )-->   10
11   --( 0.16 )-->   10
12   --( 0.32 )-->   9
11   --( 0.38 )-->   8
13   --( 0.17 )-->   14
12   --( 0.19 )-->   10
11   --( 0.26 )-->   13
12   --( 0.36 )-->   13
12   --( 0.29 )-->   14
13   --( 0.34 )-->   10
15   --( 0.4 )-->   13
14   --( 0.52 )-->   15
15   --( 0.58 )-->   11
15   --( 0.93 )-->   8

If I print out the unordered_map vertex_map I get:

4 : 8
5 : 9
7 : 10
0 : 11
1 : 12
2 : 13
3 : 14
6 : 15

With this I confirm that the key to my unordered_map is being added correctly, however the vertex_descriptors u,v returned by boost::add_vertex(g); are not the same as these indexes. I know that it is possible to set properties on the vertex with an additional named parameter in the call to add_vertex but I find that the documentation for this is quite poor on the BGL website.

  • Is it possible to set a custom vertex_descriptor and how?
  • How does BGL come up with what to set the vertex descriptor. For instance why is the vertex_descriptor of the first element added 8 and not 0 in the code above?
  • Should one even try to work with the vertex_descriptor, or is it better to work with the vertex_index?

Solution

  • The main problem is likely that you used add_vertex after already pre-sizing the graph to num_vertices².

    Is it possible to set a custom vertex_descriptor and how?

    No. The vertex descriptor is an implementation detail that is indirectly governed by your model's container selector template arguments.

    Should one even try to work with the vertex_descriptor, or is it better to work with the vertex_index?

    They serve different purposes. Many algorithms require a vertex index for efficiency, but it doesn't need to be built into the graph model. In practice, some graph models have a "natural" vertex index when using vecS. This doesn't mean the indices are meaningful at the application level, though you might be able to use them as is in some cases.

    But since you're here to learn BGL, let's dive a little deeper!

    Simplify

    Let's simplify the entire code³:

    Live On Coliru

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/graph_utility.hpp>
    #include <fstream>
    #include <iostream>
    
    struct NL_t {
        friend std::istream& operator>>(std::istream& is, NL_t) { return is.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); }
    } inline constexpr NL;
    
    using Graph  = boost::adjacency_list<              //
        boost::vecS, boost::vecS, boost::undirectedS, //
        boost::no_property,                           //
        boost::property<boost::edge_weight_t, double>>;
    using Vertex = boost::graph_traits<Graph>::vertex_descriptor;
    
    Graph read_ewg(std::string fname) {
        std::ifstream datafile;
        datafile.exceptions(std::ios::failbit);
        datafile.open(fname);
    
        size_t nv = 0, ne = 0;
        datafile >> nv >> NL >> ne >> NL;
    
        Graph g(nv);
        Vertex u, v;
        double w;
        while (ne-- && datafile >> u >> v >> w >> NL)
            /*auto [e, inserted] =*/add_edge(u, v, w, g);
    
        assert(nv == num_vertices(g));
        return g;
    }
    
    int main() {
        auto g = read_ewg("tiny-ewg.txt");
        // auto weight_map = get(boost::edge_weight, g);
    
        print_graph(g);
    }
    

    Printing

    0 <--> 7 4 2 6 
    1 <--> 5 7 2 3 
    2 <--> 3 0 1 7 6 
    3 <--> 2 1 6 
    4 <--> 5 7 0 6 
    5 <--> 4 7 1 
    6 <--> 2 3 0 4 
    7 <--> 4 5 0 1 2 
    

    BONUS

    Let's also simplify printEdges (slightly changing output to no longer imply directed edges):

    template <typename Graph, typename WeightMap> //
    void printEdges(Graph const& g, WeightMap w) {
        for (auto e : make_iterator_range(edges(g)))
            std::cout << e << " w:" << w[e] << "\n";
    }
    

    Prints Live On Coliru

    (4,5) w:0.35
    (4,7) w:0.37
    (5,7) w:0.28
    (0,7) w:0.16
    (1,5) w:0.32
    (0,4) w:0.38
    (2,3) w:0.17
    (1,7) w:0.19
    (0,2) w:0.26
    (1,2) w:0.36
    (1,3) w:0.29
    (2,7) w:0.34
    (6,2) w:0.4
    (3,6) w:0.52
    (6,0) w:0.58
    (6,4) w:0.93
    

    Note that using vecS vertex container selector implies a built-in vertex index which is equal to the vertex descriptor. Therefore:

    1. We removed vertex_index_t property which can only confuse with the built-in index
    2. Since add_edge will automatically grow the vertex "space" we donot have add_vertex at all,
    3. Instead we just add edges as we want them, immediately setting the weight property as well

    Now the result is exactly what you expect.

    What If Everything's Not One Happy Song?

    The only reason this is the case is that the vertices are already mapped to the range [0..num_vertices(g)) in the source. If you renamed 4 to 14 the first edge to read 14 15 0.35 you'd see¹:

    0 <--> 7 14 2 6
    1 <--> 5 7 2 3
    2 <--> 3 0 1 7 6
    3 <--> 2 1 6
    4 <-->
    5 <--> 14 7 1
    6 <--> 2 3 0 14
    7 <--> 14 5 0 1 2
    8 <-->
    9 <-->
    10 <-->
    11 <-->
    12 <-->
    13 <-->
    14 <--> 5 7 0 6
    

    Not really an issue if you only focus on the edges, like printEdges:

    (14,5) w:0.35
    (14,7) w:0.37
    (5,7) w:0.28
    (0,7) w:0.16
    (1,5) w:0.32
    (0,14) w:0.38
    (2,3) w:0.17
    (1,7) w:0.19
    (0,2) w:0.26
    (1,2) w:0.36
    (1,3) w:0.29
    (2,7) w:0.34
    (6,2) w:0.4
    (3,6) w:0.52
    (6,0) w:0.58
    (6,14) w:0.93
    

    However, you might see a problem when the source graph looks like

    2
    1
    1236748 845723489 5.05
    

    The (re)allocation of the vertex container causes OOM on my machine. That's after 30s of high CPU. (So, be careful before trying that on your machine!)

    Even more general, what if the input is

    8
    16
    Barcelona Athens 0.35
    Amsterdam Barcelona 0.37
    London Barcelona 0.28
    Paris Barcelona 0.16
    Berlin London 0.32
    Paris Amsterdam 0.38
    Rome Madrid 0.17
    Berlin Barcelona 0.19
    Paris Rome 0.26
    Berlin Rome 0.36
    Berlin Madrid 0.29
    Rome Barcelona 0.34
    Athens Rome 0.40
    Madrid Athens 0.52
    Athens Paris 0.58
    Athens Amsterdam 0.93
    

    Names

    Well, glad you asked. I'd use bundled properties to express the logical ids (which will never become vertex descriptors let alone vertex index for BGL):

    struct VertexProps {
        std::string id;
    };
    
    struct EdgeProps {
        double weight;
    };
    
    using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, VertexProps, EdgeProps>;
    using Vertex = boost::graph_traits<Graph>::vertex_descriptor;
    

    Now the naive edit will almost do what we need:

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/graph_utility.hpp>
    #include <fstream>
    #include <iostream>
    
    template <typename Graph, typename WeightMap> //
    void printEdges(Graph const& g, WeightMap w) {
        for (auto e : make_iterator_range(edges(g)))
            std::cout << e << " w:" << w[e] << "\n";
    }
    
    struct NL_t {
        friend std::istream& operator>>(std::istream& is, NL_t) { return is.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); }
    } inline constexpr NL;
    
    using Id = std::string;
    struct VertexProps {
        Id id;
    };
    
    struct EdgeProps {
        double weight;
    };
    
    using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, VertexProps, EdgeProps>;
    using Vertex = boost::graph_traits<Graph>::vertex_descriptor;
    
    Graph read_ewg(std::string fname) {
        std::ifstream datafile;
        datafile.exceptions(std::ios::failbit);
        datafile.open(fname);
    
        size_t nv = 0, ne = 0;
        datafile >> nv >> NL >> ne >> NL;
    
        Graph g;
        Id u, v;
        double w;
        while (ne-- && datafile >> u >> v >> w >> NL)
            add_edge(add_vertex({u}, g), add_vertex({v}, g), {w}, g);
    
        assert(nv == num_vertices(g));
        return g;
    }
    
    int main() {
        auto g = read_ewg("tiny-ewg.txt");
        auto weight_map = get(&EdgeProps::weight, g);
        print_graph(g, get(&VertexProps::id, g));
        printEdges(g, weight_map);
    }
    

    Printing

    Athens <--> Barcelona 
    Barcelona <--> Athens 
    Barcelona <--> Amsterdam 
    Amsterdam <--> Barcelona 
    Barcelona <--> London 
    London <--> Barcelona 
    Barcelona <--> Paris 
    Paris <--> Barcelona 
    London <--> Berlin 
    Berlin <--> London 
    Amsterdam <--> Paris 
    Paris <--> Amsterdam 
    Madrid <--> Rome 
    Rome <--> Madrid 
    Barcelona <--> Berlin 
    Berlin <--> Barcelona 
    Rome <--> Paris 
    Paris <--> Rome 
    Rome <--> Berlin 
    Berlin <--> Rome 
    Madrid <--> Berlin 
    Berlin <--> Madrid 
    Barcelona <--> Rome 
    Rome <--> Barcelona 
    Rome <--> Athens 
    Athens <--> Rome 
    Athens <--> Madrid 
    Madrid <--> Athens 
    Paris <--> Athens 
    Athens <--> Paris 
    Amsterdam <--> Athens 
    Athens <--> Amsterdam 
    

    And

    (1,0) w:0.35
    (3,2) w:0.37
    (5,4) w:0.28
    (7,6) w:0.16
    (9,8) w:0.32
    (11,10) w:0.38
    (13,12) w:0.17
    (15,14) w:0.19
    (17,16) w:0.26
    (19,18) w:0.36
    (21,20) w:0.29
    (23,22) w:0.34
    (25,24) w:0.4
    (27,26) w:0.52
    (29,28) w:0.58
    (31,30) w:0.93
    

    Deduplicating

    Obviously the problem is the duplication of vertices with the same id.

    You can go about that just like you did with a map:

    std::map<Id, Vertex> vmap;
    
    auto ensure_vertex = [&](Id const& id) {
        auto it = vmap.find(id);
        if (it == vmap.end())
            it = vmap.emplace(id, add_vertex(VertexProps{id}, g)).first;
        return it->second;
    };
    
    while (ne-- && datafile >> u >> v >> w >> NL)
        add_edge(ensure_vertex(u), ensure_vertex(v), {w}, g);
    

    Printing Live On Coliru

    Athens <--> Barcelona Rome Madrid Paris Amsterdam 
    Barcelona <--> Athens Amsterdam London Paris Berlin Rome 
    Amsterdam <--> Barcelona Paris Athens 
    London <--> Barcelona Berlin 
    Paris <--> Barcelona Amsterdam Rome Athens 
    Berlin <--> London Barcelona Rome Madrid 
    Madrid <--> Rome Berlin Athens 
    Rome <--> Madrid Paris Berlin Barcelona Athens 
    

    And the original printEdges like

    (1,0) w:0.35
    (2,1) w:0.37
    (3,1) w:0.28
    ...
    (7,1) w:0.34
    ...
    (0,2) w:0.93
    

    Internal Name Properties!

    But I've come to use the named vertices as the more user-friendly way of adding a "name" to vertices:

    template <> struct boost::graph::internal_vertex_name<VertexProps> {
        struct type {
            using result_type = Id;
            Id const& operator()(VertexProps const& vp) const { return vp.id; }
        };
    };
    

    This tells BGL that your vertex propery bundle contains a "name" and how to get it. If you also tell BGL how to construct a new bundle from an Id:

    template <> struct boost::graph::internal_vertex_constructor<VertexProps> {
        struct type {
            VertexProps operator()(Id id) const { return {std::move(id)}; }
        };
    };
    

    Now you can write the original "naive" code again:

    Id u, v;
    double w;
    while (ne-- && datafile >> u >> v >> w >> NL)
        add_edge(u, v, {w}, g);
    

    In effect BGL keeps the map internally and does exactly the same checks to see whether a vertex already exists or not:

    Live On Coliru

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/graph_utility.hpp>
    #include <fstream>
    #include <iostream>
    
    template <typename Graph, typename WeightMap> //
    void printEdges(Graph const& g, WeightMap w) {
        for (auto e : make_iterator_range(edges(g)))
            std::cout << e << " w:" << w[e] << "\n";
    }
    
    struct NL_t {
        friend std::istream& operator>>(std::istream& is, NL_t) { return is.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); }
    } inline constexpr NL;
    
    using Id = std::string;
    struct VertexProps {
        Id id;
        VertexProps(Id id = {}) : id(std::move(id)) {}
    };
    
    struct EdgeProps {
        double weight;
    };
    
    template <> struct boost::graph::internal_vertex_name<VertexProps> {
        struct type {
            using result_type = Id;
            Id const& operator()(VertexProps const& vp) const { return vp.id; }
        };
    };
    
    template <> struct boost::graph::internal_vertex_constructor<VertexProps> {
        struct type {
            VertexProps operator()(Id id) const { return {std::move(id)}; }
        };
    };
    
    using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, VertexProps, EdgeProps>;
    using Vertex = boost::graph_traits<Graph>::vertex_descriptor;
    
    Graph read_ewg(std::string fname) {
        std::ifstream datafile;
        datafile.exceptions(std::ios::failbit);
        datafile.open(fname);
    
        size_t nv = 0, ne = 0;
        datafile >> nv >> NL >> ne >> NL;
    
        Graph g;
    
        Id u, v;
        double w;
        while (ne-- && datafile >> u >> v >> w >> NL)
            add_edge(u, v, {w}, g);
    
        assert(nv == num_vertices(g));
        return g;
    }
    
    int main() {
        auto g = read_ewg("tiny-ewg.txt");
        auto weight_map = get(&EdgeProps::weight, g);
        print_graph(g, get(&VertexProps::id, g));
        printEdges(g, weight_map);
    }
    

    Printing the expected

    Athens <--> Barcelona Rome Madrid Paris Amsterdam 
    Barcelona <--> Athens Amsterdam London Paris Berlin Rome 
    Amsterdam <--> Barcelona Paris Athens 
    London <--> Barcelona Berlin 
    Paris <--> Barcelona Amsterdam Rome Athens 
    Berlin <--> London Barcelona Rome Madrid 
    Madrid <--> Rome Berlin Athens 
    Rome <--> Madrid Paris Berlin Barcelona Athens 
    (1,0) w:0.35
    (2,1) w:0.37
    (3,1) w:0.28
    (4,1) w:0.16
    (5,3) w:0.32
    (4,2) w:0.38
    (7,6) w:0.17
    (5,1) w:0.19
    (4,7) w:0.26
    (5,7) w:0.36
    (5,6) w:0.29
    (7,1) w:0.34
    (0,7) w:0.4
    (6,0) w:0.52
    (0,4) w:0.58
    (0,2) w:0.93
    

    ¹ of course quietly assuming you removed/silenced the debug assertion assert(nv == num_vertices(g));

    ² this immediately answers "why is the vertex_descriptor of the first element added 8 and not 0"

    ³ the only non-standard bit is my NL istream manipulator which is just a shorthand to discard the rest of the line. It's actually not required given the sample input because newlines are just whitespace. For fun: look at this slightly more complicated input https://coliru.stacked-crooked.com/a/d86be97eced3b2a1 (input from here)