How to set custom vertex_descriptor to own value using add_vertex() - Boost Graph Library

I am in the process of learning how the Boost Graph Library (BGL) works and spun up the following example to try create a graph from some data in a text file. The data is read in from the text file in which the first line defines the number of nodes, the second line the number of edges and a succession of lines describing which vertex connects to which and the cost/weight of that edge:

So far I have come up with this code to read in the filedata and create the graph from it:

#include <unordered_map>
#include <fstream>
#include <boost/graph/adjacency_list.hpp>
#include <boost/config.hpp>
#include <boost/algorithm/string/split.hpp>

int main() {
  // read in the graph from 'tiny-ewg.txt'
  std::ifstream datafile("tiny-ewg.txt");

  if (!datafile) {
    std::cerr << "tiny-ewg.txt was not found" << std::endl;
    return EXIT_FAILURE;
  };

  typedef boost::adjacency_list
  <boost::vecS,
   boost::vecS, 
   boost::undirectedS,
   boost::property<boost::vertex_index_t, size_t>,
   boost::property<boost::edge_weight_t, double>
  > Graph;

  typedef std::pair<int, int> Edge;

  // read in number of vertices
  std::string line;
  std::getline(datafile, line);
  const int num_vertices = std::stoi(line);

  // read in number of edges
  std::getline(datafile, line);
  const int num_edges = std::stoi(line);

  Graph g(num_vertices);

  // unordered map tiny_ewg_vertex to vertex_descriptor
  typedef boost::graph_traits<Graph>::vertex_descriptor Vertex;
  typedef std::unordered_map<int, Vertex> VertexMap;
  VertexMap vertex_map;

  // property map for the edge weight
  boost::property_map<Graph, boost::edge_weight_t>::type weight_map = 
  boost::get(boost::edge_weight, g);

  for (std::string line; std::getline(datafile, line);) {
    std::cout << line << std::endl;
    typedef std::vector<std::string> Tokens;
    Tokens tokens;
    boost::split(tokens, line, boost::is_any_of(" "));
  
    auto tok_it = tokens.begin();
    bool inserted;
    Vertex u, v;
    VertexMap::iterator pos;
    boost::tie(pos, inserted) = vertex_map.insert(std::make_pair(stoi(*tok_it), Vertex()));
    if (inserted) {
      u = boost::add_vertex(g);
      pos->second = u;
    } else {
      u = pos->second;
    }

    tok_it++;
 
    boost::tie(pos, inserted) = vertex_map.insert(std::make_pair(stoi(*tok_it), Vertex()));
    if (inserted) {
      v = boost::add_vertex(g);
      pos->second = v;
    } else {
      v = pos->second;
    }

    // add edge between u and v
    boost::graph_traits<Graph>::edge_descriptor e;
    boost::tie(e, inserted) = boost::add_edge(u, v, g);

    // add the weight to the edge using a weight property map
    if (inserted) {
      tok_it++;
      weight_map[e] = stod(*tok_it);
    }
  }
}

I defined a small function to print out the edges as well:

template<typename Graph, typename WeightMap>
void printEdges(const Graph& g, WeightMap w) {
  typename typedef boost::graph_traits<Graph>::edge_iterator EdgeIterator;
  EdgeIterator it, end;
  for (boost::tie(it, end) = boost::edges(g); it != end; ++it) {
    std::cout << boost::source(*it, g) << "  " << " --( " << w[*it] << " )--> " << "  " << 
    boost::target(*it, g) << std::endl; 
  }
}

When passing the newly created graph to the function above I would expect to see as console output a formatted version of the data in the text file, for instance 4 --(0.35)--> 5 instead of 4 5 0.35 however I end up getting the following:

8   --( 0.35 )-->   9
8   --( 0.37 )-->   10
9   --( 0.28 )-->   10
11   --( 0.16 )-->   10
12   --( 0.32 )-->   9
11   --( 0.38 )-->   8
13   --( 0.17 )-->   14
12   --( 0.19 )-->   10
11   --( 0.26 )-->   13
12   --( 0.36 )-->   13
12   --( 0.29 )-->   14
13   --( 0.34 )-->   10
15   --( 0.4 )-->   13
14   --( 0.52 )-->   15
15   --( 0.58 )-->   11
15   --( 0.93 )-->   8

If I print out the unordered_map vertex_map I get:

With this I confirm that the key to my unordered_map is being added correctly, however the vertex_descriptors u,v returned by boost::add_vertex(g); are not the same as these indexes. I know that it is possible to set properties on the vertex with an additional named parameter in the call to add_vertex but I find that the documentation for this is quite poor on the BGL website.

Is it possible to set a custom vertex_descriptor and how?
How does BGL come up with what to set the vertex descriptor. For instance why is the vertex_descriptor of the first element added 8 and not 0 in the code above?
Should one even try to work with the vertex_descriptor, or is it better to work with the vertex_index?

Solution

The main problem is likely that you used add_vertex after already pre-sizing the graph to num_vertices².

Is it possible to set a custom vertex_descriptor and how?

No. The vertex descriptor is an implementation detail that is indirectly governed by your model's container selector template arguments.

Should one even try to work with the vertex_descriptor, or is it better to work with the vertex_index?

They serve different purposes. Many algorithms require a vertex index for efficiency, but it doesn't need to be built into the graph model. In practice, some graph models have a "natural" vertex index when using vecS. This doesn't mean the indices are meaningful at the application level, though you might be able to use them as is in some cases.

But since you're here to learn BGL, let's dive a little deeper!

Simplify

Let's simplify the entire code³:

Live On Coliru

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <fstream>
#include <iostream>

struct NL_t {
    friend std::istream& operator>>(std::istream& is, NL_t) { return is.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); }
} inline constexpr NL;

using Graph  = boost::adjacency_list<              //
    boost::vecS, boost::vecS, boost::undirectedS, //
    boost::no_property,                           //
    boost::property<boost::edge_weight_t, double>>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

Graph read_ewg(std::string fname) {
    std::ifstream datafile;
    datafile.exceptions(std::ios::failbit);
    datafile.open(fname);

    size_t nv = 0, ne = 0;
    datafile >> nv >> NL >> ne >> NL;

    Graph g(nv);
    Vertex u, v;
    double w;
    while (ne-- && datafile >> u >> v >> w >> NL)
        /*auto [e, inserted] =*/add_edge(u, v, w, g);

    assert(nv == num_vertices(g));
    return g;
}

int main() {
    auto g = read_ewg("tiny-ewg.txt");
    // auto weight_map = get(boost::edge_weight, g);

    print_graph(g);
}

Printing

0 <--> 7 4 2 6 
1 <--> 5 7 2 3 
2 <--> 3 0 1 7 6 
3 <--> 2 1 6 
4 <--> 5 7 0 6 
5 <--> 4 7 1 
6 <--> 2 3 0 4 
7 <--> 4 5 0 1 2

BONUS

Let's also simplify printEdges (slightly changing output to no longer imply directed edges):

template <typename Graph, typename WeightMap> //
void printEdges(Graph const& g, WeightMap w) {
    for (auto e : make_iterator_range(edges(g)))
        std::cout << e << " w:" << w[e] << "\n";
}

Prints Live On Coliru

(4,5) w:0.35
(4,7) w:0.37
(5,7) w:0.28
(0,7) w:0.16
(1,5) w:0.32
(0,4) w:0.38
(2,3) w:0.17
(1,7) w:0.19
(0,2) w:0.26
(1,2) w:0.36
(1,3) w:0.29
(2,7) w:0.34
(6,2) w:0.4
(3,6) w:0.52
(6,0) w:0.58
(6,4) w:0.93

Note that using vecS vertex container selector implies a built-in vertex index which is equal to the vertex descriptor. Therefore:

We removed vertex_index_t property which can only confuse with the built-in index
Since add_edge will automatically grow the vertex "space" we donot have add_vertex at all,
Instead we just add edges as we want them, immediately setting the weight property as well

Now the result is exactly what you expect.

What If Everything's Not One Happy Song?

The only reason this is the case is that the vertices are already mapped to the range [0..num_vertices(g)) in the source. If you renamed 4 to 14 the first edge to read 14 15 0.35 you'd see¹:

0 <--> 7 14 2 6
1 <--> 5 7 2 3
2 <--> 3 0 1 7 6
3 <--> 2 1 6
4 <-->
5 <--> 14 7 1
6 <--> 2 3 0 14
7 <--> 14 5 0 1 2
8 <-->
9 <-->
10 <-->
11 <-->
12 <-->
13 <-->
14 <--> 5 7 0 6

Not really an issue if you only focus on the edges, like printEdges:

(14,5) w:0.35
(14,7) w:0.37
(5,7) w:0.28
(0,7) w:0.16
(1,5) w:0.32
(0,14) w:0.38
(2,3) w:0.17
(1,7) w:0.19
(0,2) w:0.26
(1,2) w:0.36
(1,3) w:0.29
(2,7) w:0.34
(6,2) w:0.4
(3,6) w:0.52
(6,0) w:0.58
(6,14) w:0.93

However, you might see a problem when the source graph looks like

2
1
1236748 845723489 5.05

The (re)allocation of the vertex container causes OOM on my machine. That's after 30s of high CPU. (So, be careful before trying that on your machine!)

Even more general, what if the input is

8
16
Barcelona Athens 0.35
Amsterdam Barcelona 0.37
London Barcelona 0.28
Paris Barcelona 0.16
Berlin London 0.32
Paris Amsterdam 0.38
Rome Madrid 0.17
Berlin Barcelona 0.19
Paris Rome 0.26
Berlin Rome 0.36
Berlin Madrid 0.29
Rome Barcelona 0.34
Athens Rome 0.40
Madrid Athens 0.52
Athens Paris 0.58
Athens Amsterdam 0.93

Names

Well, glad you asked. I'd use bundled properties to express the logical ids (which will never become vertex descriptors let alone vertex index for BGL):

struct VertexProps {
    std::string id;
};

struct EdgeProps {
    double weight;
};

using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, VertexProps, EdgeProps>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

Now the naive edit will almost do what we need:

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <fstream>
#include <iostream>

template <typename Graph, typename WeightMap> //
void printEdges(Graph const& g, WeightMap w) {
    for (auto e : make_iterator_range(edges(g)))
        std::cout << e << " w:" << w[e] << "\n";
}

struct NL_t {
    friend std::istream& operator>>(std::istream& is, NL_t) { return is.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); }
} inline constexpr NL;

using Id = std::string;
struct VertexProps {
    Id id;
};

struct EdgeProps {
    double weight;
};

using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, VertexProps, EdgeProps>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

Graph read_ewg(std::string fname) {
    std::ifstream datafile;
    datafile.exceptions(std::ios::failbit);
    datafile.open(fname);

    size_t nv = 0, ne = 0;
    datafile >> nv >> NL >> ne >> NL;

    Graph g;
    Id u, v;
    double w;
    while (ne-- && datafile >> u >> v >> w >> NL)
        add_edge(add_vertex({u}, g), add_vertex({v}, g), {w}, g);

    assert(nv == num_vertices(g));
    return g;
}

int main() {
    auto g = read_ewg("tiny-ewg.txt");
    auto weight_map = get(&EdgeProps::weight, g);
    print_graph(g, get(&VertexProps::id, g));
    printEdges(g, weight_map);
}

Printing

Athens <--> Barcelona 
Barcelona <--> Athens 
Barcelona <--> Amsterdam 
Amsterdam <--> Barcelona 
Barcelona <--> London 
London <--> Barcelona 
Barcelona <--> Paris 
Paris <--> Barcelona 
London <--> Berlin 
Berlin <--> London 
Amsterdam <--> Paris 
Paris <--> Amsterdam 
Madrid <--> Rome 
Rome <--> Madrid 
Barcelona <--> Berlin 
Berlin <--> Barcelona 
Rome <--> Paris 
Paris <--> Rome 
Rome <--> Berlin 
Berlin <--> Rome 
Madrid <--> Berlin 
Berlin <--> Madrid 
Barcelona <--> Rome 
Rome <--> Barcelona 
Rome <--> Athens 
Athens <--> Rome 
Athens <--> Madrid 
Madrid <--> Athens 
Paris <--> Athens 
Athens <--> Paris 
Amsterdam <--> Athens 
Athens <--> Amsterdam

And

(1,0) w:0.35
(3,2) w:0.37
(5,4) w:0.28
(7,6) w:0.16
(9,8) w:0.32
(11,10) w:0.38
(13,12) w:0.17
(15,14) w:0.19
(17,16) w:0.26
(19,18) w:0.36
(21,20) w:0.29
(23,22) w:0.34
(25,24) w:0.4
(27,26) w:0.52
(29,28) w:0.58
(31,30) w:0.93

Deduplicating

Obviously the problem is the duplication of vertices with the same id.

You can go about that just like you did with a map:

std::map<Id, Vertex> vmap;

auto ensure_vertex = [&](Id const& id) {
    auto it = vmap.find(id);
    if (it == vmap.end())
        it = vmap.emplace(id, add_vertex(VertexProps{id}, g)).first;
    return it->second;
};

while (ne-- && datafile >> u >> v >> w >> NL)
    add_edge(ensure_vertex(u), ensure_vertex(v), {w}, g);

Printing Live On Coliru

Athens <--> Barcelona Rome Madrid Paris Amsterdam 
Barcelona <--> Athens Amsterdam London Paris Berlin Rome 
Amsterdam <--> Barcelona Paris Athens 
London <--> Barcelona Berlin 
Paris <--> Barcelona Amsterdam Rome Athens 
Berlin <--> London Barcelona Rome Madrid 
Madrid <--> Rome Berlin Athens 
Rome <--> Madrid Paris Berlin Barcelona Athens

And the original printEdges like

(1,0) w:0.35
(2,1) w:0.37
(3,1) w:0.28
...
(7,1) w:0.34
...
(0,2) w:0.93

Internal Name Properties!

But I've come to use the named vertices as the more user-friendly way of adding a "name" to vertices:

template <> struct boost::graph::internal_vertex_name<VertexProps> {
    struct type {
        using result_type = Id;
        Id const& operator()(VertexProps const& vp) const { return vp.id; }
    };
};

This tells BGL that your vertex propery bundle contains a "name" and how to get it. If you also tell BGL how to construct a new bundle from an Id:

template <> struct boost::graph::internal_vertex_constructor<VertexProps> {
    struct type {
        VertexProps operator()(Id id) const { return {std::move(id)}; }
    };
};

Now you can write the original "naive" code again:

Id u, v;
double w;
while (ne-- && datafile >> u >> v >> w >> NL)
    add_edge(u, v, {w}, g);

In effect BGL keeps the map internally and does exactly the same checks to see whether a vertex already exists or not:

Live On Coliru

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <fstream>
#include <iostream>

template <typename Graph, typename WeightMap> //
void printEdges(Graph const& g, WeightMap w) {
    for (auto e : make_iterator_range(edges(g)))
        std::cout << e << " w:" << w[e] << "\n";
}

struct NL_t {
    friend std::istream& operator>>(std::istream& is, NL_t) { return is.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); }
} inline constexpr NL;

using Id = std::string;
struct VertexProps {
    Id id;
    VertexProps(Id id = {}) : id(std::move(id)) {}
};

struct EdgeProps {
    double weight;
};

template <> struct boost::graph::internal_vertex_name<VertexProps> {
    struct type {
        using result_type = Id;
        Id const& operator()(VertexProps const& vp) const { return vp.id; }
    };
};

template <> struct boost::graph::internal_vertex_constructor<VertexProps> {
    struct type {
        VertexProps operator()(Id id) const { return {std::move(id)}; }
    };
};

using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, VertexProps, EdgeProps>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

Graph read_ewg(std::string fname) {
    std::ifstream datafile;
    datafile.exceptions(std::ios::failbit);
    datafile.open(fname);

    size_t nv = 0, ne = 0;
    datafile >> nv >> NL >> ne >> NL;

    Graph g;

    Id u, v;
    double w;
    while (ne-- && datafile >> u >> v >> w >> NL)
        add_edge(u, v, {w}, g);

    assert(nv == num_vertices(g));
    return g;
}

int main() {
    auto g = read_ewg("tiny-ewg.txt");
    auto weight_map = get(&EdgeProps::weight, g);
    print_graph(g, get(&VertexProps::id, g));
    printEdges(g, weight_map);
}

Printing the expected

Athens <--> Barcelona Rome Madrid Paris Amsterdam 
Barcelona <--> Athens Amsterdam London Paris Berlin Rome 
Amsterdam <--> Barcelona Paris Athens 
London <--> Barcelona Berlin 
Paris <--> Barcelona Amsterdam Rome Athens 
Berlin <--> London Barcelona Rome Madrid 
Madrid <--> Rome Berlin Athens 
Rome <--> Madrid Paris Berlin Barcelona Athens 
(1,0) w:0.35
(2,1) w:0.37
(3,1) w:0.28
(4,1) w:0.16
(5,3) w:0.32
(4,2) w:0.38
(7,6) w:0.17
(5,1) w:0.19
(4,7) w:0.26
(5,7) w:0.36
(5,6) w:0.29
(7,1) w:0.34
(0,7) w:0.4
(6,0) w:0.52
(0,4) w:0.58
(0,2) w:0.93

¹ of course quietly assuming you removed/silenced the debug assertion assert(nv == num_vertices(g));

² this immediately answers "why is the vertex_descriptor of the first element added 8 and not 0"

³ the only non-standard bit is my NL istream manipulator which is just a shorthand to discard the rest of the line. It's actually not required given the sample input because newlines are just whitespace. For fun: look at this slightly more complicated input https://coliru.stacked-crooked.com/a/d86be97eced3b2a1 (input from here)