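Related to my earlier question (Using a generic type with an adjacency_list), I'm now testing a simple program that loads a graph on process 0 and then has each process count the edges and compute the clustering coefficient for its own piece of the graph.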
Here's the code:
#include "Common.h"
#include "GraphFileReader.h"
#include "GraphNeighbors.h"
#include <boost/graph/metis.hpp>
#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <time.h>
int main(int argc, char *argv[]) {
    // Start the MPI environment
    boost::mpi::environment env(argc, argv);
    boost::mpi::communicator world;

    // Create the graph
    GraphFileReader *graphFileReader;
    undirectedAdjacencyList graph;
    if (process_id(graph.process_group()) == 0) {
        // Load the graph's path
        graphFileReader = new GraphFileReader(argv[1]);
        // Read the graph file and add the vertices and edges
        graphFileReader->loadGraph(graph);
    }

    // Wait until process 0 has finished loading the graph
    world.barrier();
    synchronize(graph.process_group());

    GraphNeighbors graphNeighbors;
    // Now each machine should process its own piece of the graph
    graphNeighbors.countEdges(graph);
    graphNeighbors.clusteringCoefficient(graph);

    // Wait for the other processes before finishing
    world.barrier();
    synchronize(graph.process_group());
    std::cout << "\n process: " << world.rank() << " finishing\n" << std::endl;
    return 0;
}
And here's the result:
graphs: /usr/include/boost/graph/distributed/adjacency_list.hpp:2679:
std::pair<typename boost::adjacency_list<OutEdgeListS, boost::distributedS<ProcessGroup,
InVertexListS, InDistribution>, DirectedS, VertexProperty, EdgeProperty, GraphProperty,
EdgeListS>::out_edge_iterator, typename boost::adjacency_list<OutEdgeListS,
boost::distributedS<ProcessGroup, InVertexListS, InDistribution>, DirectedS,
VertexProperty, EdgeProperty, GraphProperty, EdgeListS>::out_edge_iterator>
boost::out_edges(typename boost::adjacency_list<OutEdgeListS,
boost::distributedS<ProcessGroup, InVertexListS, InDistribution>, DirectedS,
VertexProperty, EdgeProperty, GraphProperty, EdgeListS>::vertex_descriptor, const
boost::adjacency_list<OutEdgeListS, boost::distributedS<ProcessGroup, InVertexListS,
InDistribution>, DirectedS, VertexProperty, EdgeProperty, GraphProperty, EdgeListS>&) [with
OutEdgeListS = boost::vecS, ProcessGroup = boost::graph::distributed::mpi_process_group,
InVertexListS = boost::vecS, InDistribution = boost::defaultS, DirectedS =
boost::undirectedS, VertexProperty = Node, EdgeProperty = boost::no_property, GraphProperty
= boost::no_property, EdgeListS = boost::listS]: Assertion `v.owner == g.processor()' failed.
_________________________________________________________________
I'm process: 0
I'm process: 1
Number of edges: 4
0.37694 milliseconds
Number of edges: 2
0.16284 milliseconds
rank 1 in job 1 compute-1-4_49342 caused collective abort of all ranks
exit status of rank 1: killed by signal 6
_________________________________________________________________
Epilogue Args:
Job ID: 138573.tucan
User ID: ***
Group ID: ***
Job Name: mpiGraphs.job
Resource List: 5746
Queue Name: ncpus=1,neednodes=2:ppn=2,nodes=2:ppn=2
Account String: cput=00:00:00,mem=420kb,vmem=13444kb,walltime=00:00:02
Date: Thu Mar 1 14:28:19 CET 2012
_________________________________________________________________
On the other hand, the execution with only one machine works perfectly:
I'm process: 0
Number of edges: 6
8.46696 milliseconds
The network average clustering coefficient is: 0.53333
0.12708 milliseconds
process: 0 finishing
My tutor and I thought this could be because one machine finishes while the other is still performing its operations, so we added the synchronize and barrier calls (I actually don't know the difference between the two, so I tested a few combinations, all with the same result).
If you need the rest of the code (Common.h, GraphFileReader or GraphNeighbors), I can upload it and post the link here to avoid a huge post.
Since you are thinking about synchronization errors, I will simplify the error message that you are getting:
graphs: (boost)adjacency_list.hpp:2679: boost::out_edges(vertex_descriptor v, adjacency_list& g): Assertion `v.owner == g.processor()' failed.
exit status of rank 1: killed by signal 6
Signal 6 is triggered by abort(), which in turn is triggered by the assertion failure above.
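If you want to convince yourself of the abort() to signal 6 link, here is a minimal POSIX-only sketch (it assumes fork and waitpid are available; the function name signal_raised_by_abort is made up for this illustration). It calls std::abort() in a child process, which is what a failed assert() ends up doing, and reports which signal killed the child:

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs std::abort() in a child process and returns the
// number of the signal that terminated it. Returns -1 if the child could not
// be created or did not die from a signal.
int signal_raised_by_abort() {
    pid_t child = fork();
    if (child < 0) {
        return -1;  // fork failed
    }
    if (child == 0) {
        std::abort();  // a failed assert() ends up calling this
    }
    int status = 0;
    if (waitpid(child, &status, 0) != child) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The returned value should compare equal to SIGABRT, which is numbered 6 on Linux. That matches the "killed by signal 6" line in your job output.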
I don't know anything about this graph library, but according to adjacency_list.hpp it seems that your processor 1 is calling out_edges and passing a vertex v that belongs to processor 0.
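To make that concrete, here is a toy model in plain C++. It is not the Boost API: ToyVertex, ToyLocalGraph, toy_out_edges and scan_neighbors are all names invented for this sketch. It also assumes, since GraphNeighbors is not shown, that the failure comes from walking the neighbours of a local vertex and then asking for the out-edges of a neighbour that lives on another rank.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for a distributed vertex descriptor: the rank that owns the
// vertex and its index in that rank's local storage.
struct ToyVertex {
    int owner;
    std::size_t local;
};

// Toy stand-in for the piece of the graph stored on one rank.
struct ToyLocalGraph {
    int rank;                                       // rank of this process
    std::vector<std::vector<ToyVertex>> adjacency;  // neighbours of each local vertex
};

// Mirrors the failing check: out-edges can only be read for owned vertices.
// Throws instead of asserting so the failure is easy to observe.
const std::vector<ToyVertex>& toy_out_edges(const ToyVertex& v,
                                            const ToyLocalGraph& g) {
    if (v.owner != g.rank) {
        throw std::logic_error("vertex is owned by another rank");
    }
    return g.adjacency.at(v.local);
}

struct NeighborScan {
    std::size_t local_edges;     // out-edges found on locally owned neighbours
    std::size_t remote_skipped;  // neighbours owned by another rank
};

// Visits the neighbours of a local vertex, but only looks at the out-edges of
// neighbours that this rank owns; remote neighbours are counted and skipped.
NeighborScan scan_neighbors(const ToyVertex& v, const ToyLocalGraph& g) {
    NeighborScan result{0, 0};
    for (const ToyVertex& n : toy_out_edges(v, g)) {
        if (n.owner == g.rank) {
            result.local_edges += toy_out_edges(n, g).size();
        } else {
            ++result.remote_skipped;
        }
    }
    return result;
}
```

In the toy, checking n.owner before calling toy_out_edges is what avoids the failure. Whether skipping remote neighbours is acceptable for a clustering coefficient is a separate question; in the real library the information about remote vertices would have to be obtained some other way.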