I am using networkx to calculate eigenvector centrality. The problem is that it takes way too long (already running for about 6 hours). Is there a faster way to get the results?
There are around 200,000 nodes in the graph and 60,000,000 edges.
Looking at the source code shows that networkx.algorithms.centrality.eigenvector
uses the power method to find the leading eigenvector.
If you stick to networkx, use this, as Joel noted:
centrality = nx.eigenvector_centrality_numpy(G)
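A minimal sketch on a small random graph (standing in for your real 200,000-node G):

```python
import networkx as nx

# Toy graph standing in for the real 200,000-node network
G = nx.gnp_random_graph(100, 0.1, seed=42)

# eigenvector_centrality_numpy solves for the leading eigenvector
# directly via numpy/scipy instead of running the iterative power method
centrality = nx.eigenvector_centrality_numpy(G)

# centrality is a dict mapping node -> score
top_node = max(centrality, key=centrality.get)
```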
Alternatively:
You can use scipy.sparse.linalg.eigs, which uses ARPACK, and request only 1 eigenvector to be returned.
Toy example:
import networkx as nx
import scipy.sparse as sparse
# Build the adjacency matrix as a sparse array
# (200,000 by 200,000 for your graph)
X = nx.to_scipy_sparse_array(G, dtype=float)
# Note: k=1 requests only the leading eigenvector,
# and which='LR' selects the eigenvalue with the largest real part.
vals, vecs = sparse.linalg.eigs(X, k=1, which='LR')
In any case, 200,000 by 200,000 is big, and depending on the sparsity and the nature of the matrix the algorithm may still take a while. You will also need plenty of CPU and RAM.
Additional tip for networkx.algorithms.centrality.eigenvector:
If you stick with networkx, try relaxing the tolerance. The signature is:
eigenvector_centrality(G, max_iter=100, tol=1e-06, nstart=None, weight=None)
Try setting tol=1e-04 or even tol=1e-03.
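As a rough sketch of the trade-off on a toy graph (timings and accuracy on the real graph will differ):

```python
import networkx as nx

G = nx.gnp_random_graph(200, 0.05, seed=1)

# A tighter tolerance means more power iterations; a looser one means
# fewer, trading a little accuracy for speed.
tight = nx.eigenvector_centrality(G, max_iter=1000, tol=1e-06)
loose = nx.eigenvector_centrality(G, max_iter=1000, tol=1e-03)

# The resulting scores are usually very close even with the looser tolerance
max_diff = max(abs(tight[n] - loose[n]) for n in G)
```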