
Can’t get NetworkX to read my weighted network (+ haven’t found a way to import node attributes from a file)


I don’t have experience with Python/NetworkX. Everything I try is a dead-end. I have Python 2.7 installed on Windows 8. (Took forever to install NetworkX.)

A. I cannot get NetworkX to read my weighted network

I have a text file with the edges and weights, like this:

1 2 2
2 3 1
2 4 1
4 5 4    (…etc.)

I named the file test.edgelist (exactly as I’ve seen in many examples) and then used this code to read it:

import networkx as nx
fh=open("test.edgelist", 'rb')
G=nx.read_weighted_edgelist(fh, nodetype=int)
fh.close()

I get the following error message:

'module' object has no attribute 'read_weighted_edgelist'

(Note: the unweighted version, with just the first two columns and the same code using read_edgelist instead of read_weighted_edgelist, works just fine.)

And by using this alternative code:

G = nx.read_edgelist("test.edgelist", nodetype=int, data=(("weight",float),))

I get the following error message:

read_edgelist() got an unexpected keyword argument 'data'

B. Can't find a way to read some node attributes from a file.

The text file will be something like:

Label Sex Country Colour
1 F GB green
2 M DE red
3 M IT blue (…etc.)

I found this, which I think is the only thing remotely relevant to what I’m looking for:

Reading nodes with pos attribute from file in networkx

Although csv format is not the problem in my case, I took a shot and installed pandas, but all I get is errors:

from pandas.io.api import *

from pandas.io.gbq import read_gbq

import pkg_resources
ImportError: No module named pkg_resources

Solution

  • A.

    If your data is in a text file, you need to open it in text mode rather than binary mode.

    import networkx as nx
    fh=open("test.edgelist", 'r')
    # ------------------------|----- note 'r' not 'rb'
    G=nx.read_weighted_edgelist(fh, nodetype=int)
    fh.close()
    

    With the sample data that you provided, both methods work fine for me. It is particularly surprising that the second command does not work, and makes me wonder whether you have overwritten a built-in (see e.g. How to stop myself overwriting Python functions when coding?).
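If neither reader is available in your installation, the same graph can be built by parsing the file by hand. The following is only a minimal sketch: `parse_weighted_edges` is a hypothetical helper (not part of NetworkX), and it assumes every data line holds exactly three whitespace-separated fields (source, target, weight). The sample data is inlined through `io.StringIO` so the snippet runs on its own; in practice you would pass an open file handle instead.

```python
import io


def parse_weighted_edges(lines, nodetype=int, weighttype=float):
    """Return a list of (source, target, weight) tuples.

    Hypothetical helper, not a NetworkX function. Lines that do not
    contain exactly three whitespace-separated fields are skipped.
    """
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or malformed line
        source, target, weight = parts
        edges.append((nodetype(source), nodetype(target), weighttype(weight)))
    return edges


# Sample data from the question; replace with open("test.edgelist", "r").
sample = io.StringIO(u"1 2 2\n\n2 3 1\n\n2 4 1\n\n4 5 4\n")
edges = parse_weighted_edges(sample)
print(edges)  # [(1, 2, 2.0), (2, 3, 1.0), (2, 4, 1.0), (4, 5, 4.0)]

# With NetworkX imported as nx, the tuples can then be loaded with:
#   G = nx.Graph()
#   G.add_weighted_edges_from(edges)
```

`add_weighted_edges_from` stores the third element of each tuple under the `weight` edge attribute, which is the same key that `read_weighted_edgelist` uses.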

    I am using networkx version 1.6. (You can check your own version by typing nx.__version__ in an interactive shell.)
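To check whether the name `networkx` really points at the installed library, it can help to print where the module was loaded from along with its version. This is a rough diagnostic sketch, and `describe_module` is a hypothetical helper. One possible cause it can reveal (an assumption, not something confirmed by the question) is a local file such as `networkx.py` in the working directory shadowing the real package.

```python
import importlib


def describe_module(name, attribute):
    """Report where a module was loaded from and whether it has an attribute.

    Hypothetical diagnostic helper: a "file" path inside your own project
    directory, rather than site-packages, suggests the real package is
    being shadowed by a local file of the same name.
    """
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has_attribute": hasattr(module, attribute),
    }


# Demonstrated on a standard-library module so the snippet runs anywhere.
info = describe_module("json", "loads")
print(info["has_attribute"])  # True

# For the problem in the question you would call:
#   describe_module("networkx", "read_weighted_edgelist")
```

If `has_attribute` comes back `False` while `file` points into site-packages, an outdated NetworkX release is the more likely explanation.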

    B.

    Pandas is quite flexible in reading data: it doesn't have to be comma-separated (even with the read_csv function). For instance, assuming your second labelled dataset is in a file "data.txt",

    import pandas as pd
    df = pd.read_csv('data.txt', sep=r'\s+')  # raw-string regex: split on runs of whitespace
    
    In [41]: print df
       Label   Sex Country Colour
    0      1     F      GB  green
    1      2     M      DE    red
    2      3     M      IT   blue
    3    NaN  None    None   None
    

    With this data, you can construct a graph whose nodes take on the properties:

    # a new empty graph object
    G2 = nx.DiGraph()
    # create nodes with properties from dataframe (two examples shown, but any number
    # of properties can be entered into the attributes dictionary of each node)
    for idx, row in df.iterrows():
        G2.add_node(row['Label'], colour=row['Colour'], sex=row['Sex'])
    
    # construct a list of colors from the nodes so we can render with that property
    colours = [ d['colour'] for n, d in G2.nodes(data=True)]
    
    nx.draw_networkx(G2, node_color=colours)
    

    I'm not entirely sure why you need pkg_resources (it doesn't seem to be used in the answer that you linked), but see No module named pkg_resources for how to solve the error.
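If pandas will not import at all, a table this small can also be read with plain Python. The sketch below is one possible approach, not the method from the linked answer: `read_node_table` is a hypothetical helper, and it assumes the first non-blank line is a header, fields are separated by whitespace, and the `Label` column holds integer node ids. The sample is inlined via `io.StringIO`; in practice you would pass an open file handle.

```python
import io


def read_node_table(lines, key="Label"):
    """Map each node label to a dict of its remaining columns.

    Hypothetical helper. Rows whose field count differs from the
    header's are skipped.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, records = rows[0], rows[1:]
    table = {}
    for record in records:
        if len(record) != len(header):
            continue  # malformed row
        attrs = dict(zip(header, record))
        label = int(attrs.pop(key))
        table[label] = attrs
    return table


# Sample data from the question; replace with open("data.txt", "r").
sample = io.StringIO(
    u"Label Sex Country Colour\n1 F GB green\n2 M DE red\n3 M IT blue\n"
)
node_attrs = read_node_table(sample)
print(node_attrs[2]["Colour"])  # red

# With NetworkX imported as nx, the attributes can then be attached with:
#   G2 = nx.DiGraph()
#   for label, attrs in node_attrs.items():
#       G2.add_node(label, **attrs)
```

Because the attributes are passed as keyword arguments, each column name becomes a node attribute key, so `G2.nodes(data=True)` would yield the `Colour` value under `'Colour'` rather than the lowercase `'colour'` used in the pandas example.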