I have a .txt file with 46 lines. Each line stands for a node in a network, followed by a lot of attributes:
Example Name; 03.01.194, Luzern, (LU), Test, Attribute, Other Attribute,
Kasdasd Alex; 22.12.1957, in Blabla, (ZH), Bürgerorte, Oeschgen (AG), Zivilstand,
I'm not sure how to get networkx to read this as a node list. Here are some things I thought might work, but currently don't:
import pandas as pd
import networkx as nx
nodes = pd.read_csv('final.csv', header=None)
nodes
The problem with the code above is that the attributes are separated by commas, but the node name is separated from its attributes by a semicolon.
In another attempt, I wanted to open the file and add nodes line by line, but got stuck on the G.add_node() command:
G = nx.Graph()
with open('final.txt') as infile:
    for line in infile:
        G.add_node()
Is one of these two approaches the way to go, or should I try something different?
Also, for further analysis: does networkx offer a possibility to compare the attributes of nodes and, if they match, create a weighted edge?
You can achieve this by reading the file with ';' as the delimiter, so that the first element of each row is the node key and the second is the attribute string. Then split that string on ',' and add the resulting list as a node attribute. I copied the sample you provided into a file 'test.txt' and executed the following code:
import csv
import networkx as nx

G = nx.DiGraph()
with open("test.txt") as f:
    # ';' separates the node name from its attribute string
    for row in csv.reader(f, delimiter=';'):
        attributes = row[1].split(',')
        G.add_node(row[0], attr=attributes)
Then I printed the nodes and their attributes as follows:
for n in G.nodes():
    print('Node: ' + str(n))
    print('Attributes: ' + str(G.nodes[n]['attr']))
Result:
Node: Kasdasd Alex
Attributes: [' 22.12.1957', ' in Blabla', ' (ZH)', ' B\xc3\xbcrgerorte', ' Oeschgen (AG)', ' Zivilstand', '']
Node: Example Name
Attributes: [' 03.01.194', ' Luzern', ' (LU)', ' Test', ' Attribute', ' Other Attribute', ' ']
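As the result shows, the split leaves leading spaces on each attribute and an empty entry after the trailing comma. If that gets in the way, the line can be cleaned up while parsing. The following is only a minimal sketch: parse_node_line is a hypothetical helper name of my own, not part of networkx or the csv module, and it assumes the node name never contains a ';'.

```python
def parse_node_line(line):
    """Split 'Name; attr1, attr2, ...' into (name, [attr1, attr2, ...])."""
    # Everything before the first ';' is the node name, the rest are attributes.
    name, _, rest = line.partition(';')
    # Strip surrounding whitespace and drop entries that are empty after stripping,
    # such as the one left behind by the trailing comma.
    attributes = [a.strip() for a in rest.split(',') if a.strip()]
    return name.strip(), attributes


# One of the sample lines from the question.
name, attributes = parse_node_line(
    "Example Name; 03.01.194, Luzern, (LU), Test, Attribute, Other Attribute,\n"
)
print(name)
print(attributes)
```

The returned pair can then be passed on as G.add_node(name, attr=attributes) inside the loop over the file.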
As for your last question: networkx offers such capabilities and more. Have a look at the networkx tutorial.
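One simple way to build such weighted edges is to compare every pair of nodes and use the number of shared attributes as the weight. This is a sketch under assumptions, not the only way to do it: the node names 'A', 'B', 'C' and their attribute lists are made up for illustration, and "match" is taken to mean that the exact same string appears in both lists.

```python
from itertools import combinations

import networkx as nx

G = nx.Graph()
# Hypothetical, already cleaned attribute lists (illustration only).
G.add_node('A', attr=['Luzern', '(LU)', 'Test'])
G.add_node('B', attr=['Luzern', '(ZH)', 'Test'])
G.add_node('C', attr=['Bern', '(BE)'])

# Look at every unordered pair of nodes exactly once.
for u, v in combinations(list(G.nodes), 2):
    shared = set(G.nodes[u]['attr']) & set(G.nodes[v]['attr'])
    if shared:
        # The edge weight is the number of attributes the two nodes have in common.
        G.add_edge(u, v, weight=len(shared))

print(list(G.edges(data=True)))
```

With 46 nodes this pairwise comparison is cheap (about a thousand pairs). A different notion of similarity only requires changing how the weight is computed.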