Tags: python, graph, jupyter-notebook, graph-visualization

Graphistry: how to color nodes using values from a column


I have a dataframe with the following columns:

Source      Target      Label_S   Weight
car         airplane    0.5       0.2
car         train       0.5       0.5
car         bike        0.5       0.2
bike        motorbike   1         0.7
bike        car         1         0.2
airplane    car         -1        0.2
train       car         1         0.5
motorbike   car         1         0.7

This is just an example. Label_S is the label associated with the source node. The full dataset has approximately 30,000 nodes and 58,000 edges. I am building the network with Graphistry from a Jupyter Notebook, and the basic plot works fine:

import graphistry
graphistry.register(api=3, protocol="https", server="hub.graphistry.com", username="", password="")   

g = graphistry.bind(source="Source", destination="Target")
g.edges(net).plot(as_files=False)

I would like to assign colors to nodes based on their label, Label_S (dtype=float64). The mapping should be:

-1.0:   red
0.5:    black
1:    yellow
2:    blue

All other values should be orange. If possible, I would also like to color each edge with a gradient between the label colors of its two nodes.
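The mapping described above can be sketched as a plain dictionary lookup with a fallback. This is only an illustration of the intended behavior, not Graphistry code; the names `LABEL_COLORS`, `DEFAULT_COLOR`, and `color_for` are hypothetical.

```python
# Label-to-color mapping from the question; the names here are illustrative only.
LABEL_COLORS = {-1.0: "red", 0.5: "black", 1.0: "yellow", 2.0: "blue"}
DEFAULT_COLOR = "orange"


def color_for(label: float) -> str:
    """Return the color for a label, or the default when the label is unmapped."""
    return LABEL_COLORS.get(float(label), DEFAULT_COLOR)


labels = [0.5, 1, -1, 2, 3.7]
colors = [color_for(x) for x in labels]
print(colors)  # → ['black', 'yellow', 'red', 'blue', 'orange']
```

Note that the keys are floats, matching the float64 dtype of Label_S; a lookup with the string '1.0' would miss and fall through to the default.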

Following the online documentation, I first tried to color the nodes as follows, after converting the label values from float64 to strings with astype(str):

g2 = (g
      .nodes(net, 'Source')
      .encode_point_color('Source', categorical_mapping={
          '-1.0': 'red',
          '0.5': 'black',
          '1.0': 'yellow'
      }, default_mapping='orange'))

When I run g2.edges(df).plot(as_files=False), nothing changes: the nodes keep their default colors instead of the ones I specified.

Do you know how to color nodes and edges from a Jupyter Notebook? I am on the free plan.


Solution

  • The problem is that you pass the wrong column to encode_point_color:

    .encode_point_color('Source', categorical_mapping={
    

    You need to pass the column that holds the label values, which I assume is Label_S:

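    import pandas as pd

    # Two sample rows from the question's edge list
    dict_val = [
        {'Source': 'car', 'Target': 'airplane', 'Label_S': 0.5, 'Weight': 0.2},
        {'Source': 'car', 'Target': 'train', 'Label_S': 0.5, 'Weight': 0.5}
    ]
    net = pd.DataFrame(dict_val)

    # g is the binding from the question: graphistry.bind(source="Source", destination="Target")
    g2 = (g
          .nodes(net, 'Source')
          # Color by the label column; keys are floats because Label_S is float64 here
          .encode_point_color('Label_S', categorical_mapping={
              -1.0: 'red',
              0.5: 'black',
              1.0: 'yellow'
          }, default_mapping='orange'))
    g2.edges(net).plot(as_files=False)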
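The snippet above reuses the edge table as the node table, so a node such as car appears once per outgoing edge. A possible refinement, sketched here with pandas only, is to derive a separate node table with one row per node. The column name `node` and the variable names are assumptions for illustration, and the sample rows are a subset of the question's data.

```python
import pandas as pd

# A few edges from the question's table.
net = pd.DataFrame({
    "Source": ["car", "car", "bike", "airplane"],
    "Target": ["airplane", "train", "motorbike", "car"],
    "Label_S": [0.5, 0.5, 1.0, -1.0],
    "Weight": [0.2, 0.5, 0.7, 0.2],
})

# One row per source node with its label (Label_S describes the source).
labelled = (net[["Source", "Label_S"]]
            .drop_duplicates("Source")
            .rename(columns={"Source": "node"}))

# Every node id that appears on either end of an edge.
all_ids = (pd.concat([net["Source"], net["Target"]], ignore_index=True)
           .drop_duplicates()
           .rename("node")
           .to_frame())

# Left join: nodes that never appear as a source end up with a NaN label.
nodes = all_ids.merge(labelled, on="node", how="left")
print(nodes)

# Assumed usage with the binding from the question (not executed here):
# g.nodes(nodes, 'node').encode_point_color('Label_S', ...)
```

With a table like this, `nodes` would be passed to `.nodes(...)` in place of `net`, while `net` is still passed to `.edges(...)`.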