python pandas plot data-analysis

How to create a network plot / analysis from a pandas DataFrame in Python


I am quite new to data analysis in Python, so this is more of a request for advice than a specific problem. I have some data that is grouped into categories:

print(df):

Week   Sales   1        2        3
1      15      Apple    Orange   Pear
1      5       Banana   Apple    Orange
1      7       Banana   Orange   Pear
1      9       Apple    Apple    Pear
2      10      Banana   Orange   Apple
2      6       Apple    Orange   Pear
2      1       Banana   Orange   Apple
2      12      Apple    Orange   Apple

I was hoping to visualise the connection between fruits and sales in some sort of network plot, such as:

[image: example network plot]

(picture from google search: https://www.google.com/url?sa=i&url=https%3A%2F%2Fgraph-tool.skewed.de%2F&psig=AOvVaw2-gzUVZ3DzzdcigJYnV3bH&ust=1590452104665000&source=images&cd=vfe&ved=0CAMQjB1qFwoTCMiZr_ndzekCFQAAAAAdAAAAABAD)

Does anyone know where to start with this, and whether it can be done?

Thanks very much!


Solution

  • I suggest you begin by reading about NetworkX (https://networkx.github.io/), a Python library for network analysis, to understand how to model and visualize a network/graph. There are great tutorials on the NetworkX page. After that, I suggest visiting this page (https://github.com/briatte/awesome-network-analysis), which lists a range of quality tools for network analysis in Python.

    There are a variety of great tools for this. I recently used this approach in a network analysis of the US flight network: (https://jarredparrettdickinson.github.io/applied/empirical/analysis/of/data/Network-Analysis-of-Flights/)
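
As a starting point, here is a minimal sketch of how the sample data could be turned into a NetworkX graph. It rests on several assumptions that are not in the question: the fruit columns are named "1", "2" and "3", the trailing periods in the printed values are just formatting, and two fruits are "connected" when they appear in the same row, with the edge weight being the summed Sales of those rows. The helper names build_fruit_graph and draw_fruit_graph are made up for illustration. Other ways of modelling the data may suit your question better.

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Sample data from the question. Column names "1", "2", "3" and the
# cleaned values (no trailing periods) are assumptions.
df = pd.DataFrame({
    "Week":  [1, 1, 1, 1, 2, 2, 2, 2],
    "Sales": [15, 5, 7, 9, 10, 6, 1, 12],
    "1": ["Apple", "Banana", "Banana", "Apple", "Banana", "Apple", "Banana", "Apple"],
    "2": ["Orange", "Apple", "Orange", "Apple", "Orange", "Orange", "Orange", "Orange"],
    "3": ["Pear", "Orange", "Pear", "Pear", "Apple", "Pear", "Apple", "Apple"],
})


def build_fruit_graph(frame, fruit_cols=("1", "2", "3"), weight_col="Sales"):
    """Link fruits that share a row; edge weight = total sales of those rows.

    This co-occurrence model is an assumption, not the only possible one.
    """
    G = nx.Graph()
    columns = [weight_col, *fruit_cols]
    for sales, *fruits in frame[columns].itertuples(index=False, name=None):
        # set() drops repeats such as "Apple, Apple", so no self-loops appear.
        for a, b in combinations(sorted(set(fruits)), 2):
            if G.has_edge(a, b):
                G[a][b]["weight"] += sales
            else:
                G.add_edge(a, b, weight=sales)
    return G


def draw_fruit_graph(G, path="fruit_network.png"):
    """Save a simple plot where thicker edges mean more sales."""
    # Imported here so the graph can be built without matplotlib installed.
    import matplotlib.pyplot as plt

    pos = nx.spring_layout(G, seed=42)  # fixed seed for a repeatable layout
    widths = [data["weight"] / 10 for _, _, data in G.edges(data=True)]
    nx.draw_networkx(G, pos, node_color="lightblue", node_size=2000, width=widths)
    nx.draw_networkx_edge_labels(
        G, pos, edge_labels=nx.get_edge_attributes(G, "weight")
    )
    plt.axis("off")
    plt.savefig(path)
    plt.close()


G = build_fruit_graph(df)

# List the connections, strongest first.
for a, b, weight in sorted(G.edges(data="weight"), key=lambda e: -e[2]):
    print(a, b, weight)
```

Calling draw_fruit_graph(G) afterwards writes the plot to fruit_network.png. To compare weeks, you could build one graph per week by passing df[df["Week"] == 1] and df[df["Week"] == 2] to the same function.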