apache-spark, graph, spark-graphx

Does Spark GraphX have visualization like Gephi?


I am new to the graph world and have been assigned to work on graph processing. I already know Apache Spark, so I thought of using its GraphX library to process a large graph. Then I came across Gephi, which provides a nice GUI for manipulating graphs.

Does GraphX have such tools, or is it mainly a parallel graph processing library? Can I import JSON graph data exported from Gephi into GraphX?


Solution

  • In addition, you can try GraphLab: https://dato.com/products/create/open_source.html

    It directly supports Spark RDDs: https://dato.com/learn/userguide/data_formats_and_sources/spark_integration.html

    Not much work is required after that:

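    from pyspark import SparkContext
    import graphlab as gl

    # Connect to the cluster (YARN client mode here).
    sc = SparkContext('yarn-client')

    # Load the raw data as an RDD and convert it to a GraphLab SFrame.
    t = sc.textFile("hdfs://some/large/file")
    sf = gl.SFrame.from_rdd(t)

    # do stuff...

    # Convert the SFrame back to a Spark RDD.
    out_rdd = sf.to_rdd(sc)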
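
On the JSON part of the question, here is a minimal sketch of one possible route. It assumes the exported file has a top-level "nodes" array (objects with an "id") and an "edges" array (objects with "source" and "target"). That layout is an assumption, so check it against your actual export. The helper name json_graph_to_edge_list is made up for illustration. It rewrites the graph as "srcId dstId" lines with integer ids, which is the plain edge-list format that GraphX's GraphLoader.edgeListFile reads on the Scala side.

```python
import json


def json_graph_to_edge_list(json_text):
    """Turn a node/edge JSON graph into 'src dst' lines with integer ids.

    Assumed (hypothetical) layout:
      {"nodes": [{"id": ...}, ...],
       "edges": [{"source": ..., "target": ...}, ...]}
    """
    graph = json.loads(json_text)

    # Map every original node id to an integer, in order of appearance,
    # because a GraphX edge-list file needs numeric vertex ids.
    id_map = {}
    for node in graph.get("nodes", []):
        id_map.setdefault(str(node["id"]), len(id_map))

    lines = []
    for edge in graph.get("edges", []):
        # Endpoints missing from "nodes" still get an id here.
        src = id_map.setdefault(str(edge["source"]), len(id_map))
        dst = id_map.setdefault(str(edge["target"]), len(id_map))
        lines.append("%d %d" % (src, dst))
    return id_map, lines


# Tiny made-up graph, only to show the shape of the input and output.
sample = json.dumps({
    "nodes": [{"id": "a"}, {"id": "b"}, {"id": "c"}],
    "edges": [{"source": "a", "target": "b"},
              {"source": "b", "target": "c"}],
})
id_map, edge_lines = json_graph_to_edge_list(sample)
print("\n".join(edge_lines))
```

Writing edge_lines to a file (on HDFS, for instance) gives you something GraphX can load. Keep id_map around if you need to translate the integer ids back to the original node labels afterwards.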