r sankey-diagram networkd3

Only show selected labels in Sankey networkd3 diagram


The following image shows one of my Sankey diagrams:

[Image: Sankey diagram with a lot of labels]

As you can see, a lot of labels are shown. I know I can disable the labels completely, but I would like to know whether I can display only specific labels (chosen by node ID, by the value of the path traversals, or by some other criterion). Has anyone tried this and could give me a hint?


Solution

  • The node labels come from the NodeID column of the Nodes data frame that you pass to the function. Any of those IDs/labels can be an empty string "", which effectively hides the label of the associated node(s).

    Using the help file's example, this will print a label with each node...

    library(networkD3)

    # Load the energy example data used in the help file
    URL <- 'https://cdn.rawgit.com/christophergandrud/networkD3/master/JSONdata/energy.json'
    energy <- jsonlite::fromJSON(URL)
    
    sankeyNetwork(Links = energy$links, Nodes = energy$nodes, Source = 'source',
                  Target = 'target', Value = 'value', NodeID = 'name',
                  units = 'TWh', fontSize = 12, nodeWidth = 30)
    

    Whereas this will not print a label for the three 'Oil' nodes at the top left...

    library(networkD3)

    URL <- 'https://cdn.rawgit.com/christophergandrud/networkD3/master/JSONdata/energy.json'
    energy <- jsonlite::fromJSON(URL)
    
    # Blank the labels of rows 37 to 39 (the 'Oil' nodes); the rows stay in place
    energy$nodes$name[37:39] <- ''
    
    sankeyNetwork(Links = energy$links, Nodes = energy$nodes, Source = 'source',
                  Target = 'target', Value = 'value', NodeID = 'name',
                  units = 'TWh', fontSize = 12, nodeWidth = 30)
    

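    Be careful, however, when the labels/NodeIDs do not match the Source and Target ids in the Links data frame. When that is the case (as in the examples above, where the links refer to nodes by zero-based row position), the sankeyNetwork() function relies on the order of the rows in the data frames to associate them, so blank a label in place rather than removing or reordering rows.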
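
The question also asks about choosing labels by node ID or by value. A minimal sketch of both, assuming the links refer to nodes by zero-based row position as in the energy data: `keep_labels()` and `blank_small_labels()` are hypothetical helpers, not part of networkD3, and the toy `nodes`/`links` data frames are made up for illustration. Both helpers blank labels in place, so the row order that sankeyNetwork() relies on is unchanged.

```r
# Hypothetical helpers (not part of networkD3). Both blank labels in place,
# so the zero-based row positions used by the Links data frame stay valid.

# Keep only the labels listed in `keep`; blank all others.
keep_labels <- function(nodes, keep, id_col = "name") {
  labels <- as.character(nodes[[id_col]])
  labels[!labels %in% keep] <- ""
  nodes[[id_col]] <- labels
  nodes
}

# Blank the labels of nodes whose larger total flow (incoming or outgoing)
# is below `min_value`.
blank_small_labels <- function(nodes, links, min_value, id_col = "name",
                               source_col = "source", target_col = "target",
                               value_col = "value") {
  n <- nrow(nodes)
  # Sum link values per node index; nodes with no links keep a total of 0.
  total_by_node <- function(idx) {
    totals <- numeric(n)
    sums <- tapply(links[[value_col]], idx, sum)
    totals[as.integer(names(sums)) + 1] <- sums  # zero-based -> one-based
    totals
  }
  flow <- pmax(total_by_node(links[[source_col]]),
               total_by_node(links[[target_col]]))
  labels <- as.character(nodes[[id_col]])
  labels[flow < min_value] <- ""
  nodes[[id_col]] <- labels
  nodes
}

# Toy data (made up): four nodes, three links with zero-based indices.
nodes <- data.frame(name = c("A", "B", "C", "D"), stringsAsFactors = FALSE)
links <- data.frame(source = c(0, 0, 1), target = c(1, 2, 3),
                    value = c(10, 1, 10))

by_name <- keep_labels(nodes, keep = c("A", "D"))           # labels: A, "", "", D
by_flow <- blank_small_labels(nodes, links, min_value = 5)  # labels: A, B, "", D
```

Either result can then be passed as the Nodes argument, for example `sankeyNetwork(Links = links, Nodes = by_flow, Source = 'source', Target = 'target', Value = 'value', NodeID = 'name')`. Using the larger of the incoming and outgoing totals as the node's "value" is a design choice made for this sketch; a different threshold rule can be swapped in without touching the row order.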