
sankeyNetwork Function - Help required to set Node and Link data frame


I'm trying to create a Sankey diagram for the following dataset (categorical variables only), but I can't work out how to set up the sankeyNetwork parameters (Source, Target, Value). My code is below.
Could you please help me understand what is wrong here?

node_names <- unique(c(as.character(sk_dataset$Race), as.character(sk_dataset$Gender)))
nodes <- data.frame(name=node_names)


links <- data.frame(source=match(sk_dataset$Gender, node_names) -1,
                   target = match(sk_dataset$Race, node_names) -1,
                   value=c(2,3, 2, 3, 1, 3))

sankeyNetwork(Links=links, Nodes=nodes,Source="source",
             Target="target", Value="value") 


Example of what I want to achieve: (image)


Solution

    library(data.table)
    library(networkD3)
    
    sk_dataset <- fread('sof/DT.csv')
    sk_dataset
    

    sk_dataset looks something like this (I took the first 15 rows from the image):

    https://i.sstatic.net/cSBRl.png

    Create a frequency table by gender and race.

    t1 <- sk_dataset[,.N,by = c('gender','race')]
    

    The frequency table t1 looks like this:

    gender  race      N
    male    black     3
    male    white     7
    male    hispanic  5
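
If data.table is not available, roughly the same count table can be built in base R. This is only a minimal sketch on made-up rows: the `toy` data frame and the `freq` name are illustrative and are not the asker's data. `table()` counts each gender/race pair, and pairs that never occur are dropped so that every remaining row corresponds to a real link. Note that the row order may differ from what data.table returns.

```r
# Hypothetical stand-in for sk_dataset (made-up rows, for illustration only)
toy <- data.frame(
  gender = c("male", "male", "female", "female", "male"),
  race   = c("black", "white", "white", "hispanic", "black"),
  stringsAsFactors = FALSE
)

# Count rows per gender/race pair (base R alternative to .N with by = ...)
freq <- as.data.frame(table(gender = toy$gender, race = toy$race),
                      stringsAsFactors = FALSE)

# table() also lists pairs that never occur; drop those zero counts
freq <- freq[freq$Freq > 0, ]

# Rename the count column to N so it matches the data.table output
names(freq)[names(freq) == "Freq"] <- "N"
freq
```
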
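
    # Node names: every distinct race and gender value
    node_names <- unique(c(as.character(sk_dataset$race), as.character(sk_dataset$gender)))
    nodes <- data.frame(name = node_names)

    # One link per gender/race pair, taken from the frequency table t1.
    # networkD3 expects zero-based node indices, hence the "- 1" after match().
    links <- data.frame(source = match(t1$gender, node_names) - 1,
                        target = match(t1$race, node_names) - 1,
                        value  = t1$N)

    sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
                  Target = "target", Value = "value")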
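
To see what `match(...) - 1` produces, here is a small sketch with a hand-written count table. The `counts` values are made up for illustration and the sketch uses base R only. Each `source` and `target` is the zero-based position of the label in `node_names`, which is the indexing networkD3 expects for links.

```r
# Hypothetical count table (made-up numbers, for illustration only)
counts <- data.frame(
  gender = c("male", "male", "female"),
  race   = c("black", "white", "white"),
  N      = c(3, 7, 5),
  stringsAsFactors = FALSE
)

# Distinct labels in order of first appearance: "black" "white" "male" "female"
node_names <- unique(c(counts$race, counts$gender))

# Translate each label into its zero-based position in node_names
links <- data.frame(
  source = match(counts$gender, node_names) - 1,
  target = match(counts$race, node_names) - 1,
  value  = counts$N
)

# Every index must point at an existing node (0 .. number of nodes - 1)
stopifnot(all(c(links$source, links$target) %in% (seq_along(node_names) - 1)))
links
```

Passing `links` together with `data.frame(name = node_names)` to sankeyNetwork, as in the code above, should then draw one flow per row of `counts`.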
    

    For more, see: https://www.r-graph-gallery.com/322-custom-colours-in-sankey-diagram.html