
Sankey Diagram Blank in R

I am attempting to produce a Sankey plot in R following the example from but I am struggling with troubleshooting my input data. I am not getting any error messages, only a blank plot.

First, my code and data are available at the following link:


library(networkD3)

Key <- read.csv(file = "Cat_Key.csv", header = TRUE)
Sankey_data <- read.csv(file = "Sankey_data.csv", header = TRUE)

#Convert to numeric
Sankey_data2 <- as.data.frame(lapply(Sankey_data, as.numeric))

#Remove any NA values
Sankey_data3 <- Sankey_data2[!is.na(Sankey_data2$Value), ]

#Convert Key to numeric
Key$Num <- sapply(Key$Num, as.numeric)
#drop the extra Key column
Key1 <- Key[, -ncol(Key)]

#Generate the plot
sankeyNetwork(Links = Sankey_data3, Nodes = Key1, Source = "Source",
              Target = "Sink", Value = "Value", NodeID = "Name", 
              iterations = 32)

I first checked that my indexing was correct and that it was zero-indexed. When I tried adjusting it, I got an error that it was not zero-indexed, so I think I had it right the first time. (This is the only error message I've received, and I don't get it with the code above.) I modified my input data so that my Source, Sink, and Value columns are all numeric. I made the numbers in Key numeric. I removed any duplicated rows. I removed any rows with an NA in the "Value" column. I tried different numbers of iterations (up to 32). I used a dummy dataset, which worked. I've tried viewing the plot in a new window and in a Zoom window. I'm really not sure what to try next.

Here's what my data looks like:

> str(Sankey_data3)
'data.frame':   178 obs. of  3 variables:
 $ Source: int  0 1 2 3 4 5 6 7 8 9 ...
 $ Value : num  3 3 2 1 4 1 1 1 2 1 ...
 $ Sink  : int  48 48 48 48 48 48 48 49 50 51 ...

> str(Key)
'data.frame':   166 obs. of  3 variables:
 $ Num     : num  0 1 2 3 4 5 6 7 8 9 ...
 $ Name    : chr  "L-/M+:L+/M-" "L-/M+:L+M+" "L-:L-/M+" "L-:M+" ...
 $ Category: chr  "HisCat SP" "HisCat SP" "HisCat SP" "HisCat SP" ...

I'd appreciate any other suggestions for what to try!


  • Here you go. There was an NA in your source column. This was the exact issue:

    # Check if all Source and Sink values have corresponding Num values in Key
    all(unique(c(Sankey_data3$Source, Sankey_data3$Sink)) %in% Key$Num)

    This returned FALSE, meaning there were Source values (NA) that had no match in Key$Num. It seems that sankeyNetwork does not handle this case and fails silently instead of raising an error.

    Correct Code

    library(networkD3)
    library(dplyr)  # %>% and select()
    library(tidyr)  # drop_na()

    # Keep only the columns sankeyNetwork needs
    Key <- read.csv(file = "Cat_Key.csv", header = TRUE) %>% select(Num, Name)
    # Remove any rows containing NA values
    Sankey_data <- read.csv(file = "Sankey_data.csv", header = TRUE) %>% drop_na()
    # Generate the plot
    sankeyNetwork(Links = Sankey_data, Nodes = Key, Source = "Source",
                  Target = "Sink", Value = "Value", NodeID = "Name",
                  iterations = 0)

    The plot looks odd in the R viewer, so open it in the browser instead!
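
If it helps with tracking down which rows cause the blank plot, here is a minimal base-R sketch of the same check that returns the offending link rows rather than a single TRUE/FALSE. `find_bad_links` is a hypothetical helper, not part of networkD3, and the toy data frames only stand in for the real CSV files. It assumes that Source and Sink are zero-based row positions in the Nodes data frame, which is how sankeyNetwork matches links to nodes.

```r
# Hypothetical helper (not part of networkD3): return the link rows whose
# source or target does not map onto a node.
# Assumption: link ids are zero-based row positions in `nodes`.
find_bad_links <- function(links, nodes, source = "Source", target = "Sink") {
  valid_ids <- seq_len(nrow(nodes)) - 1L    # 0, 1, ..., nrow(nodes) - 1
  src_ok <- links[[source]] %in% valid_ids  # NA is never in valid_ids
  tgt_ok <- links[[target]] %in% valid_ids
  links[!(src_ok & tgt_ok), , drop = FALSE]
}

# Toy data standing in for Cat_Key.csv and Sankey_data.csv
nodes <- data.frame(Num = 0:3, Name = c("A", "B", "C", "D"))
links <- data.frame(
  Source = c(0L, 1L, NA, 2L),
  Value  = c(3, 2, 1, 4),
  Sink   = c(2L, 3L, 3L, 7L)
)

bad <- find_bad_links(links, nodes)
print(bad)  # rows 3 (NA source) and 4 (Sink 7 has no node)

# Keep only the links that can actually be drawn
clean <- links[!(rownames(links) %in% rownames(bad)), ]
```

Running this on the real Sankey_data3 and Key1 before calling sankeyNetwork should show whether any rows still point at a missing node. An empty result means every link has a valid source and target.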
