Tags: r, ggplot2, confusion-matrix

Plot a confusion matrix as a stacked bar chart with ggplot2


I have a confusion matrix that I want to plot as a stacked bar chart with ggplot2.

# confusion matrix
conf <- structure(c(3015, 672, 874, 3217, 0.224736436101826, 0.1727950629982
), .Dim = 2:3, .Dimnames = list(c("FALSE", "TRUE"), c("FALSE", 
"TRUE", "class.error")))

conf
#       FALSE TRUE class.error
# FALSE  3015  874   0.2247364
# TRUE    672 3217   0.1727951

I tried reshaping it using tidyr:

conf <- as.data.frame(rf$confusion)  # rf$confusion is the matrix shown above
conf$actual <- row.names(conf)
conf <- tidyr::pivot_longer(conf, c(`FALSE`, `TRUE`))
conf$prediction <- conf$name

and then plotting using:

ggplot(conf, aes(x = actual, fill = prediction)) + geom_bar(position = "fill")

Actual output: [image not included]


But there are two issues:

  1. The height of each bar segment should follow the value column of my reshaped confusion matrix
  2. The colors should be green for the correctly predicted part and red for the incorrectly predicted part

How can I solve this?


Any help, including simpler approaches, is appreciated.


Solution

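  • By default, geom_bar() uses stat = "count": it counts the rows in each TRUE/FALSE group instead of using your value column. Your long data has exactly one row per group, so every bar comes out 1:1. Use geom_col() or geom_bar(stat = "identity") instead, and map value to y.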

    Try something like this:

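    library(ggplot2)
    library(tidyr)
    library(magrittr)  # provides %>%

    # keep only the two count columns; data.frame() renames them to FALSE. and TRUE.
    g <- data.frame(conf[, 1:2]) %>%
      tibble::rownames_to_column("observed") %>%
      pivot_longer(-observed, names_to = "predicted") %>%
      ggplot() + geom_col(aes(x = observed, y = value, fill = predicted))
    print(g)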
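
    The difference between counting rows and using the stored values can be checked without plotting. This is a minimal base R sketch, assuming the counts from the question; the names long, rows_per_cell and value_per_cell are only illustrative:

```r
# Rebuild the confusion matrix from the question (counts only, no class.error).
conf <- matrix(c(3015, 672, 874, 3217), nrow = 2,
               dimnames = list(c("FALSE", "TRUE"), c("FALSE", "TRUE")))

# Long format: one row per (observed, predicted) cell.
long <- as.data.frame(as.table(conf), stringsAsFactors = FALSE)
names(long) <- c("observed", "predicted", "value")

# Roughly what stat = "count" works with: the number of rows per cell.
rows_per_cell <- table(long$observed, long$predicted)

# Roughly what geom_col() / stat = "identity" works with: the stored values.
value_per_cell <- xtabs(value ~ observed + predicted, data = long)

print(rows_per_cell)
print(value_per_cell)
```

    Every entry of rows_per_cell is 1, which is why the counted bars split evenly, while value_per_cell reproduces the original matrix.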
    


    To get actual red and green fills:

    # set the colors
    # note: data.frame() renamed the columns to FALSE. and TRUE., so use those names here
    COLS <- c("TRUE." = "green", "FALSE." = "red")
    g + scale_fill_manual(values = COLS)
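
    The manual scale above colors by predicted class, so a predicted TRUE is green even when it is wrong. If green should mean "correctly predicted", as the question asks, one option is to derive a correctness flag and fill by that. This is a sketch rather than the only way to do it; the column name result and the base R reshaping via as.table() are choices made here, not part of the original code:

```r
library(ggplot2)

# Rebuild the confusion matrix from the question (counts only, no class.error).
conf <- matrix(c(3015, 672, 874, 3217), nrow = 2,
               dimnames = list(c("FALSE", "TRUE"), c("FALSE", "TRUE")))

# Long format: one row per (observed, predicted) cell.
long <- as.data.frame(as.table(conf), stringsAsFactors = FALSE)
names(long) <- c("observed", "predicted", "value")

# A cell is a correct prediction when the row and column labels match.
long$result <- ifelse(long$observed == long$predicted, "correct", "incorrect")

# Stack the values per observed class and color by correctness.
p <- ggplot(long, aes(x = observed, y = value, fill = result)) +
  geom_col() +
  scale_fill_manual(values = c(correct = "green", incorrect = "red"))
print(p)
```

    Because the reshaping never goes through data.frame() on the matrix, the labels stay "FALSE" and "TRUE" and the comparison works directly. Use geom_col(position = "fill") instead if the bars should show proportions rather than counts.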
    
