Tags: r, ggplot2, png, difference

How to get the difference of two images to visually compare complex data visualizations by subtracting ggplot images


I am working with two giant datasets taken under similar circumstances that produce very similar, but not identical, results. I need to visually highlight the differences between them.

I previously built a heatmap helper function that can draw each dataset, but it is too large to post here, so I am posting a sample dataset instead.

I want to take two results of ggplot's geom_raster and subtract one from the other to highlight pixel differences in RStudio. Just some very basic bitmap image manipulation.

The same image-subtraction approach would also be applied to stored images.

To show a sample result, I managed to do it with mutate. But the old datasets are not always available.

The goal of the code below was to store each ggplot in a variable as a PNG, which I think I did, and then subtract one from the other. Since the images are very similar, the result would highlight the differences.

I have no intention of going into advanced image manipulation; I just want to highlight individual differences.

mutate solution

library(ggplot2)
library(dplyr)   # provides mutate() and %>%

df    <- expand.grid( x=0:100, y=LETTERS[1:25] )
df$z  <- sample( 1:20, nrow(df), replace=TRUE )
df$z2 <- df$z
df$z2[ c(1500,2000,200) ] <- 30

p1 <- ggplot(df , aes(x, y, fill = z)) + geom_raster(); p1
p2 <- ggplot(df, aes(x, y, fill = z2)) + geom_raster(); p2

df %>% mutate( zdiff = abs(z2-z) ) %>% 
  ggplot( aes(x, y, fill = zdiff)) + geom_raster()

This produces three plots: p1, p2, and their difference (pdiff).

Difference of two ggplot objects converted to PNG

df    <- expand.grid( x=0:100, y=LETTERS[1:25] )
df$z  <- sample( 1:20, nrow(df), replace=TRUE )
df2   <- df
df2$z[ c(1500,2000,200) ] <- 30

png1 = magick::image_graph(width=480, height=640, res=150)
  p1 <- ggplot(df , aes(x, y, fill = z)) + geom_raster(); p1
dev.off()

png2 = magick::image_graph(width=480, height=640, res=150)
  p2 <- ggplot(df2, aes(x, y, fill = z)) + geom_raster(); p2
dev.off()

diff = png2 - png1
# Error in png2 - png1 : non-numeric argument to binary operator

Second, additional question:

While trying to make sample data: how do you change chunks of values to add noise? I tried the following and couldn't get either to work. It seems possible to read values that way, but not to assign them.
df2[ x == 'C':'G' & y == 20:40 ] <- 30
df2[ df2$y == 'C',]$z[ 20:30 ] <- 30


Solution

  • The legends are different, so it might be clearer to crop off the right 50 pixels for a comparison:

    library(magick)
    image_composite(png1 %>% magick::image_crop("430x640-50+0"), 
                    png2 %>% magick::image_crop("430x640-50+0"), 
                    operator = "subtract")
    


    You could also combine that with the first image again to see the difference in context:

    image_composite(
      png1,
      image_composite(png1 %>% magick::image_crop("430x640-50+0"), 
                      png2 %>% magick::image_crop("430x640-50+0"), 
                      operator = "subtract"),
      operator = "blend")
    

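The second question (changing a chunk of values to add noise) is not covered above. What follows is a minimal base-R sketch, not a definitive answer. It assumes the intended block is rows C to G (the `y` column) and columns 20 to 40 (the `x` column), which swaps the column names used in the first attempt. The `set.seed()` call and the `block` variable are additions for illustration only. Two things are worth noting about the first attempt: `'C':'G'` fails because `:` needs numeric endpoints, and `==` against a multi-element vector compares element by element with recycling instead of testing membership, so `%in%` or range comparisons are the usual replacement.

```r
# Rebuild the sample data from the question.
set.seed(1)  # assumption: added only so the example is reproducible
df   <- expand.grid(x = 0:100, y = LETTERS[1:25])
df$z <- sample(1:20, nrow(df), replace = TRUE)
df2  <- df

# One logical value per row, TRUE for the block to overwrite:
# y in C..G (membership test) and x between 20 and 40 (range test).
block <- df2$y %in% LETTERS[3:7] & df2$x >= 20 & df2$x <= 40

# Assign through the column, indexed by the logical vector.
df2$z[block] <- 30

sum(block)  # number of cells that were overwritten
```

Indexing the column (`df2$z[block]`) rather than the whole data frame keeps the assignment to a single vector, and the same logical vector can be reused to check which cells were changed.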