
Non-smooth density plot using ggplot2


When I try to plot the density of some numerical data using either geom_density() or stat_density(), I get a non-smooth curve. Changing adjust does not fix this.

In the plot I used facet_zoom(), but coord_cartesian(xlim = c(...)) also produces this non-smooth curve. Pretty weird in my opinion. Any suggestions as to what's going on?

https://drive.google.com/file/d/1PjQp7XkY5G21NoIo8y8lyeaXKvuvrqVk/view?usp=sharing

Edit: I have uploaded 50000 rows of the original data. To reproduce the plot (not using ggforce), use the code:

library(ggplot2)

data <- read.table("rep.txt")

( 
  ggplot(data, aes(x = x))
  + geom_density(adjust = 1, fill = "grey")
  + coord_cartesian(xlim = c(-50000,50000))
  + labs(x = "", y = "")
  + theme_bw()
)


Solution

  • I ran your code but could not reproduce the exact image from your question. Is your concern the lack of smoothness at the very tip of the geom_density() curve? You can try other arguments such as kernel and bw, but the sheer number of zeroes in your data will make a smooth curve hard to achieve unless you substantially increase adjust.

    library(tidyverse)
    options(scipen = 999999)
    
    # https://stackoverflow.com/questions/33135060/read-csv-file-hosted-on-google-drive
    id <- "1PjQp7XkY5G21NoIo8y8lyeaXKvuvrqVk" # google file ID
    data <- read.table(sprintf("https://docs.google.com/uc?id=%s&export=download", id)) %>%
      rownames_to_column(var = "var")
    
    ggplot(data, aes(x = x)) + 
      geom_density(
        adjust = 10, 
        fill = "grey", 
        kernel = "cosine",
        bw = "nrd0") + 
      coord_cartesian(xlim = c(-50000,50000)) + 
      labs(x = "", y = "") + theme_bw()
    


    # I didn't export images for these, but they showcase how many zeroes you have
    ggplot(data, aes(x = x)) + 
      geom_histogram(bins = 1000) +
      coord_cartesian(xlim = c(0,50000)) + 
      labs(x = "", y = "") + theme_bw()
    
    ggplot(data, aes(x = x)) + 
      geom_freqpoly(bins = 1000) +
      coord_cartesian(xlim = c(0,50000)) + 
      labs(x = "", y = "") + theme_bw()
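
One further possibility, not confirmed against the uploaded data: by default the density is evaluated on a grid of only 512 equally spaced points spanning the whole data range. If the data contain a few extreme values, very few of those points land inside a narrow coord_cartesian() window, and the zoomed curve looks piecewise linear however adjust is set. The sketch below illustrates the effect with base R's density() on simulated data. The simulated values and the helper points_in_window() are hypothetical and only stand in for the real file.

```r
# Simulated stand-in for the real data (an assumption, not the uploaded file):
# many exact zeros, a normal bulk, and two far outliers that stretch the range.
set.seed(1)
x <- c(rep(0, 20000), rnorm(29998, sd = 10000), -5e6, 5e6)

# Hypothetical helper: count how many grid points of the density estimate
# fall inside the zoom window used in the plots above.
points_in_window <- function(x, n, window = c(-50000, 50000)) {
  d <- density(x, n = n)  # n = number of equally spaced evaluation points
  sum(d$x >= window[1] & d$x <= window[2])
}

coarse <- points_in_window(x, n = 512)   # the default grid size
fine   <- points_in_window(x, n = 2^16)  # a much denser grid

cat("grid points inside window, n = 512: ", coarse, "\n")
cat("grid points inside window, n = 2^16:", fine, "\n")
```

If this is what is happening with the real data, passing a larger n to the layer, for example geom_density(n = 2^16, fill = "grey"), should give a visibly smoother curve in the zoomed region, since stat_density() accepts an n argument with a default of 512.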