r ggplot2 geom-bar

Don't lose N/A values in ggplot2, and delete outliers


I would like to draw a histogram/bar chart of my data, including the N/A values. At the moment, when I use ggplot2, all non-finite values are removed automatically. Is there any way to count how many of them there are and show that on the plot? I would like to solve this problem for all classes (integer, numeric, character, date, etc.).

ggplot(tmp, aes(x = x, y=(..count..)/sum(..count..))) + 
  geom_bar(fill="#003399") + 
  labs(title = "...", x = "Variable Values", y = "Frequency")

I also have a second question. How can I automatically remove the lowest 5% and the highest 5% of values (outliers) from the ggplot panel? That would make the histograms much easier to read.


Solution

  • Maybe this is what you are looking for:

    library(ggplot2)  # after_stat() needs ggplot2 >= 3.3.0
    
    # Generate a 'toy dataset' with some missing values in y
    set.seed(1234)
    n <- 100
    tmp <- data.frame(x = sample(LETTERS[1:5], n, replace = TRUE),
                      y = rnorm(n))
    tmp$y[sample(1:n, 10)] <- NA
    summary(tmp)
    
    # Flag the rows where y is missing
    tmp$miss <- "No missing"
    tmp$miss[is.na(tmp$y)] <- "Missing"
    
    # Stacked bars: share of all rows per x, split by whether y is missing
    ggplot(tmp, aes(x = x, y = after_stat(count) / sum(after_stat(count)))) +
      geom_bar(aes(group = miss, fill = miss), position = "stack") +
      labs(title = "...", x = "Variable Values", y = "Frequency")
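The plot above flags rows where y is missing while plotting x. If the NAs are in the plotted variable itself and it is numeric, one possible approach is to bin the values yourself and add "Missing" as an extra category, so the bar chart counts it like any other bin. This is only a sketch: the vector `x`, the choice of about 10 bins, and the "Missing" label are assumptions made for illustration, and date columns would need their own breaks. For character or factor columns, `geom_bar()` should already draw an NA bar by default, as far as I know.

```r
library(ggplot2)

# Hypothetical numeric variable: 90 finite values and 10 NAs
set.seed(1234)
x <- c(rnorm(90), rep(NA, 10))

# Bin the finite values; NA (and Inf/-Inf) fall outside every bin and become NA
breaks <- pretty(x[is.finite(x)], n = 10)
binned <- cut(x, breaks = breaks, include.lowest = TRUE)

# Turn NA into an explicit "Missing" category, placed after the real bins
bin_labels <- as.character(binned)
bin_labels[is.na(bin_labels)] <- "Missing"
df <- data.frame(bin = factor(bin_labels, levels = c(levels(binned), "Missing")))

# Relative frequency per bin, with "Missing" drawn as its own bar
p <- ggplot(df, aes(x = bin, y = after_stat(count) / sum(after_stat(count)))) +
  geom_bar(fill = "#003399") +
  scale_x_discrete(drop = FALSE) +  # keep empty bins on the axis
  labs(x = "Variable Values", y = "Frequency") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p)
```

Because the bins are built before plotting, the x axis is discrete here, so this is a bar chart that imitates a histogram rather than a true `geom_histogram()`.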
    

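For the second question (dropping the lowest 5% and the highest 5%), a minimal sketch is to compute the 5th and 95th percentiles with `quantile()` and filter the data before plotting. The data frame below is made up for illustration, and the 30 bins are an arbitrary choice.

```r
library(ggplot2)

# Hypothetical data: mostly standard normal, plus a few extreme values and NAs
set.seed(1234)
tmp <- data.frame(y = c(rnorm(95), 50, 60, -40, NA, NA))

# 5th and 95th percentiles, ignoring missing values
limits <- quantile(tmp$y, probs = c(0.05, 0.95), na.rm = TRUE)

# Keep only rows inside the limits; rows with NA in y are dropped here too
trimmed <- subset(tmp, !is.na(y) & y >= limits[1] & y <= limits[2])

p <- ggplot(trimmed, aes(x = y)) +
  geom_histogram(bins = 30, fill = "#003399") +
  labs(x = "Variable Values", y = "Count")
print(p)
```

Note that this removes the rows from the data, so counts and frequencies are computed on the trimmed set only. If you just want to zoom in without discarding anything, adding `coord_cartesian(xlim = unname(limits))` to a plot of the full data is an alternative worth trying.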