
Changing the labels in histogram when using lapply() or walk() to produce histogram


I am trying to create a grid of several histograms using lapply() or walk() (the latter from the purrr package).

This is a fabricated version of my data set including only 5 of 11 columns and 3 of about 100 rows:

pid gender Rand BP GH VT
1 F D 5 7 5
2 M A 6 10 5
3 F D 0 30 5

This is the code I'm using, and where I would like to add something that changes the x-label depending on the value of i.

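    library(purrr)  # provides walk()

    x <- datf  # data frame
    u <- x[, 4:11]
    par(mfrow = c(2, 4))
    walk(x[, 4:11], function(i) {
      # histogram for group "D", then group "A" drawn on top of it
      hist(i[x$rand == "D"],
           col = rgb(0, 0, 1, 0.2),
           main = "Histogram of score",
           ylim = c(0, 100))
      hist(i[x$rand == "A"],
           col = rgb(1, 0, 0, 0.2),
           add = TRUE)
    })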

I originally used lapply(), but switched to walk() to hide the printed output in the R Markdown document.

I have tried xlab = paste(colnames(i)) and xlab = paste(colnames(u)) after reading two similar questions: "Using lapply on a dataframe to create histograms with labels" and "Labels for histogram, when using 'lapply'".

xlab = paste(colnames(u)) comes closest, but the x-label on each histogram is not the matching column name; instead, all of the column names are passed at once (see the linked image).



However, when I create a similar histogram with only one set of data, i.e. without the second call hist(i[x$rand=="A"], col=rgb(1,0,0,0.2), add=TRUE), it works fine:

    mapply(hist, as.data.frame(x[,4:11]), main=colnames(x[,4:11]), xlab="score")
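
The mapply() call works because it hands each column to hist() together with its own name, whereas inside walk() the function only receives the bare vector of values. The same pairing can be sketched for the two-group case by looping over the column names instead of the columns. This is only a minimal sketch: `overlay_hist` is a hypothetical helper, the stand-in data merely mimics the shape of the example dataset with two score columns, and the fixed bins of width 10 are an assumption.

```r
# Stand-in data (assumed shape): a group column plus two score columns.
set.seed(1)
datf <- data.frame(
  rand = sample(c("D", "A"), 150, replace = TRUE),
  BP   = sample(0:100, 150, replace = TRUE),
  GH   = sample(0:100, 150, replace = TRUE)
)

# Hypothetical helper: overlay both groups and use `label` as the x-label.
overlay_hist <- function(values, group, label) {
  breaks <- seq(0, 100, by = 10)  # shared bins so the two groups line up
  hd <- hist(values[group == "D"], breaks = breaks, plot = FALSE)
  ha <- hist(values[group == "A"], breaks = breaks, plot = FALSE)
  # Size the y-axis for the taller of the two histograms.
  plot(hd, col = rgb(0, 0, 1, 0.2),
       main = "Histogram of score", xlab = label,
       ylim = c(0, max(hd$counts, ha$counts)))
  plot(ha, col = rgb(1, 0, 0, 0.2), add = TRUE)
  invisible(list(D = hd, A = ha))
}

score_cols <- c("BP", "GH")
par(mfrow = c(1, 2))
# Iterate over the column *names*, so each name is available as a label.
for (name in score_cols) {
  overlay_hist(datf[[name]], datf$rand, name)
}
```

With purrr loaded, the loop could presumably be replaced by iwalk(datf[score_cols], function(values, name) overlay_hist(values, datf$rand, name)), since iwalk() passes each element together with its name and prints nothing.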

I created an example dataset that has the same form as mine; see the code below.

    library("dplyr")
    datf <- data.frame(cbind(sample(0:100,size=150, replace=T), 
                             sample(0:100,size=150, replace=T), 
                             sample(0:100,size=150, replace=T),
                             sample(0:100,size=150, replace=T),
                             sample(0:100,size=150, replace=T),
                             sample(0:100,size=150, replace=T),
                             sample(0:100,size=150,replace=T),
                             sample(0:100,size=150, replace=T)))
    datf$rand <- sample(c("D","A"),150, replace=T, prob=c(0.45,0.45))
    datf$pid <- sample(1:150, replace=F, size=150)
    datf$gender <- sample(c("F","M"),150, replace=T, prob=c(0.35,0.65))
    datf <- datf%>%
      rename(
        BP=X1,
        GH=X2,
        VT=X3,
        MH=X4, 
        SF=X5, 
        PF=X6,
        RP=X7,
        RE=X8
      )
    datf <- datf[, c("pid","rand","gender", "BP", "GH","VT","MH", "PF" , "RP", "RE","SF")]

And the dput() output for the first four rows:

structure(list(pid = c(108L, 54L, 75L, 2L), rand = c("A", "A", 
"A", "A"), gender = c("M", "M", "F", "M"), BP = c(70L, 13L, 27L, 
66L), GH = c(2L, 68L, 61L, 19L), VT = c(57L, 68L, 30L, 0L), MH = c(65L, 
69L, 21L, 47L), PF = c(100L, 38L, 70L, 60L), RP = c(77L, 27L, 
59L, 38L), RE = c(66L, 9L, 68L, 48L), SF = c(30L, 74L, 64L, 20L
)), row.names = c(NA, 4L), class = "data.frame")

This is how I would like the output to look (see the linked image).

Would it be easier to use ggplot2? If so, how?

Thank you in advance!


Solution

  • Maybe something like this is closer to what you are looking for?

    library(tidyverse)
    
    datf %>%
      pivot_longer(cols = BP:SF) %>%
      ggplot() + aes(value, fill = rand) + 
      geom_histogram() + facet_wrap(~name) 
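
By default geom_histogram() stacks the two groups on top of each other. If overlapping, semi-transparent bars like those in the base R attempt are wanted, a variation along these lines might work. It is only a sketch: the stand-in data uses two of the score columns, and the bin width, transparency, and the choice to move the facet strips below the panels (so that each column name reads like an x-label) are all assumptions.

```r
library(ggplot2)
library(tidyr)

# Stand-in data (assumed shape): a group column plus two score columns.
set.seed(1)
datf <- data.frame(
  rand = sample(c("D", "A"), 150, replace = TRUE),
  BP   = sample(0:100, 150, replace = TRUE),
  GH   = sample(0:100, 150, replace = TRUE)
)

# One row per (observation, score column) pair.
long <- pivot_longer(datf, cols = c(BP, GH),
                     names_to = "name", values_to = "value")

p <- ggplot(long, aes(x = value, fill = rand)) +
  # position = "identity" overlays the groups instead of stacking them
  geom_histogram(position = "identity", alpha = 0.4,
                 binwidth = 10, boundary = 0) +
  # put each column name under its panel, where an x-label would sit
  facet_wrap(~ name, strip.position = "bottom") +
  labs(x = NULL, fill = "Group") +
  theme(strip.placement = "outside")

print(p)
```

Because the column name travels with the data in the `name` column, no per-panel label has to be passed by hand.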
    
