I am trying to create a matrix of several histograms using lapply() or walk() from the purrr package.
This is a fabricated version of my data set, showing only 6 of the 11 columns and 3 of about 100 rows:
pid | gender | Rand | BP | GH | VT |
---|---|---|---|---|---|
1 | F | D | 5 | 7 | 5 |
2 | M | A | 6 | 10 | 5 |
3 | F | D | 0 | 30 | 5 |
This is the code I'm using, and where I would like to add something to change the x-label depending on the i-value.
library(purrr)
x <- datf # dataframe
u <- x[, 4:11] # the score columns
par(mfrow = c(2, 4))
walk(x[, 4:11],
     function(i) {
       hist(i[x$rand == "D"],
            col = rgb(0, 0, 1, 0.2),
            main = "Histogram of score",
            ylim = c(0, 100))
       hist(i[x$rand == "A"],
            col = rgb(1, 0, 0, 0.2),
            add = TRUE)
     })
I originally used lapply() instead of walk(), but switched to walk() to hide the printed output in the R Markdown document.
I have tried xlab = paste(colnames(i)) and xlab = paste(colnames(u)) after reading similar questions: "Using lapply on a dataframe to create histograms with labels" and "Labels for histogram, when using “lapply”".
xlab = paste(colnames(u)) comes closest, but the x-label in each histogram is then a list of all the column names rather than the name of the column being plotted.
Please see the image.
Image
However, when I create a similar plot with only one set of data per histogram, i.e. without hist(i[x$rand=="A"], col=rgb(1,0,0,0.2), add=TRUE), it works fine:
mapply(hist, as.data.frame(x[,4:11]), main=colnames(x[,4:11]), xlab="score")
I created an example dataset that has the same structure as mine; see the code.
library("dplyr")
datf <- data.frame(cbind(sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T),
                         sample(0:100, size=150, replace=T)))
datf$rand <- sample(c("D","A"), 150, replace=T, prob=c(0.45,0.45))
datf$pid <- sample(1:150, replace=F, size=150)
datf$gender <- sample(c("F","M"), 150, replace=T, prob=c(0.35,0.65))
datf <- datf %>%
  rename(
    BP=X1,
    GH=X2,
    VT=X3,
    MH=X4,
    SF=X5,
    PF=X6,
    RP=X7,
    RE=X8
  )
datf <- datf[, c("pid","rand","gender", "BP", "GH","VT","MH", "PF" , "RP", "RE","SF")]
And the dput() output:
structure(list(pid = c(108L, 54L, 75L, 2L), rand = c("A", "A",
"A", "A"), gender = c("M", "M", "F", "M"), BP = c(70L, 13L, 27L,
66L), GH = c(2L, 68L, 61L, 19L), VT = c(57L, 68L, 30L, 0L), MH = c(65L,
69L, 21L, 47L), PF = c(100L, 38L, 70L, 60L), RP = c(77L, 27L,
59L, 38L), RE = c(66L, 9L, 68L, 48L), SF = c(30L, 74L, 64L, 20L
)), row.names = c(NA, 4L), class = "data.frame")
This is how I would like the output to look: See image here
Would it be easier to use ggplot? If so, how?
Thank you in advance!
Maybe something like this is closer to what you are looking for?
library(tidyverse)
datf %>%
  pivot_longer(cols = BP:SF) %>%
  ggplot() + aes(value, fill = rand) +
  geom_histogram() + facet_wrap(~name)
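If you would rather stay with base graphics, one possible sketch is to swap walk() for purrr's iwalk(), which passes both the column values and the column name to the function, so the name can go straight into xlab. Inside walk(), i is a plain vector without column names, which is why colnames(i) returns NULL while colnames(u) returns all of the names at once. The small data frame below is only a made-up stand-in with a few of your column names, and scores, values and name are names I picked for illustration.

```r
library(purrr)

# Made-up stand-in for the question's data (assumption: same column names, random values)
set.seed(1)
datf <- data.frame(
  rand = sample(c("D", "A"), 150, replace = TRUE),
  BP   = sample(0:100, 150, replace = TRUE),
  GH   = sample(0:100, 150, replace = TRUE),
  VT   = sample(0:100, 150, replace = TRUE),
  MH   = sample(0:100, 150, replace = TRUE)
)

# The score columns to plot, one panel per column
scores <- datf[, c("BP", "GH", "VT", "MH")]

par(mfrow = c(2, 2))
# iwalk() calls the function with (column values, column name)
iwalk(scores, function(values, name) {
  hist(values[datf$rand == "D"],
       col  = rgb(0, 0, 1, 0.2),
       main = paste("Histogram of", name),
       xlab = name,
       ylim = c(0, 100))   # same y-range as in the question
  hist(values[datf$rand == "A"],
       col = rgb(1, 0, 0, 0.2),
       add = TRUE)
})
```

With your full data you would pass datf[, 4:11] and use par(mfrow = c(2, 4)). In the ggplot version, geom_histogram(position = "identity", alpha = 0.4) should give overlaid rather than stacked bars, if that is the look you are after.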