I'm trying to visualize a three-level subset of my data in one figure for two different treatments.
I want to visualize the distribution of age for a single year (2007) and a single item (tattoo), with females and males shown separately.
I am able to reduce my dataset to only females, only in 2007, and only for tattoos using:
with(data[(data$sex=="F") & (data$yy=="2007") & (data$item=="tattoo"),], plot(age, xlab="Age of Females", ylab="Frequency"))
With this code, I am able to see a frequency distribution of my data.
However, with that code I am unable to do two things:
1. Visualize the data as a density plot.
2. Superimpose the same multi-tier subset for males.
The closest I've been able to come is using this code:
sm.density.compare(age, sex, xlab="Age (years)")
legend(50,0.12, c("Female","Male"), col=c("red", "green"), pch=c(16,16), title="Sex", box.lty=0)
However, with this code I cannot restrict the data to the year 2007 and tattoos only.
My question is twofold:
1. Is it possible to superimpose the male data (for 2007 and tattoos) on the female frequency data?
2. How can I restrict the density data to only 2007 and tattoos?
I have made a subset of my data available here.
UPDATE: For the frequency histogram, I am trying to visualize the data with the bars for female and male adjacent to each other for each bin.
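With standard R plotting, you can overlay the two density curves as follows:
# Females, 2007, tattoos: draw the density curve first (black)
with(data[(data$sex=="F") & (data$yy=="2007") & (data$item=="tattoo"),], plot(density(age)))
# Males, same year and item: add their density curve in red
with(data[(data$sex=="M") & (data$yy=="2007") & (data$item=="tattoo"),], lines(density(age), col = "red"))
# Hand-drawn legend; adjust the x/y coordinates to fit your axis ranges
segments(50,0.1,52,0.1, col = "black")
text(52,0.1, pos = 4, labels = "Female")
segments(50,0.09,52,0.09, col = "red")
text(52,0.09, pos = 4, labels = "Male")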
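One caveat with the overlay above is that the axes are sized for the female curve only, so a taller or wider male curve can be clipped. A minimal sketch of one way around that is to subset once, compute both densities, and set the axis limits from both. The data frame here is synthetic and only mimics the column names from the question (age, sex, yy, item); the values and the names sub, d_f, d_m are made up for illustration.

```r
# Synthetic stand-in for the asker's data frame (hypothetical values).
set.seed(1)
data <- data.frame(
  age  = round(c(rnorm(200, 30, 8), rnorm(200, 35, 10))),
  sex  = rep(c("F", "M"), each = 200),
  yy   = sample(c("2006", "2007"), 400, replace = TRUE),
  item = sample(c("tattoo", "piercing"), 400, replace = TRUE),
  stringsAsFactors = FALSE
)

# Subset once: 2007 and tattoos only, both sexes kept.
sub <- data[data$yy == "2007" & data$item == "tattoo", ]

# One kernel density estimate per sex.
d_f <- density(sub$age[sub$sex == "F"])
d_m <- density(sub$age[sub$sex == "M"])

# Axis limits that cover both curves, so neither is clipped.
xlim <- range(d_f$x, d_m$x)
ylim <- c(0, max(d_f$y, d_m$y))

plot(d_f, xlim = xlim, ylim = ylim,
     main = "Age, tattoos, 2007", xlab = "Age (years)")
lines(d_m, col = "red")

# legend() places the key without hard-coded coordinates.
legend("topright", legend = c("Female", "Male"),
       col = c("black", "red"), lty = 1, title = "Sex", bty = "n")
```

The same pre-filtered data frame should also work with the call from the question, e.g. with(sub, sm.density.compare(age, sex, xlab="Age (years)")), assuming the sm package is loaded.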
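A smoother alternative is to use ggplot2 with the easyGgplot2 package by kassambara:
library(easyGgplot2)
# Keep both sexes, restricted to 2007 and tattoos
my.subset <- data[(data$yy=="2007") & (data$item=="tattoo"),]
# Overlaid, semi-transparent histograms, one per sex;
# position="dodge" should place the bars side by side instead
ggplot2.histogram(data=my.subset, xName='age', binwidth=2,
                  groupName='sex', legendPosition="top",
                  alpha=0.5, position="identity")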
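For the adjacent-bars histogram mentioned in the update, a base R sketch that needs no extra packages is to bin the ages yourself and pass the sex-by-bin counts to barplot() with beside = TRUE. As before, the data frame is synthetic and only borrows the column names from the question; the 5-year bin width is an arbitrary choice.

```r
# Synthetic stand-in for the asker's data frame (hypothetical values).
set.seed(1)
data <- data.frame(
  age  = round(c(rnorm(200, 30, 8), rnorm(200, 35, 10))),
  sex  = rep(c("F", "M"), each = 200),
  yy   = sample(c("2006", "2007"), 400, replace = TRUE),
  item = sample(c("tattoo", "piercing"), 400, replace = TRUE),
  stringsAsFactors = FALSE
)

# 2007 and tattoos only, both sexes kept.
sub <- data[data$yy == "2007" & data$item == "tattoo", ]

# Shared 5-year bins spanning the observed ages; intervals are [a, b).
breaks <- seq(floor(min(sub$age) / 5) * 5,
              ceiling(max(sub$age) / 5) * 5 + 5,
              by = 5)
bins <- cut(sub$age, breaks = breaks, right = FALSE)

# Rows are sexes (F, M), columns are age bins.
counts <- table(sub$sex, bins)

# beside = TRUE draws the F and M bars next to each other within each bin.
barplot(counts, beside = TRUE, col = c("grey30", "red"),
        xlab = "Age (years)", ylab = "Frequency", las = 2,
        legend.text = c("Female", "Male"),
        args.legend = list(title = "Sex", bty = "n"))
```

Because both sexes share the same breaks, the bars in each group are directly comparable.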