I am currently taking an R course and am struggling with knitting an HTML file. All the code works fine within RStudio, and the file knits without errors, but the output of the last command, where I run the inference, does not appear in the knitted document. I have included the code below.
Any input is much appreciated. Thanks, Markus
Firstly, we filter for the religions and the year of interest:
```{r filter}
gss2012 = gss %>%
  filter(year == "2012")
gssCatPro2012 = gss2012 %>%
  filter(relig == "Catholic" | relig == "Protestant")
```
Now we create a first histogram of both religions to get a first idea of the distributions:
```{r plot both rel}
ggplot(data=gssCatPro2012, aes(x=childs))+geom_histogram()
```
Calculate ratio and represent in pie chart:
```{r ratio}
gssCatPro2012 %>%
  summarise(Catholicratio = sum(relig == "Catholic")/n())
percent <- c(32.64,67.36)
lbls <- c("Catholics", "Protestants")
pct <- round(percent/sum(percent)*100)
lbls <- paste(lbls, pct)
lbls <- paste(lbls,"%", sep="")
pie(percent, labels=lbls, col=rainbow(length(lbls)), main="Pie chart Catholics/Protestants")
```
Split data between religions:
```{r split}
gssCat2012 = gssCatPro2012 %>%
  filter(relig == "Catholic")
gssPro2012 = gssCatPro2012 %>%
  filter(relig == "Protestant")
```
Plot first distribution of Catholics, then Protestants:
```{r plot per religion}
ggplot(data=gssCat2012, aes(x=childs))+geom_histogram()
ggplot(data=gssPro2012, aes(x=childs))+geom_histogram()
```
Check if any NAs to clean:
```{r NA}
anyNA(gssCatPro2012$childs)
completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}
gssCatPro2012=completeFun(gssCatPro2012,"childs")
anyNA(gssCatPro2012$childs)
```
Calculate means for both religions:
```{r metrics}
gssCatPro2012 %>%
  group_by(relig) %>%
  summarise(mean_kids=mean(childs), med_kids=median(childs), sd_kids=sd(childs),n=n())
```
We are going to create a new variable in order to overwrite the content of the old variable relig:
```{create new variable}
gssCatPro2012new <- gssCatPro2012 %>%
  mutate(relignew = ifelse(relig == "Catholic", "Catholic", "Protestant"))
```
Now we can run the inference function and see whether we can reject the null hypothesis:
```{hypothesis test}
inference(y = childs, x = relignew, data = gssCatPro2012new, statistic = "mean", type = "ht", null = 0, alternative = "twosided", method = "theoretical")
```
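Rename the chunks to use underscores instead of spaces, and make sure every chunk header begins with a leading "r". The last two chunks ({create new variable} and {hypothesis test}) are missing it, so knitr does not run them as R code, which is why the inference output never appears.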
For example:
{r create_new_variable}
instead of:
{create new variable}
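If you want to check a longer document for the same problem, something like the following could help. It is only a rough sketch in base R: `check_chunk_headers` is a made-up helper (not part of knitr or rmarkdown), and it assumes the chunk headers sit on their own lines in the `{...}` form shown in the question, with or without the leading backticks.

```r
# Hypothetical helper (not a knitr/rmarkdown function): list chunk headers and
# flag those that lack the "r" engine or whose label contains spaces.
check_chunk_headers <- function(lines) {
  # Keep lines that consist only of {...}, optionally preceded by backticks
  headers <- grep("^\\s*(`{3,}\\s*)?\\{.*\\}\\s*$", lines, value = TRUE)
  # Extract the text between the braces
  inner <- sub("^[^{]*\\{(.*)\\}\\s*$", "\\1", headers)
  # The header should start with "r" followed by a space, a comma, or nothing
  has_engine <- grepl("^r($|[ ,])", inner)
  # Drop the engine (if present) and any chunk options after the first comma
  label <- ifelse(has_engine, sub("^r\\s*", "", inner), inner)
  label <- trimws(sub(",.*$", "", label))
  data.frame(header = inner,
             has_engine = has_engine,
             label_has_space = grepl("\\s", label),
             stringsAsFactors = FALSE)
}

# Example with headers taken from the question plus the corrected one
example_lines <- c("{r filter}", "{create new variable}",
                   "{hypothesis test}", "{r create_new_variable}")
print(check_chunk_headers(example_lines))
```

For a real file you could pass `readLines("your_file.Rmd")` (the file name here is a placeholder). Any row with `has_engine` equal to `FALSE` is a chunk to fix first.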