I want to make a grouped barplot for data arranged as follows:
  sx1pre sx1post sx2pre sx2post
1      1       1      1       0
2      1       0      1       0
3      0       1      1       0
4      1       0      0       1
5      1       0      1       0
6      1       0      1       0
For each sx (1 or 2), I want to compare the frequency of "pre" and "post" in a single graph. That is, I would like to plot the percentage of patients (out of the total) who show a symptom (sx) before the operation (pre) next to the percentage who show the same symptom after it (post). Thanks
Reading it again, I think I know what you want to achieve. I guess you already have the data in R?
library(ggplot2) #needed for ggplot() and geom_col()
df=read.delim("temp.csv") #data is now in df (read.delim expects a tab-separated file)
frequencies=data.frame(lapply(df,FUN=function(x){100*mean(x)})) #percentage of patients with each symptom
frequencies=data.frame(t(frequencies)) #make long form of data frame (one row per column of df)
names(frequencies)="percentage" #rename column
frequencies$category=row.names(frequencies) #original column names, e.g. "sx1pre"
frequencies$timepoint=ifelse(grepl("pre",frequencies$category),"pre","post") #get timepoint
frequencies$intervention=ifelse(grepl("sx1",frequencies$category),"sx1","sx2") #get symptom (sx1 or sx2)
#plot: one group of bars per symptom, one bar per timepoint
ggplot(frequencies,aes(x=intervention,y=percentage,fill=timepoint))+
  geom_col(position=position_dodge())
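If you want to check the numbers without reading a file, here is a minimal sketch that types in the six rows from the question by hand and draws the grouped bars with base R's barplot() instead of ggplot2. The object names (pct, m) are only for illustration, and the axis label is my assumption about what you want to show.

```r
# Sample data copied from the question (1 = symptom present, 0 = absent)
df <- data.frame(
  sx1pre  = c(1, 1, 0, 1, 1, 1),
  sx1post = c(1, 0, 1, 0, 0, 0),
  sx2pre  = c(1, 1, 1, 0, 1, 1),
  sx2post = c(0, 0, 0, 1, 0, 0)
)

# Percentage of patients showing each symptom at each timepoint
pct <- colMeans(df) * 100

# Arrange as a matrix: rows = timepoint, columns = symptom
# (matrix() fills column by column, so the order below matters)
m <- matrix(
  pct[c("sx1pre", "sx1post", "sx2pre", "sx2post")],
  nrow = 2,
  dimnames = list(timepoint = c("pre", "post"), symptom = c("sx1", "sx2"))
)

# beside = TRUE puts the pre and post bars next to each other per symptom
barplot(m, beside = TRUE, legend.text = TRUE,
        ylim = c(0, 100), ylab = "Patients with symptom (%)")
```

The matrix m holds the same values that end up in the percentage column of the ggplot2 version, so it is a quick way to confirm that the bars show what you expect.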
Regarding the disease-conditions, it might be easier to use something like this:
new_names_after_comment=c('PAIN.PO','DYSPNEA.PO','PAIN.FU','DYSPNEA.FU')
frequencies$category_new=new_names_after_comment #just add as a new column; the order must match the rows of frequencies
library(tidyr)
frequencies=frequencies %>%
  separate(category_new,into=c("Disease","Timepoint"),sep="\\.",remove = FALSE) #split at the dot, keep the original column
#plot after comment
ggplot(frequencies, aes(x=Disease,y=percentage,fill=Timepoint))+
geom_col(position = position_dodge())
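To see what the split at the dot produces, here is a small base R sketch that does roughly the same thing as separate() for the four names above. The data frame name labels is just illustrative, and this is not how tidyr works internally, only an equivalent for this simple case.

```r
# Names as given above: symptom, then a dot, then the timepoint code
new_names <- c("PAIN.PO", "DYSPNEA.PO", "PAIN.FU", "DYSPNEA.FU")

# fixed = TRUE treats "." as a literal dot rather than a regex wildcard
parts <- do.call(rbind, strsplit(new_names, ".", fixed = TRUE))

# One row per original name, with the two pieces as separate columns
labels <- data.frame(
  category_new = new_names,
  Disease      = parts[, 1],
  Timepoint    = parts[, 2],
  stringsAsFactors = FALSE
)
print(labels)
```

The Disease column then goes on the x axis and Timepoint becomes the fill, exactly as in the plot call above.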