
How to make a function to create a customised pie chart? (problems in inputting the argument)


I am new to R (pretty much one week of coding so far). My apologies for the simple question. I am trying to write a simple function that will create a pie chart dependent on the variable of the dataset I want to input.

The dataset is

ID<-c("001","002","003","004","005","006","007","008","009","010","NA","012","013")
Name<-c("Damon Bell","Royce Sellers",NA,"Cali Wall","Alan Marshall","Amari Santos","Evelyn Frye","Kierra Osborne","Mohammed Jenkins","Kara Beltran","Davon Harmon","Kaitlin Hammond","Jovany Newman")
Sex<-c("Male","Male","Male",NA,"Male","Male",NA,"Female","Male","Female","Male","Female","Male")
Age<-c(33,27,29,26,27,35,29,32,NA,25,34,29,26)
data1<-data.frame(ID,Name,Sex,Age)

The code that works without the function is:

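library(dplyr)   # %>%, filter(), group_by(), summarise(), mutate(), arrange()
library(ggplot2) # ggplot() and the geoms/themes used below

#calculation of counts
dSex <- data1 %>%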
  filter(!is.na(Sex)) %>%
  group_by(Sex) %>% 
  summarise(Count = n()) %>%
  mutate(Total = sum(Count), Percentage = round((Count/Total),3))

## Compute the position of labels
dSex <- dSex %>% 
  arrange(desc(Sex)) %>%
  mutate(ypos = cumsum(Percentage)-0.5*Percentage)

dSex %>% 
  ggplot(aes(x="", y=Percentage, fill=Sex)) +
  geom_bar(stat="identity", color="White") +
  coord_polar("y", start=0) +
  geom_text(aes(y = ypos, label = paste0(round(Percentage*100,0),"%\n(", Count, ")")), color = "white") + 
  scale_fill_manual(values = c("#7e0f7e", "#026b6c")) +
  guides(fill = guide_legend(title = "Sex")) +
  theme(
    axis.title = element_blank(), 
    axis.line = element_blank(), 
    axis.text = element_blank() 
  ) 

So I started to write the function and I am already struggling. I would like to use the variable Sex in the name of a new dataset that I would call "dSex", but I don't think it is working. I added the line with deparse() and substitute() because my understanding is that it helps R understand that "Var" is the argument of the function. I also put double braces "{{" "}}" around it, but R does not seem to read them!

FctPieChart <- function(dat,Var){
  
  Var <- deparse(substitute({{Var}}))
  
  dVar <- dat %>%
    filter(!is.na({{Var}})) %>%
    group_by(Var) %>% 
    summarise(Count = n()) %>%
    mutate(Total = sum(Count), Percentage = round((Count/Total),3))
}  
FctPieChart(data1,Sex)

I get the following error:

Error: Problem with `filter()` input `..1`.
i Input `..1` is `!is.na(c("{", " {", " Sex", " }", "}"))`.
x Input `..1` must be of size 13 or 1, not size 5.
Run `rlang::last_error()` to see where the error occurred.

Do you have any idea on how to do this?

Also, another quick question: the pie chart I drew is quite big. Is there a way to reduce its size?

Thank you very much in advance, if someone has any idea, this would really help me!

Best regards,

Stephanie

ANSWER BASED ON SUGGESTION BELOW: The answer was to pass the variable (here: Sex) to the function as a character string and to call get() wherever the function refers to the variable; get() turns the string into the variable. However, using get() changed the name of the grouping column to " get(Var) ". I solved this by keeping a copy of the argument Var as Var2 and using it to rename the column in the resulting table.

I also wanted the resulting table to be available after the call: the code outside a function created it, but it was no longer created once the code was inside a function. I found that dVar <<- dVar solves this, because it creates the table outside the function. The table is then called dVar, whatever variable is passed in (in the old code it was dSex).

Since I wanted the name of the table to reflect the variable used (instead of a generic "dVar"), I renamed it with the function assign() and its envir = argument.

The corrected code is below:

FctPieChart <- function(dat,Var){

  #Create a replacement of `get(Var)` to use later
  Var2 <- Var #Copy of the column name (a character string), used later to rename the column and title the legend
  print(Var2)

  #Frequencies table
  dVar <- dat %>%
    filter(!is.na(get(Var))) %>%
    group_by(get(Var)) %>% #changes the name of the column to "`get(Var)`"
    summarise(Count = n()) %>%
    mutate(Total = sum(Count), Percentage = round((Count/Total),3))

  #Position calculation for the Pie Chart
  dVar <- dVar %>%
    arrange(desc(`get(Var)`)) %>%
    mutate(ypos = cumsum(Percentage)-0.5*Percentage)
  print(dVar)

  #Rename the variable that was changed in the data
  colnames(dVar) <- c(Var2,"Count","Total","Percentage","ypos")
  print(dVar)

  #Make the graph
  Graph <- dVar %>%
    ggplot(aes(x="", y=Percentage, fill=get(Var2))) + #get(Var2) turns the character string into the column
    geom_bar(stat="identity", color="White") +
    coord_polar("y", start=0) +
    geom_text(aes(y = ypos, label = paste0(round(Percentage*100,0),"%\n(", Count, ")")), color = "white") +
    scale_fill_manual(values = c("#7e0f7e", "#026b6c")) +
    guides(fill = guide_legend(title = Var2)) + #title = Var2 uses the character value of Var2
    theme(
      axis.title = element_blank(),
      axis.line = element_blank(),
      axis.text = element_blank()
    )
  print(Graph)

  #Create the table outside the function
  dVar <<- dVar

  #Change name of the data based on the variable
  assign(paste0("d",Var2),dVar,envir = parent.frame()) #use envir = parent.frame() to access the environment outside the function, source: https://stackoverflow.com/questions/38296670/r-assign-inside-a-function
  rm(dVar,envir = parent.frame()) #remove the generic copy created by <<- (assumes the function is called from the top level)
}
FctPieChart(data1,"Sex")
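
A possible simplification of the table step, offered as a sketch rather than a drop-in replacement: grouping with across(all_of(Var)) keeps the original column name, so no renaming is needed, and returning the table lets the caller choose its name instead of relying on <<- and assign(). The helper name CountTable is made up for this illustration, and the sketch assumes dplyr 1.0 or later.

```r
library(dplyr)

# Only the columns needed for the illustration, copied from the question
data1 <- data.frame(
  Sex = c("Male","Male","Male",NA,"Male","Male",NA,"Female","Male","Female","Male","Female","Male"),
  Age = c(33,27,29,26,27,35,29,32,NA,25,34,29,26)
)

# Hypothetical helper: Var is a column name given as a character string
CountTable <- function(dat, Var) {
  dat %>%
    filter(!is.na(.data[[Var]])) %>%      # .data[[Var]] looks the column up by its name
    group_by(across(all_of(Var))) %>%     # the grouping column keeps its own name, e.g. "Sex"
    summarise(Count = n()) %>%
    mutate(Total = sum(Count), Percentage = round(Count / Total, 3))
}

# The caller decides what the result is called
dSex <- CountTable(data1, "Sex")
print(dSex)
```

The plotting part could then take the returned table as input, with the same string used for the legend title.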

I would like to thank @Silentdevildoll who helped me solve the problem.

Have a good day!

Stephanie


Solution

  • I've never used something like this before, and there is certainly a flaw in my code, but this was my attempt at trying to help you out. I'm sure others will have a better solution, but this is a start at least:

    FctPieChart <- function(dat,Var){
      
      dVar <- dat %>%
        filter(!is.na(get(Var))) %>%
        group_by(get(Var)) %>%
        summarise(Count = n()) %>%
        mutate(Total = sum(Count), Percentage = round((Count/Total),3))
      
      dVar <- dVar %>%
        arrange(desc(`get(Var)`)) %>%
        mutate(ypos = cumsum(Percentage)-0.5*Percentage)
      print(dVar)
      dVar %>% 
        ggplot(aes(x="", y=Percentage, fill=`get(Var)`)) +
        geom_bar(stat="identity", color="White") +
        coord_polar("y", start=0) +
        geom_text(aes(y = ypos, label = paste0(round(Percentage*100,0),"%\n(", Count, ")")), color = "white") + 
        scale_fill_manual(values = c("#7e0f7e", "#026b6c")) +
        guides(fill = guide_legend(title = Var)) + # use the column name passed in, not a hard-coded "Sex"
        theme(
          axis.title = element_blank(), 
          axis.line = element_blank(), 
          axis.text = element_blank() 
        ) 
      
    }  
    FctPieChart(data1,"Sex")
    

    I'm not really familiar with the deparse(substitute()) that you used, but I know get() is a way to turn a string into a variable, so essentially I replaced the Var in the first paragraph of your function with get(Var). Where I'm sure this isn't ideal, however, is that group_by(get(Var)) changes the name of the column to get(Var), which I demonstrated by printing the table. Note that I also passed the variable as "Sex" instead of just Sex. Like I said, this isn't perfect, but I think it may point you in the right direction.
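
On the deparse(substitute()) attempt: judging by the error message in the question, substitute({{Var}}) captured the braces along with the name, so deparse() returned five strings instead of "Sex". As a sketch of the other route, the double braces can be applied directly to the argument inside the dplyr verbs, with no deparse() step, which lets the column be passed unquoted. This assumes a dplyr version that supports {{ }} (rlang 0.4 or later, as far as I know); the names CountByVar, f_braces and f_plain are made up for this illustration.

```r
library(dplyr)

# Only the columns needed for the illustration, copied from the question
data1 <- data.frame(
  Sex = c("Male","Male","Male",NA,"Male","Male",NA,"Female","Male","Female","Male","Female","Male"),
  Age = c(33,27,29,26,27,35,29,32,NA,25,34,29,26)
)

# Hypothetical helper: Var is an unquoted column name
CountByVar <- function(dat, Var) {
  dat %>%
    filter(!is.na({{ Var }})) %>%   # {{ }} forwards the caller's column to dplyr
    group_by({{ Var }}) %>%         # the grouping column keeps its own name
    summarise(Count = n()) %>%
    mutate(Total = sum(Count), Percentage = round(Count / Total, 3))
}

dSex <- CountByVar(data1, Sex)
print(dSex)

# For comparison, in base R the braces become part of the captured expression
f_braces <- function(Var) deparse(substitute({{Var}}))
f_plain  <- function(Var) deparse(substitute(Var))
length(f_braces(Sex))  # one string per deparsed line, braces included
f_plain(Sex)           # the bare name as a single string
```

If the name is still needed as a string, for example for a legend title, deparse(substitute(Var)) without the braces is one way to get it.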