r statistics tidyverse percentage mutate

R calculate percentage of unique instances over total sum of different variable

I have a fairly simple statistical task that I'm having trouble with. I need to find the topics with the greatest and least share of unique instances. The problem is that the topics were not all assigned the same number of times, so I think I need to express the number of times a topic referred to a unique instance (numUnique) relative to the number of times the topic was coded overall (numCoded).

The df looks like this:

topic	numCoded	numUnique
A	63	52
B	134	91
C	19	16
D	35	35

I tried to calculate the percent change between numCoded, but I'm pretty sure that's not what I need to compute and it spits out NA for the new column anyway:

library(tidyverse)
foo <- propAgree %>%
  group_by(topic) %>%
  mutate(pct_change = (numCoded/lag(numCoded) - 1) * 100)

The expected output would look something like this (NOTE: I'm using dummy percentages here because I don't know how to compute this)

| topic | similarity |
|-------|------------|
| A     | 30%        |
| B     | 50%        |
| C     | 70%        |
| D     | 20%        |

I need to do this for the top and bottom 10 topics, so after calculating the similarity I would then filter for the top and bottom percentage values. Any help would be appreciated.

Solution

Try this code:

propAgree %>%
  mutate(pct_change = (numUnique / numCoded) * 100)

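This computes numUnique as a percentage of numCoded for each topic. No group_by() is needed because each topic already sits on its own row; that grouping is also why lag() returned NA in your attempt, since a one-row group has no previous value. If you want the results ordered, add

%>% arrange(pct_change)

at the end. Because arrange() sorts in ascending order, head(10) then extracts the bottom 10 and tail(10) the top 10.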
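
For reference, here is a minimal end-to-end sketch using the four rows from the question. The column name similarity is borrowed from the expected output in the question, and slice_min() / slice_max() (available in dplyr 1.0.0 and later) are one possible alternative to head() / tail(), not the only way to do it. With only four topics, n = 2 stands in for the n = 10 you would use on the real data.

```r
library(dplyr)

# Sample data copied from the question
propAgree <- data.frame(
  topic     = c("A", "B", "C", "D"),
  numCoded  = c(63, 134, 19, 35),
  numUnique = c(52, 91, 16, 35),
  stringsAsFactors = FALSE
)

# Share of unique instances per topic, sorted from lowest to highest
result <- propAgree %>%
  mutate(similarity = numUnique / numCoded * 100) %>%
  arrange(similarity)

# Lowest and highest shares; use n = 10 on the full data set
bottom2 <- slice_min(result, similarity, n = 2)
top2    <- slice_max(result, similarity, n = 2)

print(result)
```

Note that slice_min() and slice_max() keep ties by default, so they can return more than n rows if several topics share the cutoff value; pass with_ties = FALSE if you need exactly n.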