I have a data frame that looks like this:
summary(deg)
      L2FC            Gene                    diffexp    comp
 Min.   :-3.825   Length:926         Downregulated:210   A:195
 1st Qu.: 1.010   Class :character   Upregulated  :716   B:731
 Median : 1.163   Mode  :character
 Mean   : 0.860
 3rd Qu.: 1.431
 Max.   : 6.505
head(deg)
       L2FC    Gene       diffexp comp
1 -2.754236 SLC13A2 Downregulated    A
2  3.161623   SNAI2   Upregulated    A
3 -2.821350   STYK1 Downregulated    A
4 -1.798022    CD84 Downregulated    A
5 -1.293536    TLE6 Downregulated    A
6 -1.011016   P2RX1 Downregulated    A
What I want is simply the unique gene symbols annotated based on whether they are in A only, B only, or shared across both. Desired output is like this:
Gene comp
1 GENE1 1
2 GENE2 0
3 GENE3 -1
The Gene column should contain only the unique values from deg, and comp should be +1 for genes found only in A, -1 for genes found only in B, and 0 for genes found in both.
Thanks!
Try
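library(tidyverse)  # .by in summarize() needs dplyr >= 1.1.0

deg |>
  summarize(
    # TRUE/FALSE coerce to 1/0: A only -> 1, B only -> -1, both -> 0
    comp = any(comp == "A") - any(comp == "B"),
    .by = Gene
  )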
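If you would rather not load the tidyverse, the same idea can be sketched in base R. This is only an illustration on made-up data: `deg_toy`, `in_a`, `in_b`, `res`, and the gene names are hypothetical stand-ins, not taken from your real `deg`. It relies on the same trick as above, namely that logical values coerce to 1 and 0 when subtracted.

```r
# Toy stand-in for `deg` (hypothetical genes; GENE2 appears in both A and B)
deg_toy <- data.frame(
  Gene = c("GENE1", "GENE2", "GENE2", "GENE3"),
  comp = factor(c("A", "A", "B", "B"))
)

# Per gene: was it ever seen in A? was it ever seen in B?
in_a <- tapply(deg_toy$comp == "A", deg_toy$Gene, any)
in_b <- tapply(deg_toy$comp == "B", deg_toy$Gene, any)

# 1 - 0 = 1 (A only), 0 - 1 = -1 (B only), 1 - 1 = 0 (both)
res <- data.frame(
  Gene = names(in_a),
  comp = as.integer(in_a) - as.integer(in_b),
  row.names = NULL
)
res
```

On this toy input, `res` has one row per gene, with comp equal to 1 for GENE1, 0 for GENE2, and -1 for GENE3. Note that `tapply()` returns the genes in sorted order, whereas `summarize(.by = Gene)` keeps them in order of first appearance.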