Find unique values in R but annotate them based on another column


I have a data frame like this:

summary(deg)
      L2FC            Gene                    diffexp    comp   
 Min.   :-3.825   Length:926         Downregulated:210   A:195  
 1st Qu.: 1.010   Class :character   Upregulated  :716   B:731  
 Median : 1.163   Mode  :character                              
 Mean   : 0.860                                                 
 3rd Qu.: 1.431                                                 
 Max.   : 6.505    

head(deg)
       L2FC    Gene       diffexp comp
1 -2.754236 SLC13A2 Downregulated    A
2  3.161623   SNAI2   Upregulated    A
3 -2.821350   STYK1 Downregulated    A
4 -1.798022    CD84 Downregulated    A
5 -1.293536    TLE6 Downregulated    A
6 -1.011016   P2RX1 Downregulated    A

I want the unique gene symbols, each annotated by whether it appears in A only, B only, or both. The desired output looks like this:

   Gene comp
1 GENE1    1
2 GENE2    0
3 GENE3   -1

Here the Gene column holds only the unique values from deg, and comp is +1 for genes found only in A, -1 for genes found only in B, and 0 for genes found in both.

Thanks!


Solution

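  • Try summarizing by Gene: subtracting the two logical tests gives 1 (A only), -1 (B only), or 0 (both), because TRUE and FALSE are coerced to 1 and 0.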

    library(tidyverse)
    
    deg |>
      summarize(
        comp = any(comp == "A") - any(comp == "B"),
        # per-operation grouping; requires dplyr >= 1.1.0
        .by = Gene
      )
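
    To check the logic without the original data, here is a minimal base R sketch of the same idea. The deg_toy data frame and its gene names are made up for illustration, and tapply is swapped in for summarize, so no packages are needed.

```r
# Toy data (hypothetical genes, not taken from the original `deg`)
deg_toy <- data.frame(
  Gene = c("GENE1", "GENE2", "GENE2", "GENE3", "GENE1"),
  comp = factor(c("A", "A", "B", "B", "A"))
)

# Per gene: (present in A?) minus (present in B?), with TRUE/FALSE coerced to 1/0
score <- tapply(
  as.character(deg_toy$comp),
  deg_toy$Gene,
  function(x) any(x == "A") - any(x == "B")
)

# One row per unique gene, with the score as an integer column
result <- data.frame(
  Gene = names(score),
  comp = as.integer(score),
  row.names = NULL
)
result
#    Gene comp
# 1 GENE1    1
# 2 GENE2    0
# 3 GENE3   -1
```

    One difference to be aware of: tapply returns the genes in sorted order, whereas summarize with .by keeps them in order of first appearance.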