r, sparse-matrix

How do I convert this data frame to binary form?


Good afternoon

I have this data frame

> head(d)
  Gene.Name                  GO.term
1     EPCAM       cell-cell adhesion
2     CDH17            cell adhesion
3    LGALS4            cell adhesion
4    GPRC5A       cell-cell adhesion
5     KRT18       cell-cell adhesion
6      SOX9 cytoskeleton organsation
> 
CGN cell-cell adhesion

> unique(d$GO.term)
[1] cell-cell adhesion       cell adhesion           
[3] cytoskeleton organsation oxidation-reduction     
4 Levels: cell-cell adhesion ... oxidation-reduction
> 

I want something like the output below, where each cell is 1 if the gene belongs to that GO.term and 0 if it does not.

> head(d[,1:2])
                             cell adhesion cytoskeleton organsation
AQP9                                          0               1
AXIN2                                         1               0
BCL6                                          1               0
BMP7                                          1               0
C5AR1                                         0               1
CCL2                                          0               1
> 

But I don't know how to do that

Any help please?


Solution

  • Try...

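    d$cell.cell.adhesion <- d$GO.term == "cell-cell adhesion"

    d$organization <- d$GO.term == "cytoskeleton organsation"

    Create a new column for each group. The returned value is logical (which you can convert to integer if so desired).

    Multiplying by 1 will convert all TRUE/FALSE values to 1/0:

    # where 'd' is your data.frame; restrict this to the logical columns,
    # because d*1 on the whole frame does not work for the gene and term columns
    is_lgl <- sapply(d, is.logical)
    d[is_lgl] <- d[is_lgl] * 1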
    

    (I'm on an iPad, or I'd give a larger example, but this should work.)

    Example: say my data frame is:

    ColA   ColB
    A      sun
    B      moon
    

    Now, I want to create a new column that checks for the presence of a value (either 'sun' or 'moon'):

    mydataframe$NewCol <- mydataframe$ColB == 'sun'

    The updated data frame contains a new column:

    ColA   ColB   NewCol
    A      sun    TRUE
    B      moon   FALSE
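
The answer above adds one logical column per term by hand. As a rough sketch of how that idea might extend to every GO term at once, base R's table() can cross-tabulate genes against terms, and the counts can then be collapsed to 1/0. The toy data frame below is made up for illustration (it only mimics the Gene.Name and GO.term columns from the question), and the object names counts and binary are hypothetical.

```r
# Toy data mirroring the question's layout (values are illustrative only).
d <- data.frame(
  Gene.Name = c("EPCAM", "CDH17", "LGALS4", "SOX9", "SOX9"),
  GO.term   = c("cell-cell adhesion", "cell adhesion", "cell adhesion",
                "cytoskeleton organsation", "cell adhesion"),
  stringsAsFactors = TRUE
)

# Cross-tabulate genes (rows) against GO terms (columns);
# each cell counts how often that gene/term pair occurs.
counts <- table(d$Gene.Name, d$GO.term)

# Collapse counts to presence/absence: 1 if the gene has the term, else 0.
# unclass() drops the "table" class so the result converts as a plain matrix.
binary <- as.data.frame(unclass((counts > 0) * 1L))

print(binary)
```

Each row of binary is a gene and each column a GO term, which is the shape asked for in the question. A gene listed under several terms (SOX9 in the toy data) simply gets a 1 in each matching column.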