
Decision based on the headers of a data frame


I have a list of data frames:

df
[[1]]
    ID SignalIntensity       SNR
1  109        6.182309 0.8453577
2  110       10.172777 4.3837078
3  111        7.292275 1.0725751
4  112        8.898467 2.3192185
5  113        9.591034 3.7133402
7  116        7.789323 1.3636656
8  117        7.194835 1.1349738
9  118        6.572773 0.9041846
11 120        9.371126 2.9968457
12 121        6.154944 0.7777584

[[2]]
    ID SignalIntensity       SNR
1  118        6.572773 0.9041846
2  119        5.377519 0.7098581
3  120        9.371126 2.9968457
4  121        6.154944 0.7777584
5  123        5.797446 0.7235425
6  124        5.573614 0.7019574
7  125        7.014537 0.3433343
8  126        6.089159 0.7971650
9  127        6.314820 0.7845944
10 131        5.342544 1.2300000

Each data frame has the headers ID, SignalIntensity and SNR. I check the headers with names(df[[1]]). After checking them, I need to branch on the result: if the headers of df[[1]] are ID, SignalIntensity and SNR, then do something like

    if ("ID" %in% names(df[[1]])) {
      print("This is data from Illumina platform")
      # my code..........
    } else {
      # my code...........
    }

As shown above, each data frame has three headers.

I know my approach is wrong, as in the attempt below:

if(names(df[[1]]=="ID, SignalIntensity, SNR")) gives me

    Error in if (names(df[[1]] == "ID, SignalIntensity, SNR")) { : argument is of length zero

which is quite obvious.

How can I write the if() condition so that it matches all three headers (or headers of my choice, whether 1, 2 or 3 of them), runs one block of code if true, and otherwise does something else? Thanks.


Solution

  • Expanding on my comments, try this:

    #dummy data
    df <-
      list(
        data.frame(ID=1:5,
                   SignalIntensity=runif(5),
                   SNR=runif(5)),
        data.frame(ID=1:3,
                   x=runif(3)),
        data.frame(ID=1:5,
                   SignalIntensity=runif(5),
                   SNR=runif(5)))
    
    #check 1st data frame
    if(length(intersect(names(df[[1]]),c("ID","SignalIntensity","SNR")))==3){
      print("Illumina platform")} else {
        print("Non Illumina platform")}
    # [1] "Illumina platform"
    
    #check all dataframes
    lapply(df,function(i)
      if(length(intersect(names(i),c("ID","SignalIntensity","SNR")))==3){
        "Illumina platform"} else {
          "Non Illumina platform"})
    # [[1]]
    # [1] "Illumina platform"
    # 
    # [[2]]
    # [1] "Non Illumina platform"
    # 
    # [[3]]
    # [1] "Illumina platform"
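
As a variation on the check above, the same test can be wrapped in a small helper so that the set of required headers becomes a parameter. This is only a sketch: `has_headers`, `is_illumina`, `has_id_only` and `exact` are names made up for illustration, and the dummy data is the same shape as in the answer. It uses `all(required %in% names(x))`, which always gives a single TRUE or FALSE, so it is safe inside `if()`.

```r
# Hypothetical helper (not part of the original answer): TRUE when every
# required column name is present in the data frame; extra columns are allowed.
has_headers <- function(x, required = c("ID", "SignalIntensity", "SNR")) {
  all(required %in% names(x))
}

# Dummy data with the same shape as in the answer above.
df <- list(
  data.frame(ID = 1:5, SignalIntensity = runif(5), SNR = runif(5)),
  data.frame(ID = 1:3, x = runif(3)),
  data.frame(ID = 1:5, SignalIntensity = runif(5), SNR = runif(5))
)

# Single data frame: the condition is one logical value, as if() requires.
if (has_headers(df[[1]])) {
  print("Illumina platform")
} else {
  print("Non Illumina platform")
}

# All data frames: vapply() returns a plain logical vector, one entry per frame.
is_illumina <- vapply(df, has_headers, logical(1))

# Headers of your choice: pass a different `required` vector.
has_id_only <- vapply(df, has_headers, logical(1), required = "ID")

# Exact match (no extra columns allowed): compare the name sets with setequal().
exact <- vapply(
  df,
  function(x) setequal(names(x), c("ID", "SignalIntensity", "SNR")),
  logical(1)
)
```

Whether extra columns should count as a match depends on your data: `%in%` and `intersect()` both ignore them, while `setequal()` rejects them.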