Tags: r, dataframe, if-statement, subset

Why is my ifelse statement to subset a dataframe not working?


I am trying to create a script to automate an analysis. The specific analyte has different channels (channels 1-4) that are used to detect it, but the machine outputs a file with the results from all 4 channels. For my analysis I want to subset based on the channel, so I want to supply that channel at the start of the script so that it is all I have to change going forward. But when I use an if else statement to subset the df based on the channel vector, it returns a list:

Channel <- "CH1"

head(WG1)
 X img_nr            class cycle id      x      y  x_glb  y_glb width yolk_x yolk_y yolk_width adjust adjust_suppl CH1..yolk_mean
1 0      1 wellA1_col1_row1     1  1 1332.9 1951.6 1332.9 1951.6  43.1 1336.5 1951.8       21.4      1            2           46.3
2 1      1 wellA1_col1_row1     1  2 1774.8 1950.0 1774.8 1950.0  47.9 1775.2 1948.3       24.9      1            2           50.1
3 2      1 wellA1_col1_row1     1  3  708.1 1949.7  708.1 1949.7  47.6  710.0 1949.0       27.2      1            2           45.8
4 3      1 wellA1_col1_row1     1  4  405.6 1948.7  405.6 1948.7  44.9  405.6 1948.7       28.0      1            0           32.8
5 4      1 wellA1_col1_row1     1  5 1832.3 1946.4 1832.3 1946.4  45.9 1834.9 1947.1       22.8      1            2           33.5
6 5      1 wellA1_col1_row1     1  6 2042.6 1946.8 2042.6 1946.8  47.6 2044.6 1946.8       21.5      1            2           46.5
  CH1..yolk_sd CH1..yolk_sat CH1..yolk_mean_hist CH1..yolk_sd_hist CH1..confi CH1..mean_hist CH1..sd_hist CH1..bg_hist CH1..contrast
1        2.261         128.8                45.8             1.442          1           41.2        2.570         25.1          17.1
2        2.200         139.5                49.6             1.378          1           42.0        3.605         26.7          16.7
3        2.650         127.2                45.2             1.640          1           40.3        2.974         26.6          14.9
4        1.943          90.9                32.3             1.218          1           31.5        1.402         22.9           8.9
5        1.847          93.1                33.1             1.227          1           31.2        1.259         24.6           7.1
6        2.454         129.3                46.0             1.559          1           42.3        2.712         26.0          17.4
  CH1..mean CH1..sd CH1..sat CH2..yolk_mean CH2..yolk_sd CH2..yolk_sat CH2..yolk_mean_hist CH2..yolk_sd_hist CH2..confi CH2..mean_hist
1      42.0   3.647    116.0          21.65        1.640         113.9               21.33             1.194          1          15.96
2      43.0   4.956    118.1          21.47        1.538         112.8               21.12             1.038          1          16.42
3      41.1   4.260    113.3          19.79        1.989         103.2               19.32             1.164          1          15.14
4      32.1   2.257     88.5          17.28        1.565          90.2               16.89             1.009          1          13.91
5      31.7   2.093     87.7          21.04        1.380         110.7               20.73             0.884          1          16.38
6      43.5   4.545    119.2          21.71        1.358         114.4               21.42             0.883          1          17.50
  CH2..sd_hist CH2..bg_hist CH2..contrast CH2..mean CH2..sd CH2..sat CH3..yolk_mean CH3..yolk_sd CH3..yolk_sat CH3..yolk_mean_hist
1        1.092         9.02          7.49     16.38   1.669     85.3          24.27        0.903         210.4               24.06
2        1.220        10.27          6.74     16.82   1.783     87.7          24.45        0.885         212.1               24.25
3        1.164         9.49          5.99     15.49   1.701     80.9          23.60        0.942         204.4               23.37
4        1.115         5.14          9.17     14.42   1.884     74.3          23.47        0.953         203.1               23.22
5        1.066         9.70          7.11     16.77   1.615     87.5          24.50        0.924         212.4               24.29
6        1.156        10.78          7.11     17.99   1.929     93.5          24.92        0.942         216.1               24.70
  CH3..yolk_sd_hist CH3..confi CH3..mean_hist CH3..sd_hist CH3..bg_hist CH3..contrast CH3..mean CH3..sd CH3..sat CH4..yolk_mean
1             0.556          1          23.93        0.569        21.75          2.52     24.15   0.900    209.3          238.1
2             0.582          1          24.08        0.588        21.97          2.40     24.31   0.924    210.6          251.4
3             0.545          1          23.18        0.586        21.15          2.17     23.42   0.919    202.8          197.5
4             0.582          1          23.19        0.596        20.87          2.63     23.42   0.931    202.8          173.7
5             0.590          1          24.18        0.595        21.96          2.40     24.40   0.947    211.5          251.5
6             0.586          1          24.47        0.584        22.09          2.63     24.69   0.919    214.0          243.5
  CH4..yolk_sd CH4..yolk_sat CH4..yolk_mean_hist CH4..yolk_sd_hist CH4..confi CH4..mean_hist CH4..sd_hist CH4..bg_hist CH4..contrast
1        9.395         203.8               235.8             5.996          1          235.4        5.664        215.5          20.8
2        9.316         215.2               249.1             6.067          1          249.3        5.993        226.3          24.3
3        8.171         168.9               195.5             4.997          1          195.0        5.075        175.3          22.0
4        7.831         148.5               171.9             5.060          1          172.2        5.046        152.0          22.0
5        9.639         215.2               249.1             6.137          1          248.8        5.948        225.5          25.5
6        9.336         208.5               241.3             5.793          1          241.5        5.709        219.1          24.3
  CH4..mean CH4..sd CH4..sat masked well       pos relevant..double_bead relevant..well_circle width_0 width_muem yolk_width_muem
1     237.5   9.031    203.3      1   A1 col1_row1                     1                     1      43    82.9675         41.1950
2     251.6   9.370    215.4      1   A1 col1_row1                     1                     1      48    92.2075         47.9325
3     196.9   7.965    168.5      1   A1 col1_row1                     1                     1      48    91.6300         52.3600
4     174.0   8.020    148.7      0   A1 col1_row1                     1                     0      45    86.4325         53.9000
5     250.9   9.137    215.0      1   A1 col1_row1                     1                     1      46    88.3575         43.8900
6     243.8   9.314    208.7      1   A1 col1_row1                     1                     1      48    91.6300         41.3875
  ratio_yolk_to_bead_width CH1..yolk_cv_hist CH2..yolk_cv_hist CH3..yolk_cv_hist CH4..yolk_cv_hist CH1..yolk_contrast
1                0.4965197             0.031             0.056             0.023             0.025                5.1
2                0.5198330             0.028             0.049             0.024             0.024                8.1
3                0.5714286             0.036             0.060             0.023             0.026                5.5
4                0.6236080             0.038             0.060             0.025             0.029                1.3
5                0.4967320             0.037             0.043             0.024             0.025                2.3
6                0.4516807             0.034             0.041             0.024             0.024                4.2
  CH2..yolk_contrast CH3..yolk_contrast CH4..yolk_contrast valid..image_seg valid..image_yolk_seg CH1..bg_hist_norm CH2..bg_hist_norm
1               5.69               0.34                2.7                1                     1              25.1              9.02
2               5.05               0.37                2.1                1                     1              26.7             10.27
3               4.65               0.42                2.5                1                     1              26.6              9.49
4               3.37               0.28                1.5                1                     0              22.9              5.14
5               4.66               0.32                2.7                1                     1              24.6              9.70
6               4.21               0.45                2.0                1                     1              26.0             10.78
  CH3..bg_hist_norm CH4..bg_hist_norm valid..bead_background valid..yolk_width_ratio CH1..yolk_contrast_norm CH2..yolk_contrast_norm
1             21.75             215.5                      0                     NaN                     5.1                    5.69
2             21.97             226.3                      0                     NaN                     8.1                    5.05
3             21.15             175.3                      1                       1                     5.5                    4.65
4             20.87             152.0                    NaN                     NaN                     1.3                    3.37
5             21.96             225.5                      0                     NaN                     2.3                    4.66
6             22.09             219.1                      0                     NaN                     4.2                    4.21
  CH3..yolk_contrast_norm CH4..yolk_contrast_norm CH1..contrast_norm CH2..contrast_norm CH3..contrast_norm CH4..contrast_norm
1                    0.34                     2.7               17.1               7.49               2.52               20.8
2                    0.37                     2.1               16.7               6.74               2.40               24.3
3                    0.42                     2.5               14.9               5.99               2.17               22.0
4                    0.28                     1.5                8.9               9.17               2.63               22.0
5                    0.32                     2.7                7.1               7.11               2.40               25.5
6                    0.45                     2.0               17.4               7.11               2.63               24.3
  bead_type      Bead.Type.Name valid..bead_width valid..yolk_width               Validity CH1..level CH3..level validJT
1         1 Analyte_1_Analyte_2                 1               NaN valid::bead_background    invalid    invalid invalid
2         1 Analyte_1_Analyte_2                 1               NaN valid::bead_background    invalid    invalid invalid
3         1 Analyte_1_Analyte_2                 1                 1                  valid   positive        NaN invalid
4         1 Analyte_1_Analyte_2                 1               NaN  relevant::well_circle    invalid    invalid invalid
5         1 Analyte_1_Analyte_2                 1               NaN valid::bead_background    invalid    invalid invalid
6         1 Analyte_1_Analyte_2                 1               NaN valid::bead_background    invalid    invalid invalid

WG1Raw <- ifelse(Channel == "CH1", subset(WG1[,c('class','CH1..contrast','validJT','well')]),
                 ifelse(Channel == "CH2", subset(WG1[,c('class','CH2..contrast','validJT','well')]),
                        ifelse(Channel == "CH3", subset(WG1[,c('class','CH3..contrast','validJT','well')]),subset(WG1[,c('class','CH4..contrast','validJT','well')]))))


This returns a list rather than the df that I want.

Thanks in advance


Solution

  • # mockup data:
    req_channel <- "CH1"
    
    WG1 <- data.frame(class = "WellA1_col1_row1",
                      cycle = 1,
                      id = 1:6,
                      x = runif(6),
                      y= runif(6),
                      CH1..yolk_sd = runif(6),
                      CH1..yolk_sat = runif(6),
                      CH1..yolk_mean_hist = runif(6),
                      CH2..yolk_sd = runif(6),
                      CH2..yolk_sat = runif(6),
                      CH2..yolk_mean_hist = runif(6),
                      CH3..yolk_sd = runif(6),
                      CH3..yolk_sat = runif(6),
                      CH3..yolk_mean_hist = runif(6),
                      CH4..yolk_sd = runif(6),
                      CH4..yolk_sat = runif(6),
                      CH4..yolk_mean_hist = runif(6)
                      )
    

    IMO the tidiest (or tidyest) way to approach this is to pivot longer, splitting the channel names out into their own column, then pivot the measurement types back to wide, leaving the channel column behind:

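    library(tidyr)  # pivot_longer(), pivot_wider()
    library(dplyr)  # filter()

    WG1 |>
      # split e.g. "CH1..yolk_sd" into Channel = "CH1" and Type = "yolk_sd"
      pivot_longer(cols = starts_with("CH"),
                   names_to = c("Channel", "Type"),
                   names_sep = "\\.+") |>
      # one column per measurement type, one row per observation and channel
      pivot_wider(names_from = Type, values_from = value) |>
      filter(Channel %in% req_channel)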
    

    gives:

    # A tibble: 6 × 9
      class            cycle    id     x     y Channel yolk_sd yolk_sat yolk_mean_hist
      <chr>            <dbl> <int> <dbl> <dbl> <chr>     <dbl>    <dbl>          <dbl>
    1 WellA1_col1_row1     1     1 0.399 0.139 CH1       0.855  0.576            0.347
    2 WellA1_col1_row1     1     2 0.934 0.152 CH1       0.140  0.497            0.718
    3 WellA1_col1_row1     1     3 0.421 0.582 CH1       0.381  0.00176          0.875
    4 WellA1_col1_row1     1     4 0.704 0.401 CH1       0.416  0.0469           0.376
    5 WellA1_col1_row1     1     5 0.230 0.373 CH1       0.287  0.804            0.274
    6 WellA1_col1_row1     1     6 0.880 0.320 CH1       0.363  0.941            0.955
    

    Why didn't your code work?

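    From the help for subset(): "This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [,"

    Also, ifelse() returns "A vector of the same length and attributes (including dimensions and "class") as test". Here the test (Channel == "CH1") has length 1, so ifelse() keeps only the first element of the data frame you pass as the yes value, and because a data frame is a list of columns, what comes back is a list of length 1 rather than a data frame.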
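To see the difference concretely, here is a minimal sketch in base R. The data frame `df` and its values are made up for illustration and only mimic the column naming from the question; `res_ifelse` and `res_if` are hypothetical names.

```r
# Small stand-in data frame (hypothetical values, not the real WG1)
df <- data.frame(class = c("a", "b", "c"),
                 CH1..contrast = c(17.1, 16.7, 14.9))
Channel <- "CH1"

# The test has length 1, so ifelse() keeps only the first element of `yes`.
# A data frame is a list of columns, so the result is a length-1 list.
res_ifelse <- ifelse(Channel == "CH1", df[, c("class", "CH1..contrast")], NA)
is.data.frame(res_ifelse)  # FALSE
length(res_ifelse)         # 1

# if / else evaluates a single branch and returns it unchanged
res_if <- if (Channel == "CH1") df[, c("class", "CH1..contrast")] else NULL
is.data.frame(res_if)      # TRUE
```

In short, ifelse() is meant for element-wise choices over a vector, while if / else picks one whole object.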

    It works with if ... else and suitable subsetting e.g. `[`:

    if (req_channel == "CH1") {
      `[`(WG1,c("class","CH1..yolk_sd"))
    } else {
      if (req_channel == "CH2") {
        `[`(WG1,c("class","CH2..yolk_sd"))
      } else {
        if (req_channel == "CH3") {
          `[`(WG1,c("class","CH3..yolk_sd"))
        } else {
          `[`(WG1,c("class","CH4..yolk_sd"))
        }
      }
    }
    
                 class CH1..yolk_sd
    1 WellA1_col1_row1    0.8551552
    2 WellA1_col1_row1    0.1395101
    3 WellA1_col1_row1    0.3811618
    4 WellA1_col1_row1    0.4162325
    5 WellA1_col1_row1    0.2867033
    6 WellA1_col1_row1    0.3631422
    

    Note: `[`(WG1, ...) is more normally written WG1[, ...]:

    if (req_channel == "CH1") {
      WG1[, c("class", "CH1..yolk_sd")]
    } else if (req_channel == "CH2") {
      WG1[, c("class", "CH2..yolk_sd")]
    } else if (req_channel == "CH3") {
      WG1[, c("class", "CH3..yolk_sd")]
    } else {
      WG1[, c("class", "CH4..yolk_sd")]
    }
    

    You CAN use the ifelse(,,ifelse(,,ifelse(...))) construction if you put it inside the subset:

    WG1[,c("class", ifelse(req_channel == "CH1", "CH1..yolk_sd",
                           ifelse(req_channel == "CH2", 'CH2..yolk_sd',
                                  ifelse(req_channel == "CH3", 'CH3..yolk_sd',
                                         'CH4..yolk_sd'))))]
                 class CH1..yolk_sd
    1 WellA1_col1_row1    0.8551552
    2 WellA1_col1_row1    0.1395101
    3 WellA1_col1_row1    0.3811618
    4 WellA1_col1_row1    0.4162325
    5 WellA1_col1_row1    0.2867033
    6 WellA1_col1_row1    0.3631422
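Since the column names follow a fixed pattern, the branching can arguably be dropped altogether by building the column name from the channel. The following is a sketch under that assumption, using the `<channel>..contrast` columns from the question; the mock values and the name `contrast_col` are made up for illustration.

```r
# Mock data with the same naming pattern as the question (values are made up)
WG1 <- data.frame(class = "wellA1_col1_row1",
                  well = "A1",
                  validJT = "invalid",
                  CH1..contrast = c(17.1, 16.7, 14.9),
                  CH2..contrast = c(7.49, 6.74, 5.99),
                  CH3..contrast = c(2.52, 2.40, 2.17),
                  CH4..contrast = c(20.8, 24.3, 22.0))

req_channel <- "CH1"

# Fail early on a typo instead of silently falling through to CH4
stopifnot(req_channel %in% paste0("CH", 1:4))

# Build the column name from the channel, then subset with `[`
contrast_col <- paste0(req_channel, "..contrast")
WG1Raw <- WG1[, c("class", contrast_col, "validJT", "well")]
```

With this, changing `req_channel` at the top of the script is the only edit needed, and the result stays a data frame.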