Search code examples
rdata-cleaning

How do I remove orphaned partner columns from a dataset?


R and coding noob here so I'm probably approaching this in entirely the wrong way. Appreciate any help or guidance. I have a report that spits out sets of columns for various metrics, during data cleanup some of these columns are deleted (all NAs, all 0s, static values, etc.) but not always in their original pairs. To make the data easier to work with I'd like to delete the orphaned partner columns but I'm not sure how to go about it. They do have a predictable naming scheme if that helps.

Reproducible dataset:

df2 <- mtcars
names(df2) <- paste0(colnames(mtcars), ".partner")
dataset <- cbind(mtcars, df2)
num <- c(2, 4, 12, 16, 19, 22)
dataset <- dataset[, -num]
head(dataset)

In my case the paired columns are adjacent but I don't think that matters.

Expected outcome: Ability to delete any column that doesn't have a pair. In this example that would be (orphaned columns in bold):

"mpg", "mpg.partner", "drat", "wt", "wt.partner", "vs", "am", "am.partner", "gear", "gear.partner", "carb", "carb.partner", "cyl.partner", "disp.partner", "hp.partner"


Solution

  • This may work, though your expected output is not clear

    library(tidyverse)
    
    df2 <- mtcars
    names(df2) <- paste0(colnames(mtcars), ".partner")
    dataset <- cbind(mtcars, df2)
    num <- c(2, 4, 12, 16, 19, 22)
    dataset <- dataset[, -num]
    
    colnames(dataset) %>% 
      {intersect(str_remove(.[str_detect(., ".partner")], ".partner"),
                 .[str_detect(., ".partner", negate = TRUE)])
                 } %>% 
      {dataset[c(., paste0(., ".partner"))]}
    #>                      disp    wt  qsec am gear disp.partner wt.partner
    #> Mazda RX4           160.0 2.620 16.46  1    4        160.0      2.620
    #> Mazda RX4 Wag       160.0 2.875 17.02  1    4        160.0      2.875
    #> Datsun 710          108.0 2.320 18.61  1    4        108.0      2.320
    #> Hornet 4 Drive      258.0 3.215 19.44  0    3        258.0      3.215
    #> Hornet Sportabout   360.0 3.440 17.02  0    3        360.0      3.440
    #> Valiant             225.0 3.460 20.22  0    3        225.0      3.460
    #> Duster 360          360.0 3.570 15.84  0    3        360.0      3.570
    #> Merc 240D           146.7 3.190 20.00  0    4        146.7      3.190
    #> Merc 230            140.8 3.150 22.90  0    4        140.8      3.150
    #> Merc 280            167.6 3.440 18.30  0    4        167.6      3.440
    #> Merc 280C           167.6 3.440 18.90  0    4        167.6      3.440
    #> Merc 450SE          275.8 4.070 17.40  0    3        275.8      4.070
    #> Merc 450SL          275.8 3.730 17.60  0    3        275.8      3.730
    #> Merc 450SLC         275.8 3.780 18.00  0    3        275.8      3.780
    #> Cadillac Fleetwood  472.0 5.250 17.98  0    3        472.0      5.250
    #> Lincoln Continental 460.0 5.424 17.82  0    3        460.0      5.424
    #> Chrysler Imperial   440.0 5.345 17.42  0    3        440.0      5.345
    #> Fiat 128             78.7 2.200 19.47  1    4         78.7      2.200
    #> Honda Civic          75.7 1.615 18.52  1    4         75.7      1.615
    #> Toyota Corolla       71.1 1.835 19.90  1    4         71.1      1.835
    #> Toyota Corona       120.1 2.465 20.01  0    3        120.1      2.465
    #> Dodge Challenger    318.0 3.520 16.87  0    3        318.0      3.520
    #> AMC Javelin         304.0 3.435 17.30  0    3        304.0      3.435
    #> Camaro Z28          350.0 3.840 15.41  0    3        350.0      3.840
    #> Pontiac Firebird    400.0 3.845 17.05  0    3        400.0      3.845
    #> Fiat X1-9            79.0 1.935 18.90  1    4         79.0      1.935
    #> Porsche 914-2       120.3 2.140 16.70  1    5        120.3      2.140
    #> Lotus Europa         95.1 1.513 16.90  1    5         95.1      1.513
    #> Ford Pantera L      351.0 3.170 14.50  1    5        351.0      3.170
    #> Ferrari Dino        145.0 2.770 15.50  1    5        145.0      2.770
    #> Maserati Bora       301.0 3.570 14.60  1    5        301.0      3.570
    #> Volvo 142E          121.0 2.780 18.60  1    4        121.0      2.780
    #>                     qsec.partner am.partner gear.partner
    #> Mazda RX4                  16.46          1            4
    #> Mazda RX4 Wag              17.02          1            4
    #> Datsun 710                 18.61          1            4
    #> Hornet 4 Drive             19.44          0            3
    #> Hornet Sportabout          17.02          0            3
    #> Valiant                    20.22          0            3
    #> Duster 360                 15.84          0            3
    #> Merc 240D                  20.00          0            4
    #> Merc 230                   22.90          0            4
    #> Merc 280                   18.30          0            4
    #> Merc 280C                  18.90          0            4
    #> Merc 450SE                 17.40          0            3
    #> Merc 450SL                 17.60          0            3
    #> Merc 450SLC                18.00          0            3
    #> Cadillac Fleetwood         17.98          0            3
    #> Lincoln Continental        17.82          0            3
    #> Chrysler Imperial          17.42          0            3
    #> Fiat 128                   19.47          1            4
    #> Honda Civic                18.52          1            4
    #> Toyota Corolla             19.90          1            4
    #> Toyota Corona              20.01          0            3
    #> Dodge Challenger           16.87          0            3
    #> AMC Javelin                17.30          0            3
    #> Camaro Z28                 15.41          0            3
    #> Pontiac Firebird           17.05          0            3
    #> Fiat X1-9                  18.90          1            4
    #> Porsche 914-2              16.70          1            5
    #> Lotus Europa               16.90          1            5
    #> Ford Pantera L             14.50          1            5
    #> Ferrari Dino               15.50          1            5
    #> Maserati Bora              14.60          1            5
    #> Volvo 142E                 18.60          1            4
    

    or simply if you those columns in mtcars

    colnames(dataset) %>% 
      {intersect(str_remove(.[str_detect(., ".partner")], ".partner"),
                 .[str_detect(., ".partner", negate = TRUE)])
      } %>% 
      {mtcars[.]}