R and coding noob here so I'm probably approaching this in entirely the wrong way. Appreciate any help or guidance. I have a report that spits out sets of columns for various metrics, during data cleanup some of these columns are deleted (all NAs, all 0s, static values, etc.) but not always in their original pairs. To make the data easier to work with I'd like to delete the orphaned partner columns but I'm not sure how to go about it. They do have a predictable naming scheme if that helps.
Reproducible dataset:
df2 <- mtcars
names(df2) <- paste0(colnames(mtcars), ".partner")
dataset <- cbind(mtcars, df2)
num <- c(2, 4, 12, 16, 19, 22)
dataset <- dataset[, -num]
head(dataset)
In my case the paired columns are adjacent but I don't think that matters.
Expected outcome: Ability to delete any column that doesn't have a pair. In this example that would be (orphaned columns in bold):
"mpg", "mpg.partner", "drat", "wt", "wt.partner", "vs", "am", "am.partner", "gear", "gear.partner", "carb", "carb.partner", "cyl.partner", "disp.partner", "hp.partner"
This may work, though your expected output is not clear
library(tidyverse)
df2 <- mtcars
names(df2) <- paste0(colnames(mtcars), ".partner")
dataset <- cbind(mtcars, df2)
num <- c(2, 4, 12, 16, 19, 22)
dataset <- dataset[, -num]
colnames(dataset) %>%
{intersect(str_remove(.[str_detect(., ".partner")], ".partner"),
.[str_detect(., ".partner", negate = TRUE)])
} %>%
{dataset[c(., paste0(., ".partner"))]}
#> disp wt qsec am gear disp.partner wt.partner
#> Mazda RX4 160.0 2.620 16.46 1 4 160.0 2.620
#> Mazda RX4 Wag 160.0 2.875 17.02 1 4 160.0 2.875
#> Datsun 710 108.0 2.320 18.61 1 4 108.0 2.320
#> Hornet 4 Drive 258.0 3.215 19.44 0 3 258.0 3.215
#> Hornet Sportabout 360.0 3.440 17.02 0 3 360.0 3.440
#> Valiant 225.0 3.460 20.22 0 3 225.0 3.460
#> Duster 360 360.0 3.570 15.84 0 3 360.0 3.570
#> Merc 240D 146.7 3.190 20.00 0 4 146.7 3.190
#> Merc 230 140.8 3.150 22.90 0 4 140.8 3.150
#> Merc 280 167.6 3.440 18.30 0 4 167.6 3.440
#> Merc 280C 167.6 3.440 18.90 0 4 167.6 3.440
#> Merc 450SE 275.8 4.070 17.40 0 3 275.8 4.070
#> Merc 450SL 275.8 3.730 17.60 0 3 275.8 3.730
#> Merc 450SLC 275.8 3.780 18.00 0 3 275.8 3.780
#> Cadillac Fleetwood 472.0 5.250 17.98 0 3 472.0 5.250
#> Lincoln Continental 460.0 5.424 17.82 0 3 460.0 5.424
#> Chrysler Imperial 440.0 5.345 17.42 0 3 440.0 5.345
#> Fiat 128 78.7 2.200 19.47 1 4 78.7 2.200
#> Honda Civic 75.7 1.615 18.52 1 4 75.7 1.615
#> Toyota Corolla 71.1 1.835 19.90 1 4 71.1 1.835
#> Toyota Corona 120.1 2.465 20.01 0 3 120.1 2.465
#> Dodge Challenger 318.0 3.520 16.87 0 3 318.0 3.520
#> AMC Javelin 304.0 3.435 17.30 0 3 304.0 3.435
#> Camaro Z28 350.0 3.840 15.41 0 3 350.0 3.840
#> Pontiac Firebird 400.0 3.845 17.05 0 3 400.0 3.845
#> Fiat X1-9 79.0 1.935 18.90 1 4 79.0 1.935
#> Porsche 914-2 120.3 2.140 16.70 1 5 120.3 2.140
#> Lotus Europa 95.1 1.513 16.90 1 5 95.1 1.513
#> Ford Pantera L 351.0 3.170 14.50 1 5 351.0 3.170
#> Ferrari Dino 145.0 2.770 15.50 1 5 145.0 2.770
#> Maserati Bora 301.0 3.570 14.60 1 5 301.0 3.570
#> Volvo 142E 121.0 2.780 18.60 1 4 121.0 2.780
#> qsec.partner am.partner gear.partner
#> Mazda RX4 16.46 1 4
#> Mazda RX4 Wag 17.02 1 4
#> Datsun 710 18.61 1 4
#> Hornet 4 Drive 19.44 0 3
#> Hornet Sportabout 17.02 0 3
#> Valiant 20.22 0 3
#> Duster 360 15.84 0 3
#> Merc 240D 20.00 0 4
#> Merc 230 22.90 0 4
#> Merc 280 18.30 0 4
#> Merc 280C 18.90 0 4
#> Merc 450SE 17.40 0 3
#> Merc 450SL 17.60 0 3
#> Merc 450SLC 18.00 0 3
#> Cadillac Fleetwood 17.98 0 3
#> Lincoln Continental 17.82 0 3
#> Chrysler Imperial 17.42 0 3
#> Fiat 128 19.47 1 4
#> Honda Civic 18.52 1 4
#> Toyota Corolla 19.90 1 4
#> Toyota Corona 20.01 0 3
#> Dodge Challenger 16.87 0 3
#> AMC Javelin 17.30 0 3
#> Camaro Z28 15.41 0 3
#> Pontiac Firebird 17.05 0 3
#> Fiat X1-9 18.90 1 4
#> Porsche 914-2 16.70 1 5
#> Lotus Europa 16.90 1 5
#> Ford Pantera L 14.50 1 5
#> Ferrari Dino 15.50 1 5
#> Maserati Bora 14.60 1 5
#> Volvo 142E 18.60 1 4
or simply if you those columns in mtcars
colnames(dataset) %>%
{intersect(str_remove(.[str_detect(., ".partner")], ".partner"),
.[str_detect(., ".partner", negate = TRUE)])
} %>%
{mtcars[.]}