Subset data to contain only columns whose names match multiple condition using data.table

This is based upon this question. The question said:

Is there a way for me to subset data based on column names starting with a particular string? I have some columns which are like ABC_1 ABC_2 ABC_3 and some like XYZ_1, XYZ_2,XYZ_3 let's say.

How can I subset my df based only on columns containing the above portions of text (lets say, ABC or XYZ)? I can use indices, but the columns are too scattered in data and it becomes too much of hard coding.

Also, I want to only include rows from each of these columns where any of their value is >0 so if either of the 6 columns above has a 1 in the row, it makes a cut into my final data frame.

One of the answers used dplyr to select multiple columns for the new data:

df <- df %>% dplyr:: select(grep("ABC", names(df)), grep("XYZ", names(df)))

However, I'd like a data.table solution similar to the above. I know that (as per one of the answers) that data.table can do one condition, but unsure how to do multiple:

df[, grep("ABC", names(df)), with = FALSE]

Solution

You can select multiple columns that match certain patterns in data.table using patterns in its .SDcols argument:

# turn df into data.table
setDT(df)

# select columns that contain ABD or XYZ
df[, .SD, .SDcols=patterns("ABC|XYZ")]

# or
df[, grep("ABC|XYZ", names(df)), with=FALSE]

subsetting rows and columns at the same time

cols = grep("ABC|XYZ",  names(df))

df[rowSums(df[, ..cols]>0)>0, .SD, .SDcols=cols]