I have a data.table with a large number of features. I would like to remove the rows that have NA in any of a specific subset of those features, ignoring NAs in the other columns.
Currently I am using the following to handle this:
data.joined.sample <- data.joined.sample %>%
  filter(!is.na(lat)) %>%
  filter(!is.na(long)) %>%
  filter(!is.na(temp)) %>%
  filter(!is.na(year)) %>%
  filter(!is.na(month)) %>%
  filter(!is.na(day)) %>%
  filter(!is.na(hour)) %>%
.......
Is there a more concise way to achieve this?
str(data.joined.sample)
Classes ‘data.table’ and 'data.frame': 336776 obs. of 50 variables:
For a data.frame, we can select those columns and pass them to complete.cases,
which returns a logical vector that is TRUE for every row with no NA in the selected columns,
and use that vector to keep only the complete rows
data.joined.sample[complete.cases(data.joined.sample[colsofinterest]),]
where
colsofinterest <- c("lat", "long", "temp", "year", "month", "day", "hour")
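To make the behaviour concrete, here is a minimal sketch on a plain data.frame. The data frame df, its values, and the extra note column are made up for illustration; only the column names lat, long and temp come from the question. It shows that NAs outside colsofinterest do not cause a row to be dropped.

```r
# Hypothetical toy data: values are invented, column names mirror the question.
df <- data.frame(
  lat  = c(10.5, NA, 12.1, 13.0),
  long = c(1.2, 2.3, NA, 4.5),
  temp = c(20, 21, 22, 23),
  note = c(NA, "a", "b", NA)  # not a column of interest, so its NAs are ignored
)
colsofinterest <- c("lat", "long", "temp")

# complete.cases() is TRUE for rows with no NA in the columns passed to it
keep <- complete.cases(df[colsofinterest])

# Keep only the rows that are complete in the columns of interest
result <- df[keep, ]
print(result)
```

Rows 2 and 3 are removed because lat or long is missing there, while rows 1 and 4 stay even though note is NA.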
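Based on the OP's comments, the object is a data.table, which does not select columns with data.joined.sample[colsofinterest] the way a data.frame does.
In that case, subset the columns of interest with with = FALSE and pass the result to complete.cases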
data.joined.sample[complete.cases(data.joined.sample[, colsofinterest, with = FALSE])]
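As a rough illustration of the data.table case, the sketch below applies the same idea to a small made-up table (dt and its values are hypothetical, and it assumes the data.table package is installed). It also shows na.omit with its cols argument, which data.table provides for its own objects and which may read more concisely; this is offered as an alternative, not as what the answer above prescribes.

```r
library(data.table)

# Hypothetical toy data: values are invented, column names mirror the question.
dt <- data.table(
  lat  = c(10.5, NA, 12.1, 13.0),
  long = c(1.2, 2.3, NA, 4.5),
  temp = c(20, 21, 22, 23),
  note = c(NA, "a", "b", NA)  # not a column of interest, so its NAs are ignored
)
colsofinterest <- c("lat", "long", "temp")

# Approach from the answer: subset columns with with = FALSE, then complete.cases()
res1 <- dt[complete.cases(dt[, colsofinterest, with = FALSE])]

# Alternative: data.table's na.omit method, restricted to the chosen columns
res2 <- na.omit(dt, cols = colsofinterest)

print(res1)
print(res2)
```

Both calls should keep the same rows here; which one to prefer is mostly a matter of taste.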