My data frame has 58,000+ rows and 26 columns. I want to go through all of those rows and delete any row in which more than 5% of the variables are NA.
We can use rowMeans on a logical matrix created with is.na:
df1[rowMeans(is.na(df1)) <= 0.5, , drop = FALSE]
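In the above code, is.na(df1) returns a logical matrix of TRUE (for NA) and FALSE (for non-NA). With rowMeans, we compute the proportion of TRUE values in each row, check whether it is less than or equal to 0.5 to create a logical vector, and subset the rows by using that vector as the row index. Note that 0.5 keeps rows with up to 50% NA values; for the 5% cutoff asked about in the question, the threshold would be 0.05.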
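As a minimal sketch of this approach, here is the same idea on a small made-up data frame. The data values and the names na_share, kept_half, and kept_strict are assumptions for illustration only; they are not from the original post. It shows the 0.5 threshold next to 0.05, which corresponds to a 5% cutoff.

```r
# Toy data frame (hypothetical values): 4 rows, 4 columns
df1 <- data.frame(
  a = c(1, NA, 3, NA),
  b = c("x", NA, "z", NA),
  c = c(TRUE, NA, NA, NA),
  d = c(1.5, 2.5, 3.5, NA)
)

# is.na(df1) is a logical matrix; rowMeans gives the proportion of NA per row
na_share <- rowMeans(is.na(df1))

# Keep rows where at most 50% of the values are NA
kept_half <- df1[na_share <= 0.5, , drop = FALSE]

# Stricter cutoff: keep rows where at most 5% of the values are NA
kept_strict <- df1[na_share <= 0.05, , drop = FALSE]
```

With 26 columns, a single NA in a row is 1/26 (about 3.8%) and two NAs are 2/26 (about 7.7%), so a 0.05 threshold would keep only rows with at most one NA.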