dropping NA in a dataframe in R


I am working with the lines below:

library(quantmod)
library(tidyr)

start="2022-01-01"
end="2022-11-01"
getSymbols("0386.HK",from=start, to=end)
getSymbols("600028.SS",from=start,to=end)
CPC_H<-`0386.HK`[,6]
CPC_A<-`600028.SS`[,6]
df<-cbind(CPC_A,CPC_H)

There are some NAs in the dataframe in either variable, so I am trying to remove the entire row if either column shows NA.

> head(df)
           X600028.SS.Adjusted X0386.HK.Adjusted
2022-01-03                  NA          3.197658
2022-01-04            3.555848          3.223655
2022-01-05            3.572542          3.327643
2022-01-06            3.564195          3.318978
2022-01-07            3.630971          3.388304
2022-01-10            3.622625          3.344975

I tried to drop the NAs with the following lines, but they returned an error message:

df%>%
  drop_na(CPC_H)

Error in UseMethod("drop_na") : 
  no applicable method for 'drop_na' applied to an object of class "c('xts', 'zoo')"

Can anyone tell me whether there is a better way to do it? Many thanks, and have a good day.


Solution

  • I believe na.omit should do what you want if you apply it to your object and assign the result back to the same name. Note that it will remove every row that contains an NA in any column.

    df<-na.omit(df)

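    If you have more columns with NA values and you need to remove rows with NA only in these two specific columns, you can do it like this (column_1 and column_2 are placeholders for your own column names):

    df<-subset(df,!is.na(column_1) & !is.na(column_2))

    This code filters your data, keeping only the rows where neither of the two columns is NA. Note the & operator: with | a row would be kept as long as at least one of the two columns is not NA.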
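
Since the object in the question is an xts object rather than a data frame, here is a minimal sketch of both approaches on a small hand-built xts object. The values, the short column names CPC_A and CPC_H, and the third column volume are made up for illustration; the real object from getSymbols has column names such as X600028.SS.Adjusted. It assumes the xts package is installed (quantmod loads it).

```r
library(xts)  # also attaches zoo, which provides index() and coredata()

# Toy stand-in for the question's price series (all values are made up).
dates <- as.Date("2022-01-03") + 0:4
df <- xts(
  cbind(
    CPC_A  = c(NA, 3.56, 3.57, 3.56, 3.63),
    CPC_H  = c(3.20, 3.22, NA, 3.32, 3.39),
    volume = c(100, NA, 120, 130, 140)  # hypothetical extra column
  ),
  order.by = dates
)

# 1) Drop every row that has an NA in any column.
all_clean <- na.omit(df)

# 2) Drop a row only when CPC_A or CPC_H is NA;
#    NAs in other columns (here: volume) are left alone.
keep <- complete.cases(df[, c("CPC_A", "CPC_H")])
two_clean <- df[keep, ]

# 3) drop_na() has no method for xts, so convert to a data frame first
#    if you prefer the tidyr route, e.g. tidyr::drop_na(plain, CPC_A, CPC_H).
plain <- data.frame(date = index(df), coredata(df))
```

Indexing with a logical vector is used in step 2 because it does not depend on how subset() treats a particular object class. Step 3 moves the dates into an ordinary date column, so the result no longer behaves as a time series.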