r missing-data

R missing values: two data sets share the same ID, CUSIP, and date; one is complete and the other has an NA. How can I fill the NA from the complete data set?


There are two data sets, A and B, as shown below:

A

  id  CUSIP  name  day
  01  00256  ALEX  20170101
  02  00259  BEAR  20170101
  03  00258  CAT   20170101

B

  id  CUSIP  name  day
  01  00256  NA    20170101
  06  00259  BEAR  20170106
  09  00258  CAT   20170109

Data set B has an NA in the name column, but for the same CUSIP the name column in data set A is not NA: it is ALEX.

How can I use the data in A to fill in B for the same CUSIP, so that the result looks like this:

B

  id  CUSIP  name  day
  01  00256  ALEX  20170101
  06  00259  BEAR  20170106
  09  00258  CAT   20170109

Solution

  • Given the information above:

     # Numeric literals drop leading zeros, so CUSIP 00256 is stored as 256
     A <- data.frame(id = c(01, 02, 03), CUSIP = c(00256, 00259, 00258),
                     name = c("ALEX", "BEAR", "CAT"),
                     day = c("2017-01-01", "2017-01-01", "2017-01-01"),
                     stringsAsFactors = FALSE)

     B <- data.frame(id = c(01, 06, 09), CUSIP = c(00256, 00259, 00258),
                     name = c(NA, "BEAR", "CAT"),
                     day = c("2017-01-01", "2017-01-06", "2017-01-09"),
                     stringsAsFactors = FALSE)
    

    To fill B using A (coalesce() pairs values by position, so this assumes A and B list the CUSIPs in the same row order):

    dplyr::coalesce(B,A)
      id CUSIP name        day
    1  1   256 ALEX 2017-01-01
    2  6   259 BEAR 2017-01-06
    3  9   258  CAT 2017-01-09
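
Because coalesce() works position by position, it does not actually look up the CUSIP. As far as I know, newer dplyr releases (1.1.0 and later) also treat a data frame row as missing only when every column in it is NA, so the call above may leave the NA in place there. A minimal base R sketch that matches on CUSIP instead is shown below. The helper name fill_by_key() is hypothetical, and storing id and CUSIP as character (to keep the leading zeros) is an assumption that is not part of the original answer.

```r
# Assumption: id and CUSIP are identifiers, so keep them as character
# to preserve the leading zeros.
A <- data.frame(
  id    = c("01", "02", "03"),
  CUSIP = c("00256", "00259", "00258"),
  name  = c("ALEX", "BEAR", "CAT"),
  day   = c("2017-01-01", "2017-01-01", "2017-01-01"),
  stringsAsFactors = FALSE
)

B <- data.frame(
  id    = c("01", "06", "09"),
  CUSIP = c("00256", "00259", "00258"),
  name  = c(NA, "BEAR", "CAT"),
  day   = c("2017-01-01", "2017-01-06", "2017-01-09"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill NA values of `col` in `target` with the value
# from the row of `lookup` that has the same `key`.
fill_by_key <- function(target, lookup, key, col) {
  # For each row of target, the index of the first matching row in lookup
  # (NA when the key does not occur in lookup).
  idx <- match(target[[key]], lookup[[key]])
  missing <- is.na(target[[col]])
  # Only the NA cells are overwritten; keys with no match stay NA.
  target[[col]][missing] <- lookup[[col]][idx[missing]]
  target
}

B_filled <- fill_by_key(B, A, key = "CUSIP", col = "name")
print(B_filled)
```

Only the NA cells in name are touched, so the id and day values of B are kept, and the row order of A no longer matters. If a CUSIP appears more than once in A, match() uses its first occurrence.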