
Count date mismatches between two columns


I have two date-of-birth columns in my data frame for each person in my dataset. I want R to count the rows where the values in the two columns differ, that is, the number of people whose two recorded birth dates are not the same.

I tried to write a loop that builds a vector (its length is the number of people in the dataset) in which 1 marks unequal dates of birth.

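x <- rep(0, 4092)
# seq_along(x) visits every index; `i in length(x)` only visits the last one
for (i in seq_along(x)) {
  # note: if() stops with an error when either date is NA
  if (mydata$datebirth1[i] == mydata$datebirth2[i]) {
    x[i] <- 0
  } else {
    x[i] <- 1
  }
}
x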

Note that both columns contain some NA values.


Solution

  • First, create a new column of 1s and 0s, where 1 means the two dates don't match (a and b stand for your two date columns). The value is NA for any row where either date is missing.

    df <- transform(df, c = ifelse(a == b, 0, 1))
    

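    Then sum the new column. Since the data contains NA values, pass na.rm = TRUE so that rows with a missing date are skipped instead of turning the whole sum into NA:

    sum(df$c, na.rm = TRUE)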
    

    For the future, please consider providing the code of the solutions you've tried so far.
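
The same count can also be obtained without a helper column or a loop. The following is a minimal sketch, assuming both columns are Date vectors named datebirth1 and datebirth2 as in the question; the sample data is made up for illustration, and whether a row with one missing date should count as a mismatch is a choice you have to make for your own data.

```r
# Hypothetical sample data; only the column names come from the question.
mydata <- data.frame(
  datebirth1 = as.Date(c("1990-01-01", "1985-06-15", NA, "2000-12-31", NA)),
  datebirth2 = as.Date(c("1990-01-01", "1985-06-16", "1970-03-03", "2000-12-31", NA))
)

# Element-wise comparison; the result is NA wherever either date is missing.
differs <- mydata$datebirth1 != mydata$datebirth2

# Mismatches among rows where both dates are present.
n_mismatch <- sum(differs, na.rm = TRUE)

# Rows that could not be compared because at least one date is missing.
n_unknown <- sum(is.na(differs))

# Optional variant (an assumption about the desired rule): also count rows
# where exactly one of the two dates is missing as a mismatch.
one_missing <- xor(is.na(mydata$datebirth1), is.na(mydata$datebirth2))
n_mismatch_strict <- sum(differs | one_missing, na.rm = TRUE)
```

Reporting n_unknown next to n_mismatch makes it clear how many people were left out of the comparison because of missing values.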