Update column by prompting user input R


I have a dataframe in which I need to sift through each row manually and determine if the columns I've matched using the RecordLinkage package are indeed a match. Some of the records have a high probability of being a match when they aren't simply due to spurious association. I'd like to quickly identify these without exporting my data to a csv and scrolling through them case by case. What I'd like instead is to iterate through each row of the data, and for each row prompt the user (me) with a question "is this a match (y/n)?", where the answer ('yes' or 'no') gets input into a column for that row.

This code reproduces a small example of the data:

id = c(1, 2, 3, 4)
loc1 = c("21ST AVE", "5TH ST", "HICKMAN ST", "GULF DR")
loc2 = c("21ST AVE BEACH ST", "5 EAST HARPER BLVD", "28 HARLEY ST", "1000 GULF DR")
day1 = c(12, 13, 14, 15)
day2 = c(12, 13, 14, 15)
time1 = c("20:52", "12:52", "15:35", "14:45")
time2 = c("20:52", "18:29", "03:55", "15:01")
df = data.frame(id, loc1, loc2, day1, day2, time1, time2)

Providing this result,

id  loc1        loc2                day1    day2    time1   time2
1   21ST AVE    21ST AVE BEACH ST   12      12      20:52   20:52
2   5TH ST      5 EAST HARPER BLVD  13      13      12:52   18:29
3   HICKMAN ST  28 HARLEY ST        14      14      15:35   03:55
4   GULF DR     1000 GULF DR        15      15      14:45   15:01

What I'd like is for a prompt to ask

Is this a match (y/n)?
----------------------
id  loc1        loc2                day1    day2    time1   time2
1   21ST AVE    21ST AVE BEACH ST   12      12      20:52   20:52

Whereby answering yes or no on each row would give the following result,

id  loc1        loc2                day1    day2    time1   time2    match
1   21ST AVE    21ST AVE BEACH ST   12      12      20:52   20:52    y
2   5TH ST      5 EAST HARPER BLVD  13      13      12:52   18:29    n
3   HICKMAN ST  28 HARLEY ST        14      14      15:35   03:55    n
4   GULF DR     1000 GULF DR        15      15      14:45   15:01    y

I'm not even sure if this is a) possible, b) feasible, or c) the best way to go about it. Open to thoughts/suggestions. Thanks.


Solution

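  • First, define a function that prints each row, prompts for an answer, and stores it:

    # Requires an interactive R session; readline() returns "" otherwise.
    checkRow <- function(df) {
      match <- character(nrow(df))     # one answer per row
      for (i in seq_len(nrow(df))) {   # seq_len() is safe for zero-row data frames
        print(df[i, ])
        match[i] <- readline("Is this a match? (y or n) ")
      }
      cbind(df, match)
    }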
    

    Then call it as such:

    checked<-checkRow(df)
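
The function above accepts whatever is typed, so a typo ends up in the match column. A possible variation, sketched here under a few assumptions, re-prompts until the answer is y or n. The name `checkRowStrict` and the `ask` parameter are hypothetical and not part of the original answer; `ask` defaults to `readline` and exists only so the prompt can be replaced by a scripted function.

```r
# Hypothetical variant of checkRow(): keeps asking until the answer is "y" or "n".
# `ask` is an assumed parameter (defaults to readline) so the prompt can be swapped out.
checkRowStrict <- function(df, ask = readline) {
  # readline() returns "" immediately outside an interactive session,
  # which would make the repeat loop below spin forever.
  if (identical(ask, readline) && !interactive()) {
    stop("checkRowStrict() needs an interactive session when using readline()")
  }
  match <- character(nrow(df))            # one answer per row
  for (i in seq_len(nrow(df))) {
    print(df[i, ])
    repeat {
      # Normalise the answer: strip whitespace, ignore case
      ans <- tolower(trimws(ask("Is this a match (y/n)? ")))
      if (ans %in% c("y", "n")) break
      message("Please answer y or n.")
    }
    match[i] <- ans
  }
  df$match <- match                       # append the answers as a new column
  df
}
```

In an interactive session it is called the same way as before, for example `checked <- checkRowStrict(df)`.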