r matching ranking

Ranked preference matching in R


I'm not sure how to best describe my problem, but I am working on a scheduling project. I have a data frame containing the professors, courses, and time slots as well as several columns each containing a randomly generated number. I'd like to use these random numbers to generate different schedule options.

This is what I have:

  Prof     Courses   Time      Option_1  Option_2
  John     Course A   Time 1   0.7765824 0.3102492
  John     Course A   Time 2   0.5636233 0.4839778
  John     Course B   Time 1   0.5814365 0.7282360
  John     Course B   Time 2   0.2623851 0.5198096

And, this is what I want:

  Prof     Courses   Time      Option_1  Option_2
  John     Course A   Time 1   1         0
  John     Course A   Time 2   0         1
  John     Course B   Time 1   0         1
  John     Course B   Time 2   1         0

For Option 1, 0.7765824 is the highest number, so it is changed to a 1, meaning that the course will be taught in that time slot. The next highest number for a course not yet scheduled and a time slot not yet filled is 0.2623851, so it is also changed to a 1. All other values become 0.

For Option 2, 0.7282360 is the highest number, so it changes to 1. Then 0.4839778 changes to 1, since it is the highest number for a course not yet scheduled and a time slot not yet filled.

The real data involves a couple hundred professors teaching varying numbers of courses and hundreds of options, so the solution needs to work with the group_by() function (or something similar) and be flexible enough to handle professors teaching varying numbers of courses.

Any ideas?


Solution

  • This loop should take care of it on a per-option basis, although the df is returned in a new order.

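    # Sort so the highest Option_1 score comes first
    df <- df[order(df$Option_1, decreasing = TRUE), ]
    coursesdone <- character(0)
    timedone <- character(0)
    for (i in seq_along(unique(df$Courses))) {
        # Rows whose course is not yet scheduled and whose time slot is still free
        # (%in% avoids the recycling that != does against a multi-element vector)
        available <- !(df$Courses %in% coursesdone) & !(df$Time %in% timedone)
        if (!any(available)) break  # no free slot left for the remaining courses
        pick <- which(available)[1]
        df$Option_1[pick] <- 1
        df$Option_1[df$Courses == df$Courses[pick] & df$Time != df$Time[pick]] <- 0
        coursesdone[i] <- as.character(df$Courses[pick])
        timedone[i] <- as.character(df$Time[pick])
    }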
    

    The loop can be repeated for however many options you have:

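    # Sort so the highest Option_2 score comes first
    df <- df[order(df$Option_2, decreasing = TRUE), ]
    coursesdone <- character(0)
    timedone <- character(0)
    for (i in seq_along(unique(df$Courses))) {
        # Rows whose course is not yet scheduled and whose time slot is still free
        # (%in% avoids the recycling that != does against a multi-element vector)
        available <- !(df$Courses %in% coursesdone) & !(df$Time %in% timedone)
        if (!any(available)) break  # no free slot left for the remaining courses
        pick <- which(available)[1]
        df$Option_2[pick] <- 1
        df$Option_2[df$Courses == df$Courses[pick] & df$Time != df$Time[pick]] <- 0
        coursesdone[i] <- as.character(df$Courses[pick])
        timedone[i] <- as.character(df$Time[pick])
    }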
    

    The final output, once both loops have been executed:

    > df 
      Prof Courses  Time Option_1 Option_2
    3 John CourseB Time1        0        1
    4 John CourseB Time2        1        0
    2 John CourseA Time2        0        1
    1 John CourseA Time1        1        0
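
Since the question mentions hundreds of options and many professors, copying the loop once per column may get unwieldy. Below is a minimal base-R sketch of one way to fold the same greedy rule into a function and apply it to every option column within each professor. The names `greedy_pick` and `schedule_options` are made up for this illustration, and the sketch assumes the option columns all start with `Option_` and that ties between scores can be broken by row order.

```r
# Greedy pick for one professor and one score column (hypothetical helper):
# walk the rows from highest to lowest score and keep a row only if its
# course is not yet scheduled and its time slot is still free.
greedy_pick <- function(courses, times, scores) {
  courses <- as.character(courses)
  times <- as.character(times)
  picked <- integer(length(scores))   # 0/1 result, all zeros to start
  courses_done <- character(0)
  times_done <- character(0)
  for (idx in order(scores, decreasing = TRUE)) {
    if (!(courses[idx] %in% courses_done) && !(times[idx] %in% times_done)) {
      picked[idx] <- 1L
      courses_done <- c(courses_done, courses[idx])
      times_done <- c(times_done, times[idx])
    }
  }
  picked
}

# Apply greedy_pick to every option column, separately for each professor.
# Assumption: option columns are the ones whose names start with "Option_".
schedule_options <- function(df,
                             option_cols = grep("^Option_", names(df), value = TRUE)) {
  for (col in option_cols) {
    for (rows in split(seq_len(nrow(df)), df$Prof)) {
      df[[col]][rows] <- greedy_pick(df$Courses[rows], df$Time[rows], df[[col]][rows])
    }
  }
  df
}

# The example data from the question
df <- data.frame(
  Prof = "John",
  Courses = c("Course A", "Course A", "Course B", "Course B"),
  Time = c("Time 1", "Time 2", "Time 1", "Time 2"),
  Option_1 = c(0.7765824, 0.5636233, 0.5814365, 0.2623851),
  Option_2 = c(0.3102492, 0.4839778, 0.7282360, 0.5198096)
)

result <- schedule_options(df)
print(result)
```

Unlike the loops above, this version keeps the original row order, because it writes the 0/1 values back by index instead of sorting the data frame. If dplyr is preferred, `greedy_pick` should also work inside `group_by(Prof)` followed by `mutate(across(starts_with("Option_"), ~ greedy_pick(Courses, Time, .x)))`, though that variant is untested here.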