Computing origin-destination fixed effects with R


I am estimating a gravity equation with several types of fixed effects: origin fixed effects, destination fixed effects, and origin-destination pair fixed effects.

Consider the following example:

library(dplyr)
# tibble() replaces the deprecated data_frame()
mydf <- tibble(orig = rep(LETTERS[1:3], each = 3),
               dest = rep(LETTERS[1:3], times = 3))

Origin and destination fixed effects can be created by converting each column to a factor:

mydf <- mutate(mydf,
               orig_fe = factor(orig),
               dest_fe = factor(dest))

Now I want to do the same for origin-destination pairs, ignoring direction: for instance, the AB combination should take the same value as the BA combination. This variable should also be a factor.

The expected result is the following:

mydf$pair_fe = as.factor(c('AA', 'AB', 'AC', 'AB', 'BB', 'BC', 'AC', 'BC', 'CC'))

mydf

#      orig  dest orig_fe dest_fe pair_fe
#     (chr) (chr)  (fctr)  (fctr)  (fctr)
# 1     A     A       A       A      AA
# 2     A     B       A       B      AB
# 3     A     C       A       C      AC
# 4     B     A       B       A      AB
# 5     B     B       B       B      BB
# 6     B     C       B       C      BC
# 7     C     A       C       A      AC
# 8     C     B       C       B      BC
# 9     C     C       C       C      CC

Solution

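  • We can use pmin and pmax to get the elementwise (row-wise) minimum and maximum of orig and dest, so that each pair is always written in the same order regardless of direction. We then paste the two vectors together and convert the result to a factor.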

    mydf %>%
      mutate(pair_fe = factor(paste0(pmin(orig, dest), pmax(orig, dest))))
    #    orig  dest orig_fe dest_fe pair_fe
    #   (chr) (chr)  (fctr)  (fctr)  (fctr)
    #1     A     A       A       A      AA
    #2     A     B       A       B      AB
    #3     A     C       A       C      AC
    #4     B     A       B       A      AB
    #5     B     B       B       B      BB
    #6     B     C       B       C      BC
    #7     C     A       C       A      AC
    #8     C     B       C       B      BC
    #9     C     C       C       C      CC
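
The same idea can be sketched in base R, with one precaution that the single-letter example does not need. The helper name make_pair_fe and the "_" separator are assumptions made for illustration, not part of the original answer. When codes are longer than one character, paste0 can merge distinct pairs (for example "AB" with "C" and "A" with "BC" both give "ABC"), so this sketch inserts a separator. It also converts its inputs to character first, since pmin and pmax are not meant for unordered factors.

```r
# Hypothetical helper (not from the original answer): builds a
# direction-independent pair label from two vectors of codes.
make_pair_fe <- function(orig, dest, sep = "_") {
  orig <- as.character(orig)  # pmin/pmax are not meant for unordered factors
  dest <- as.character(dest)
  # Put the smaller code first so that A-B and B-A get the same label
  factor(paste(pmin(orig, dest), pmax(orig, dest), sep = sep))
}

mydf <- data.frame(orig = rep(LETTERS[1:3], each = 3),
                   dest = rep(LETTERS[1:3], times = 3),
                   stringsAsFactors = FALSE)
mydf$pair_fe <- make_pair_fe(mydf$orig, mydf$dest)

# Multi-character codes: the separator keeps these two pairs distinct,
# whereas paste0("AB", "C") and paste0("A", "BC") would both give "ABC".
codes <- make_pair_fe(c("AB", "A"), c("C", "BC"))
```

Here mydf$pair_fe has six levels for the nine rows, because each off-diagonal pair appears twice. String comparison in R follows the locale's collation order, but the ordering is applied consistently to both directions of a pair, so the two directions still receive the same label.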