Collapsing of factors


Collapsing of factor levels with forcats::fct_collapse leads to unexpected result

The following code is a modified version of the example from the fct_collapse documentation:

library(forcats)
partyid2 <- fct_collapse(gss_cat$partyid,
                         missing = c("No answer"),
                         other = "Other party",
                         rep = c("Strong republican", "Not str republican"),
                         ind = c("Ind,near rep", "Independent", "Ind,near dem"),
                         dem = c("Not str democrat", "Strong democrat"),
                         group_other = TRUE
)
table(gss_cat$partyid, partyid2)

Why, for example, does the level 'Strong democrat' end up in the level 'Other'?

Thank you very much for any hint about what I'm doing wrong.

partyid2
                     missing other  rep  ind  dem Other
  No answer              154     0    0    0    0     0
  Don't know               0     1    0    0    0     0
  Other party              0     0  393    0    0     0
  Strong republican        0     0 2314    0    0     0
  Not str republican       0     0    0 3032    0     0
  Ind,near rep             0     0    0 1791    0     0
  Independent              0     0    0 4119    0     0
  Ind,near dem             0     0    0    0 2499     0
  Not str democrat         0     0    0    0 3690     0
  Strong democrat          0     0    0    0    0  3490

Solution

  • The code in the example is not correct: it changes the order of the levels. To keep them in the same order, collapse the level labels and then index the result by the original factor:

    # Collapse the level labels themselves, then index by the original factor
    # so that each observation picks up the label of its own level.
    partyid2 <- fct_collapse(levels(gss_cat$partyid),
                             missing = c("No answer"),
                             other = "Other party",
                             rep = c("Strong republican", "Not str republican"),
                             ind = c("Ind,near rep", "Independent", "Ind,near dem"),
                             dem = c("Not str democrat", "Strong democrat"),
                             group_other = TRUE
    )[gss_cat$partyid]
    table(gss_cat$partyid, partyid2)
    #              partyid2
    #                     missing other  rep  ind  dem Other
    #  No answer                0     0    0  154    0     0
    #  Don't know               1     0    0    0    0     0
    #  Other party              0     0    0    0  393     0
    #  Strong republican        0     0    0    0    0  2314
    #  Not str republican       0     0    0 3032    0     0
    #  Ind,near rep             0     0 1791    0    0     0
    #  Independent              0     0 4119    0    0     0
    #  Ind,near dem             0  2499    0    0    0     0
    #  Not str democrat         0     0    0 3690    0     0
    #  Strong democrat          0     0    0    0 3490     0
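
As a cross-check that does not rely on how group_other behaves, the same collapsing can be sketched in base R with an explicit lookup from old levels to new ones. The helper name collapse_levels, its arguments, and the toy vector x are assumptions made up for this illustration; they are not part of forcats, and the toy values are not taken from gss_cat.

```r
# Hypothetical helper (not part of forcats): collapse factor levels through an
# explicit old-level -> new-level lookup; unlisted levels go to `other_level`.
collapse_levels <- function(f, groups, other_level = "Other") {
  f <- as.factor(f)
  old <- levels(f)
  # Start by sending every old level to the catch-all level ...
  lookup <- setNames(rep(other_level, length(old)), old)
  # ... then overwrite the entries for the levels named in `groups`.
  for (new in names(groups)) {
    lookup[intersect(groups[[new]], old)] <- new
  }
  # New levels follow the order of `groups`, with the catch-all level last.
  new_levels <- unique(c(names(groups), other_level))
  factor(unname(lookup[as.character(f)]), levels = new_levels)
}

# Toy data reusing level names from gss_cat$partyid (the values are made up).
x <- factor(c("No answer", "Don't know", "Other party",
              "Strong republican", "Ind,near dem",
              "Strong democrat", "Strong democrat"))

groups <- list(
  missing = "No answer",
  other   = "Other party",
  rep     = c("Strong republican", "Not str republican"),
  ind     = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem     = c("Not str democrat", "Strong democrat")
)

y <- collapse_levels(x, groups)
table(x, y)
```

With the real data, the call would be collapse_levels(gss_cat$partyid, groups), and table(gss_cat$partyid, ...) on the result can then be compared against the fct_collapse output to see which rows were assigned to an unexpected column.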