Tags: r, variables, dplyr, case-when

Constructing a new variable in R using case_when


I am trying to construct a new variable, size class, based on employment using the following structure:

     Sizeclass        Employment Range
         1                    1-9
         2                    10-99
         3                    100-499
         4                    500-999

Here is a sample data set:

      acct        Employment
        1             4
        2             12
        3             1
        4             54
        5             234
        6             13
        7             654
        8             101

So far, this is the code that I am trying to use:

            sizeclass %>%
            select(uiacct, naics, employment) %>%
            mutate(sizeclass = case_when (employment >=1 and employment <9 ~ "1",)
            employment >=10 and employment<=99 ~ "2"))

I searched the internet but found very little that combined case_when, mutate, and inequalities. I know that the inequalities are not set up correctly. My primary question is whether this is the correct structure for creating this new variable.

NOTE: all three answers worked. Quite amazing how helpful this site is when it comes to R


Solution

  • Try the following:

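    library(data.table)  # setDT() and the `:=` operator
    library(dplyr)       # case_when()

    df <- data.frame(
      acct = 1:8,
      Employment = c(4, 12, 1, 54, 234, 13, 654, 101)
    )

    # Convert the data.frame to a data.table by reference
    setDT(df)

    # Add Sizeclass by reference; values outside 1-999 become NA
    df[, Sizeclass := case_when(
      Employment >= 1 & Employment <= 9 ~ 1,
      Employment >= 10 & Employment <= 99 ~ 2,
      Employment >= 100 & Employment <= 499 ~ 3,
      Employment >= 500 & Employment <= 999 ~ 4
    )]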
    

    If your dataset is a data.frame, first convert it to a data.table with setDT().
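
    For the mutate/case_when structure asked about in the question, a minimal dplyr-only sketch might look like the following. It assumes lower-case column names (acct, employment) and character labels, as in the original attempt. The main changes from that attempt are using & instead of "and", closing each range with <=, and keeping every condition inside a single case_when() call. The final TRUE branch is an added assumption: it sends any value outside 1-999 to NA explicitly.

```r
library(dplyr)

# Sample data from the question (lower-case column names are an assumption)
df <- data.frame(
  acct = 1:8,
  employment = c(4, 12, 1, 54, 234, 13, 654, 101)
)

df <- df %>%
  mutate(
    # Conditions are checked top to bottom; the first match wins
    sizeclass = case_when(
      employment >= 1   & employment <= 9   ~ "1",
      employment >= 10  & employment <= 99  ~ "2",
      employment >= 100 & employment <= 499 ~ "3",
      employment >= 500 & employment <= 999 ~ "4",
      TRUE ~ NA_character_  # anything outside 1-999
    )
  )

print(df)
```

    All right-hand sides must share one type, which is why the fallback uses NA_character_ rather than a bare NA next to the character labels.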