Search code examples
rdplyrmutate

How to use dplyr mutate function on string values to fill the data in one column


See my data: Input data

I want to fill the column EXPIRATION_DATE as per conditions in STATE_CODE column.

So I am trying this: (Input is a dataframe)

Input %>%mutate(EXPIRATION_DATE=if_else(STATE_CODE=='IN'),'01/01/2012', if_else(STATE_CODE=='IL'),'01/01/2011');

However, this command isn't working.

Do You know why I am not able to use mutate with if_else?

My desired result: Wherever there is IL in STate code, I want to fill 01/01/2011 in Expiration_Date column and IN in STate code, EXPIRATION_DATE= 01/01/2012.

These dates should be entered in text format. So I am using '' for inserting.


Solution

  • You can use if_else() or case_when() functions:

    First, a reproductible example:

    df <- tibble(STATE_CODE = c("IL", "IL", "IN", "IN", "IL", "IN"),
                 EXPIRATION_DATE = c(NA, NA, NA, NA, NA, NA))
    
    # A tibble: 6 × 2
      STATE_CODE EXPIRATION_DATE
      <chr>      <lgl>          
    1 IL         NA             
    2 IL         NA             
    3 IN         NA             
    4 IN         NA             
    5 IL         NA             
    6 IN         NA             
    

    With if_else() you need to specify the condition = and what happen if the condition is true = and if it is false = .

    df %>% 
      mutate(EXPIRATION_DATE = if_else(condition = STATE_CODE == 'IN', 
                                       true = "01/01/2012", 
                                       false = "01/01/2011"))
    
    # A tibble: 6 × 2
      STATE_CODE EXPIRATION_DATE
      <chr>      <chr>          
    1 IL         01/01/2011     
    2 IL         01/01/2011     
    3 IN         01/01/2012     
    4 IN         01/01/2012     
    5 IL         01/01/2011     
    6 IN         01/01/2012     
    

    You can also use case_when(). The pattern is: "CONDITION ~ RESULT".

    df %>% 
      mutate(EXPIRATION_DATE = case_when(STATE_CODE == 'IN' ~ "01/01/2012",
                                         STATE_CODE == "IL" ~ "01/01/2011"))
    
    # A tibble: 6 × 2
      STATE_CODE EXPIRATION_DATE
      <chr>      <chr>          
    1 IL         01/01/2011     
    2 IL         01/01/2011     
    3 IN         01/01/2012     
    4 IN         01/01/2012     
    5 IL         01/01/2011     
    6 IN         01/01/2012