Search code examples
rtidyversetidyrdata-manipulation

How to fill in blanks based off another column's values in R


Here is the code and resulting dataframe

category <- c("East","BLANK","NorthEast","BLANK","BLANK")
subcat <- c("East","North","SW","NE","SE")
data1 <- as.data.frame(category)
data1$subcat <- subcat


**Category**  **Subcat**
East        East            
BLANK       North           
NorthEast   SW          
BLANK       NE          
BLANK       SE

The East category contains subcat East and North. The NorthEast category contains subcat SW,NE,SE. As you can see there are blanks for each category. How would I make so the 2nd value in Category is East and 4th and 5th row is North East? I have many more rows in the actual data so a way to do this would be helpful.

The result should be

**Category**  **Subcat**
East           East         
*East*         North            
NorthEast      SW           
*NorthEast*    NE           
*NorthEast*    SE

Solution

  • We could convert the 'BLANK' to NA and use fill

    library(tidyr)
    library(dplyr)
    data1 %>%
       na_if( "BLANK") %>%
       fill(category)
    

    -output

        category subcat
    1      East   East
    2      East  North
    3 NorthEast     SW
    4 NorthEast     NE
    5 NorthEast     SE