
Split specific column in data frame in R


I am working in R. Here is my data:

df <- data.frame(
  R1 = c("10 EFTA : 0 / BAA/GBR : 0 / ES : 2", "10","0"),
  R2 = c("-", "EFTA : 0 / BAA/GBR : 0 / ES : 2","18")
)

Now I want to split the first column, R1. To start, I want to split the first row with the following command:

df[c('R1', 'R2')] <- str_split_fixed(df$R1, ' ', 2)

This line splits the first row exactly as I needed.

But a problem arises with the other rows: in R2, the value "EFTA : 0 / BAA/GBR : 0 / ES : 2" in the second row and the value 18 in the third row are now missing. Can anybody help me solve this, so that only the first row is split and the existing R2 values in rows 2 and 3 are kept?
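
For reference, here is a minimal sketch of why the other rows lose their R2 values. str_split_fixed() returns one matrix row per input string and pads missing pieces with empty strings, so assigning the whole matrix to both columns overwrites R2 everywhere. The variable name `parts` is only for illustration.

```r
library(stringr)

df <- data.frame(
  R1 = c("10 EFTA : 0 / BAA/GBR : 0 / ES : 2", "10", "0"),
  R2 = c("-", "EFTA : 0 / BAA/GBR : 0 / ES : 2", "18")
)

# One matrix row per element of df$R1, always with exactly 2 columns.
parts <- str_split_fixed(df$R1, " ", 2)

# First column: the text before the first space (or the whole string).
parts[, 1]   # "10", "10", "0"

# Second column: the rest, or "" when R1 had no space to split on.
# These empty strings are what replace the old R2 values in rows 2 and 3.
parts[, 2]   # "EFTA : 0 / BAA/GBR : 0 / ES : 2", "", ""
```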


Solution

  • You can first find matching rows and then only process those matches:

    library(stringr)
    df <- data.frame(
      R1 = c("10 EFTA : 0 / BAA/GBR : 0 / ES : 2", "10","0"),
      R2 = c("-", "EFTA : 0 / BAA/GBR : 0 / ES : 2","18")
    )
    df
    #>                                   R1                              R2
    #> 1 10 EFTA : 0 / BAA/GBR : 0 / ES : 2                               -
    #> 2                                 10 EFTA : 0 / BAA/GBR : 0 / ES : 2
    #> 3                                  0                              18
    
    # logical index marking the rows where R1 contains a space:
    spaces_in_r1 <- str_detect(df$R1, fixed(" "))
    spaces_in_r1
    #> [1]  TRUE FALSE FALSE
    
    df[spaces_in_r1, c('R1', 'R2')] <- str_split_fixed(df$R1[spaces_in_r1], ' ', 2)
    df
    #>   R1                              R2
    #> 1 10 EFTA : 0 / BAA/GBR : 0 / ES : 2
    #> 2 10 EFTA : 0 / BAA/GBR : 0 / ES : 2
    #> 3  0                              18
    

    Created on 2023-10-09 with reprex v2.0.2
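
If you would rather not depend on stringr, the same "find matching rows, then only process those" idea can probably be sketched in base R with grepl() and sub(). This is an alternative under the assumption that the split should happen at the first space only; the name `has_space` is made up for the example, and `stringsAsFactors = FALSE` is set explicitly in case an older R version would otherwise create factors.

```r
df <- data.frame(
  R1 = c("10 EFTA : 0 / BAA/GBR : 0 / ES : 2", "10", "0"),
  R2 = c("-", "EFTA : 0 / BAA/GBR : 0 / ES : 2", "18"),
  stringsAsFactors = FALSE
)

# Logical index: TRUE only for rows where R1 contains a literal space.
has_space <- grepl(" ", df$R1, fixed = TRUE)

# Fill R2 first, while R1 still holds the full string:
# drop everything up to and including the first space.
df$R2[has_space] <- sub("^[^ ]* ", "", df$R1[has_space])

# Then shorten R1 to the part before the first space.
df$R1[has_space] <- sub(" .*$", "", df$R1[has_space])

df$R1   # "10", "10", "0"
df$R2   # "EFTA : 0 / BAA/GBR : 0 / ES : 2" (rows 1 and 2), "18"
```

The order of the two assignments matters here: R2 has to be derived before R1 is truncated, otherwise the text after the space is already gone.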