Search code examples
rtidyrtibble

Read human-readable concatenated tables to a tibble


I have some human-readable tab-separated data with the following form:

Table_1
10      20      30
30      40      50
10      20      40
Table_2
20      30      40
10      50      60
30      20      10
30      10      40
Table_3
20      30      50
30      50      40

I would like to preserve the table names as a column, like so:

Table_1 10      20      30
Table_1 30      40      50
Table_1 10      20      40
Table_2 20      30      40
Table_2 10      50      60
Table_2 30      20      10
Table_2 30      10      40
Table_3 20      30      50
Table_3 30      50      40

I could probably hack at this with awk and sed, but I'd prefer to use the tidyr approach.


Solution

  • Create a grouping column based on the occurence of 'Table' substring, create a new column with the first observation from the column grouped by ('grp'), remove the first observation (slice) and convert the type

    library(dplyr)
    library(stringr)
    df1 %>%
         group_by(grp = cumsum(str_detect(V1, 'Table'))) %>% 
          mutate(Table = first(V1)) %>%
          slice(-1) %>%
          ungroup %>%
          type.convert(as.is = TRUE) %>%
          select(Table, everything(), -grp)
    

    -ouptut

    # A tibble: 9 x 4
      Table      V1    V2    V3
      <chr>   <int> <int> <int>
    1 Table_1    10    20    30
    2 Table_1    30    40    50
    3 Table_1    10    20    40
    4 Table_2    20    30    40
    5 Table_2    10    50    60
    6 Table_2    30    20    10
    7 Table_2    30    10    40
    8 Table_3    20    30    50
    9 Table_3    30    50    40
    

    data

    df1 <- read.table('file.txt', header = FALSE, fill = TRUE)
    df1 <- structure(list(V1 = c("Table_1", "10", "30", "10", "Table_2", 
    "20", "10", "30", "30", "Table_3", "20", "30"), V2 = c(NA, 20L, 
    40L, 20L, NA, 30L, 50L, 20L, 10L, NA, 30L, 50L), V3 = c(NA, 30L, 
    50L, 40L, NA, 40L, 60L, 10L, 40L, NA, 50L, 40L)), class = "data.frame", row.names = c(NA, 
    -12L))