I have some human-readable tab-separated data with the following form:
Table_1
10 20 30
30 40 50
10 20 40
Table_2
20 30 40
10 50 60
30 20 10
30 10 40
Table_3
20 30 50
30 50 40
I would like to preserve the table names as a column, like so:
Table_1 10 20 30
Table_1 30 40 50
Table_1 10 20 40
Table_2 20 30 40
Table_2 10 50 60
Table_2 30 20 10
Table_2 30 10 40
Table_3 20 30 50
Table_3 30 50 40
I could probably hack at this with awk
and sed
, but I'd prefer to use the tidyr
approach.
Create a grouping column based on the occurence of 'Table' substring, create a new column with the first
observation from the column grouped by ('grp'), remove the first observation (slice
) and convert the type
library(dplyr)
library(stringr)
df1 %>%
group_by(grp = cumsum(str_detect(V1, 'Table'))) %>%
mutate(Table = first(V1)) %>%
slice(-1) %>%
ungroup %>%
type.convert(as.is = TRUE) %>%
select(Table, everything(), -grp)
-ouptut
# A tibble: 9 x 4
Table V1 V2 V3
<chr> <int> <int> <int>
1 Table_1 10 20 30
2 Table_1 30 40 50
3 Table_1 10 20 40
4 Table_2 20 30 40
5 Table_2 10 50 60
6 Table_2 30 20 10
7 Table_2 30 10 40
8 Table_3 20 30 50
9 Table_3 30 50 40
df1 <- read.table('file.txt', header = FALSE, fill = TRUE)
df1 <- structure(list(V1 = c("Table_1", "10", "30", "10", "Table_2",
"20", "10", "30", "30", "Table_3", "20", "30"), V2 = c(NA, 20L,
40L, 20L, NA, 30L, 50L, 20L, 10L, NA, 30L, 50L), V3 = c(NA, 30L,
50L, 40L, NA, 40L, 60L, 10L, 40L, NA, 50L, 40L)), class = "data.frame", row.names = c(NA,
-12L))