I have a huge data table in the long format. For each table_id I would subset the data and make a flextable. Using tabulator() is perfect for 90% of my use case but in some of the data, there are no rows which tabulator() requires. Here's a quick example using nonsense dummy data to demonstrate the problem I'm facing:
data_A <- data.frame(
table_id = rep("A", 4),
column_header_one = c("Mammal", "Mammal", "Mammal", "Mammal"),
column_header_two = c("Dog", "Dog", "Cat", "Cat"),
row_header_one = c("row1", "row1", "row1", "row1"),
row_header_two = c("row", "row", "row", "row"),
percentage = runif(4, 0, 100),
standard_error = runif(4, 0, 10)
)
data_B <- data.frame(
table_id = rep("B", 2),
column_header_one = c("Bird", "Bird"),
column_header_two = c("Parrot", "Pigeon"),
row_header_one = NA,
row_header_two = NA,
percentage = runif(2, 0, 100),
standard_error = runif(2, 0, 10)
)
df <- rbind(data_A, data_B)
print(df)
modified_data <- tabulator(
x = data_B,
rows = ???,
columns = c("column_header_one", "column_header_two"),
`Perc` = c("percentage"),
'Std' = c("standard_error")
)
ft <- as_flextable(modified_data)
Goal:
I'm also open to any alternative strategy where I can use a long format data to create a flextable without a row. Thank you.
For now tabulator()
expect a row dimension. So the trick is to create a dummy one and delete the corresponding column later.
I had to modify data_A
. There were 2 rows for dog and two rows for cat, it should be 1 row so that only one row is presented.
library(flextable)
data_A <- data.frame(
table_id = rep("A", 2),
column_header_one = c("Mammal", "Mammal"),
column_header_two = c("Dog", "Cat"),
percentage = runif(2, 0, 100),
standard_error = runif(2, 0, 10)
)
data_B <- data.frame(
table_id = rep("B", 2),
column_header_one = c("Bird", "Bird"),
column_header_two = c("Parrot", "Pigeon"),
percentage = runif(2, 0, 100),
standard_error = runif(2, 0, 10)
)
df <- rbind(data_A, data_B)
let's create a dummy row that will be deleted later
df$dymmy_row <- "dummy"
rows and columns counts must all be = 1, you can use count() to check that.
dplyr::count(df, dymmy_row, column_header_one, column_header_two)
It is now possible to use tabulator()
.
modified_data <- tabulator(
x = df,
rows = "dymmy_row",
columns = c("column_header_one", "column_header_two"),
`Perc` = as_paragraph(percentage),
'Std' = as_paragraph(standard_error)
)
sep_w
is set to 0 as we need that to avoid breaks with merge_h()
that will be used later.
ft <- as_flextable(
modified_data,
sep_w = 0 #
)
ft
Now the final table:
delete_columns(ft, 1) |> # delete the dummy row
merge_h(part = "header") # redo that span instructions that are deleted by `delete_columns()`