Search code examples
rflextable

How can I create a flextable without a row with long format data?


I have a huge data table in the long format. For each table_id I would subset the data and make a flextable. Using tabulator() is perfect for 90% of my use case but in some of the data, there are no rows which tabulator() requires. Here's a quick example using nonsense dummy data to demonstrate the problem I'm facing:

data_A <- data.frame(
  table_id = rep("A", 4),
  column_header_one = c("Mammal", "Mammal", "Mammal", "Mammal"),
  column_header_two = c("Dog", "Dog", "Cat", "Cat"),
  row_header_one = c("row1", "row1", "row1", "row1"),
  row_header_two = c("row", "row", "row", "row"),
  percentage = runif(4, 0, 100),
  standard_error = runif(4, 0, 10)
)

data_B <- data.frame(
  table_id = rep("B", 2),
  column_header_one = c("Bird", "Bird"),
  column_header_two = c("Parrot", "Pigeon"),
  row_header_one = NA,
  row_header_two = NA,
  percentage = runif(2, 0, 100),
  standard_error = runif(2, 0, 10)
)

df <- rbind(data_A, data_B)
print(df)

modified_data <- tabulator(
  x = data_B,
  rows = ???,
  columns = c("column_header_one", "column_header_two"),
  `Perc` = c("percentage"),
  'Std' = c("standard_error")
)

ft <- as_flextable(modified_data)

Goal:

enter image description here

I'm also open to any alternative strategy where I can use a long format data to create a flextable without a row. Thank you.


Solution

  • For now tabulator() expect a row dimension. So the trick is to create a dummy one and delete the corresponding column later.

    I had to modify data_A. There were 2 rows for dog and two rows for cat, it should be 1 row so that only one row is presented.

    library(flextable)
    data_A <- data.frame(
      table_id = rep("A", 2),
      column_header_one = c("Mammal", "Mammal"),
      column_header_two = c("Dog", "Cat"),
      percentage = runif(2, 0, 100),
      standard_error = runif(2, 0, 10)
    )
    
    data_B <- data.frame(
      table_id = rep("B", 2),
      column_header_one = c("Bird", "Bird"),
      column_header_two = c("Parrot", "Pigeon"),
      percentage = runif(2, 0, 100),
      standard_error = runif(2, 0, 10)
    )
    df <- rbind(data_A, data_B)
    

    let's create a dummy row that will be deleted later

    df$dymmy_row <- "dummy"
    

    rows and columns counts must all be = 1, you can use count() to check that.

    dplyr::count(df, dymmy_row, column_header_one, column_header_two)
    

    It is now possible to use tabulator().

    modified_data <- tabulator(
      x = df,
      rows = "dymmy_row",
      columns = c("column_header_one", "column_header_two"),
      `Perc` = as_paragraph(percentage),
      'Std' = as_paragraph(standard_error)
    )
    

    sep_w is set to 0 as we need that to avoid breaks with merge_h() that will be used later.

    ft <- as_flextable(
      modified_data, 
      sep_w = 0 # 
    )
    ft
    

    enter image description here

    Now the final table:

    delete_columns(ft, 1) |> # delete the dummy row
      merge_h(part = "header") # redo that span instructions that are deleted by `delete_columns()`
    
    

    enter image description here