Tags: r, r-markdown, knitr, pandoc

How to parse a text table in rmarkdown code chunks


I have an R Markdown file containing a markdown table that will be updated regularly. The table's content should be parsed in a code chunk so that it can be used with, e.g., ggplot. I don't want to maintain the table in a code chunk or in a separate file.

How can I read the table from within a code chunk?

Below is a starter R Markdown document containing a markdown table.

---
title: "Parse tables"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
```

# Step 1: Create markdown table as text

That table will be manually updated directly in the markdown file.

Table: Project Timeline

| date       | description |
|------------|-------------|
| 2020-05-11 | Milestone 1 |
| 2020-07-11 | Milestone 2 |
| 2020-07-20 | Milestone 3 |


# Step 2: Parse the table above

The table should be maintained as a markdown table. That seems easier than working directly with
`tibble` or `tribble`. How can I read the table from the code chunk?

```{r}
library(tidyverse)
df <- tibble(date = c("2020-05-11", "2020-07-11", "2020-07-20"), 
             description = c("Milestone 1", "Milestone 2", "Milestone 3"))
df
```



Solution

  • In a code chunk, call readLines on your Rmd file to read its lines into a character vector:

    allLines <- readLines("yourFile.Rmd")
    

    Select the lines that start and end with |, then drop the second match, which is the separator line "|-----|-----|" (this assumes the file contains a single pipe table):

    tableLines <- allLines[grep("^\\|.*\\|$", allLines)][-2]
    

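    The code below then turns these lines into a character matrix whose first row contains the column names. Because each line starts with |, splitting on | yields an empty first piece, which pieces[-1] discards: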

    tableAsMatrix <- t(sapply(strsplit(tableLines, "\\|"), function(pieces){
      stringr::str_trim(pieces[-1])
    }))
    

    Finally, convert the matrix without its first row to a data frame, and use that first row as the column names:

    setNames(as.data.frame(tableAsMatrix[-1,,drop = FALSE]), tableAsMatrix[1,])
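
    The four steps above could be wrapped into a single helper. The following is only a sketch under stated assumptions: the function name `parse_pipe_table` and the `sample_lines` vector are hypothetical, base R's `trimws` stands in for `stringr::str_trim`, and the separator line is detected by a pattern instead of by position, which is a variation on the approach above and not part of it.

    ```r
    # Hypothetical helper: turn the lines of ONE markdown pipe table into a data frame.
    parse_pipe_table <- function(lines) {
      # Keep only lines that start and end with "|" (surrounding whitespace is ignored).
      table_lines <- grep("^\\|.*\\|$", trimws(lines), value = TRUE)
      # Drop separator rows, i.e. rows made only of pipes, colons, spaces and dashes.
      table_lines <- table_lines[!grepl("^[|: -]+$", table_lines)]
      # Split on "|"; the first piece is empty because each line starts with "|".
      cells <- lapply(strsplit(table_lines, "|", fixed = TRUE),
                      function(pieces) trimws(pieces[-1]))
      mat <- do.call(rbind, cells)
      # First row holds the column names, the remaining rows hold the data.
      setNames(as.data.frame(mat[-1, , drop = FALSE], stringsAsFactors = FALSE),
               mat[1, ])
    }

    # Sample input standing in for readLines("yourFile.Rmd").
    sample_lines <- c(
      "Table: Project Timeline",
      "",
      "| date       | description |",
      "|------------|-------------|",
      "| 2020-05-11 | Milestone 1 |",
      "| 2020-07-11 | Milestone 2 |",
      "| 2020-07-20 | Milestone 3 |"
    )

    df <- parse_pipe_table(sample_lines)
    df
    ```

    Like the original steps, this sketch assumes a single table per file and the same number of cells in every row.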
    

    Full code

    ---
    title: "Parse tables"
    output: html_document
    ---
    
    ```{r setup, include=FALSE}
    knitr::opts_chunk$set(message = FALSE, warning = FALSE)
    ```
    
    # Step 1: Create markdown table as text
    
    That table will be manually updated directly in the markdown file.
    
    Table: Project Timeline
    
    | date       | description |
    |------------|-------------|
    | 2020-05-11 | Milestone 1 |
    | 2020-07-11 | Milestone 2 |
    | 2020-07-20 | Milestone 3 |
    
    
    # Step 2: Parse the table above
    
    The table should be maintained as a markdown table. How can I read the table from the code chunk? 
    
    ```{r}
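    # Read the source of this document (the file name must match this file)
    allLines <- readLines("ParseTable.Rmd")
    
    # Keep the table lines and drop the separator line (second match)
    tableLines <- allLines[grep("^\\|.*\\|$", allLines)][-2]
    
    # Split each line on "|" and trim the cells; the first piece is empty
    tableAsMatrix <- t(sapply(strsplit(tableLines, "\\|"), function(pieces){
      stringr::str_trim(pieces[-1])
    }))
    
    # First row holds the column names, the remaining rows hold the data
    df <- setNames(as.data.frame(tableAsMatrix[-1,,drop = FALSE]), tableAsMatrix[1,])
    
    df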
    ```
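
    The file name "ParseTable.Rmd" is hard-coded in the chunk above, so the chunk stops working if the document is renamed. One possible way around this, assuming the document is knitted with knitr, is `knitr::current_input()`, which returns the name of the document being knitted and NULL when nothing is being knitted. The names `fallback_file` and `rmd_file` are illustrative only, and how the returned name resolves may depend on how the document is rendered.

    ```r
    # Hypothetical fallback, taken from the file name used in the full example.
    fallback_file <- "ParseTable.Rmd"

    # knitr::current_input() gives the name of the document being knitted,
    # or NULL when no document is being knitted (e.g. in an interactive session).
    current <- if (requireNamespace("knitr", quietly = TRUE)) {
      knitr::current_input()
    } else {
      NULL
    }
    rmd_file <- if (is.null(current)) fallback_file else current

    # Read the file only if it can be found from the working directory.
    allLines <- if (file.exists(rmd_file)) readLines(rmd_file) else character(0)
    ```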
    

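
Note that every column of the parsed data frame is a character vector. Since the question mentions ggplot, the `date` column would likely need converting before it can be used as a date axis. A minimal sketch, assuming ggplot2 is installed; the data frame is rebuilt by hand here only so that the block runs on its own, and the choice of `geom_point` is purely illustrative:

```r
# Stand-in for the parsed table: both columns are character vectors.
df <- data.frame(
  date = c("2020-05-11", "2020-07-11", "2020-07-20"),
  description = c("Milestone 1", "Milestone 2", "Milestone 3"),
  stringsAsFactors = FALSE
)

# Convert the ISO-formatted strings to Date so they sort and plot as dates.
df$date <- as.Date(df$date, format = "%Y-%m-%d")

# Build the plot only if ggplot2 is available (an assumption, not a requirement
# of the parsing itself).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = date, y = description)) +
    ggplot2::geom_point()
}
```

Inside a code chunk, calling `print(p)` afterwards draws the plot.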