I have written a manuscript using bookdown in RStudio for a specific project, citing references from a BibTeX file. This is a single .bib file that I use for many documents, so it lives outside my project folder and contains many references that aren't cited in the present manuscript. To make the project easier to share, I would like to create a smaller .bib file containing only the references I actually cite in the manuscript.
Other questions have addressed how to do this starting from an .aux file. I can generate an .aux file by setting options(tinytex.clean = FALSE), but it doesn't contain any citations. Does anyone know of a way to do this for an R Markdown document? Thanks!
I am using this YAML header and knitting within RStudio:
output:
  bookdown::pdf_book:
    keep_tex: yes
Full sessionInfo:
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
[4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.3 bookdown_0.20 htmltools_0.4.0 tools_3.6.3 yaml_2.2.0
[6] Rcpp_1.0.3 rmarkdown_2.3 knitr_1.29 xfun_0.15 digest_0.6.25
[11] packrat_0.5.0 rlang_0.4.7 evaluate_0.14
Since you are writing in .Rmd, you can use the following R function to reduce your .bib file to the cited entries:
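library(stringr)

clean_bib <- function(input_file, input_bib, output_bib) {
  # Join lines with a space so a citation at the end of a line is still followed by a delimiter
  lines <- paste(readLines(input_file), collapse = " ")
  # Cited keys: "@", then letters/digits, then one delimiter character
  entries <- unique(str_match_all(lines, "@([a-zA-Z0-9]+)[,\\. \\?\\!\\]\\;]")[[1]][, 2])
  # Prepend an empty line so every entry, including the first, is preceded by "\n@"
  bib <- paste(c("", readLines(input_bib)), collapse = "\n")
  bib <- unlist(strsplit(bib, "\n@"))
  # Extract each entry's own key and keep only exact matches,
  # so that e.g. "smith2020" does not also pull in "smith2020a"
  keys <- str_match(bib, "^[a-zA-Z]+\\s*\\{\\s*([^,\\s]+)")[, 2]
  output <- bib[keys %in% entries]
  # Restore the "@" removed by the split (skipped when nothing matched)
  if (length(output) > 0) output <- paste0("@", output)
  writeLines(output, output_bib)
}

# now call the function
clean_bib(...)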
Just call it in the setup chunk.
What does the function do? It first finds all citations in the input file, i.e. every string that starts with @, contains only letters and digits, and is followed by a comma, dot, question mark, exclamation mark, semicolon, space, or ]. Adjust this pattern to your needs, for example if your citation keys contain underscores or hyphens.
Then it writes a new .bib file containing only these entries.
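To see the idea end to end, here is a minimal, self-contained sketch in base R (no stringr). The temporary files, their contents, the keys smith2020, jones2019 and smith2020a, and the helper names extract_keys and entry_key are all made up for illustration. It is not the function above, only the same two steps (collect cited keys, then keep the matching entries) applied to throwaway files.

```r
# Illustrative sketch only: helper names and file contents are hypothetical.

# Step 1: find cited keys ("@", letters/digits, then one delimiter character).
extract_keys <- function(text) {
  pattern <- "@[a-zA-Z0-9]+[,. ?!\\];]"
  hits <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
  # Strip the leading "@" and the trailing delimiter
  unique(substr(hits, 2, nchar(hits) - 1))
}

# Step 2 helper: pull the key out of pieces like "article{smith2020,\n ...".
entry_key <- function(entries) {
  m <- regmatches(entries, regexec("^[A-Za-z]+\\s*\\{\\s*([^,[:space:]]+)", entries))
  vapply(m, function(x) if (length(x) == 2) x[2] else NA_character_, character(1))
}

# Made-up manuscript and bibliography written to temporary files
rmd <- tempfile(fileext = ".Rmd")
big_bib <- tempfile(fileext = ".bib")
small_bib <- tempfile(fileext = ".bib")

writeLines(c(
  "Some claim [@smith2020; @jones2019].",
  "As @smith2020 showed, this holds."
), rmd)

writeLines(c(
  "@article{smith2020,",
  "  title = {A},",
  "  year = {2020}",
  "}",
  "",
  "@book{jones2019,",
  "  title = {B},",
  "  year = {2019}",
  "}",
  "",
  "@article{smith2020a,",
  "  title = {C},",
  "  year = {2020}",
  "}"
), big_bib)

# Join manuscript lines with a space so line-final citations keep a delimiter
cited <- extract_keys(paste(readLines(rmd), collapse = " "))

# Prepend an empty line so every entry is preceded by "\n@", then split
bib <- paste(c("", readLines(big_bib)), collapse = "\n")
entries <- strsplit(bib, "\n@", fixed = TRUE)[[1]]

# Keep entries whose key is cited (exact match), restoring the "@"
kept <- entries[entry_key(entries) %in% cited]
writeLines(paste0("@", kept), small_bib)

# Show the reduced bibliography
cat(readLines(small_bib), sep = "\n")
```

Matching on the extracted key rather than grepping for the key anywhere in the entry is a deliberate choice here: the uncited smith2020a entry is left out even though its key begins with a cited one.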