
Creating formatted references in different citation styles for academic papers without DOIs


I'd like to format references to academic papers in different citation styles with R.

With the rcrossref package, I can easily create citations for articles from their DOIs in whatever style I specify. However, not all papers have a DOI, so I'm looking for an easy way to get text citations in different styles from the article information in a BibTeX entry or some other type of input.

Using rcrossref: the package offers 2209 different styles (the value returned by length(rcrossref::get_styles())).

For example, you can get text citations for some highly cited papers (DOIs from this source: https://doi.org/10.1038/514550a) in different styles, with each citation returned as a list element, as follows:

library(rcrossref)
# some DOIs of interest
dois <- c("10.1038/514550a", "10.1038/227680a0", "10.1016/0003-2697(76)90527-3",  "10.1073/Pnas.74.12.5463", "10.1016/0003-2697(87)90021-2", "10.1107/S0108767307043930")


# APA cv style
cr_cn(dois = dois, format = "text", style="apa-cv")
# same with Chicago style
cr_cn(dois = dois, format = "text", style="chicago-note-bibliography")
# same with Vancouver style
cr_cn(dois = dois, format = "text", style="vancouver")

Now, say I have an entry without a DOI, for example in BibTeX format:

@article {PMID:14907713,    Title = {Protein measurement with the Folin phenol reagent},    Author = {LOWRY, OH and ROSEBROUGH, NJ and FARR, AL and RANDALL, RJ},   Number = {1},   Volume = {193},     Month = {November},     Year = {1951},  Journal = {The Journal of biological chemistry},    ISSN = {0021-9258},     Pages = {265—275},  URL = {http://www.jbc.org/content/193/1/265.long} }  

I'd like to format this entry too, for example in APA cv, Chicago and Vancouver styles, and get the result as text. How can I do that? I haven't found a function for it. Is there any way currently available for this task?

Thank you!


Solution

  • It doesn't look like rcrossref supports this: everything happens on their API server, and there doesn't appear to be a way to submit a raw BibTeX entry that has no DOI.

    However, pandoc, which is usually installed with RStudio and is used by rmarkdown, does appear to support citation formatting. I did a bit of reverse engineering to see whether it could produce the citation for a single entry. Here's the function I created (note that, while defined, it masks utils::citation()).

    citation <- function(bib, csl="chicago-author-date.csl", toformat="plain", cslrepo="https://raw.githubusercontent.com/citation-style-language/styles/master") {
      if (!file.exists(bib)) {
        message("Assuming input is literal bibtex entry")
        tmpbib <- tempfile(fileext = ".bib")
        on.exit(unlink(tmpbib), add=TRUE)
        if(!validUTF8(bib)) {
          bib <- iconv(bib, to="UTF-8")
        }
        writeLines(bib, tmpbib)
        bib <- tmpbib
      }
      if (tools::file_ext(csl)!="csl") {
        warning("CSL file name should end in '.csl'")
      }
      if (!file.exists(csl)) {
        cslurl <- file.path(cslrepo, csl)
        message(paste("Downloading CSL from", cslurl))
        cslresp <- httr::GET(cslurl, httr::write_disk(csl))
        if(httr::http_error(cslresp)) {
          stop(paste("Could not download CSL.", "Code:", httr::status_code(cslresp)))
        }
      }
      tmpcit <- tempfile(fileext = ".md")
      on.exit(unlink(tmpcit), add=TRUE)
      
      writeLines(c("---","nocite: '@*'","---"), tmpcit)
      rmarkdown::find_pandoc()
      command <- paste(shQuote(rmarkdown:::pandoc()), 
                       "--filter", "pandoc-citeproc",
                       "--to", shQuote(toformat),
                       "--csl", shQuote(csl),
                       "--bibliography", shQuote(bib), 
                      shQuote(tmpcit))
      rmarkdown:::with_pandoc_safe_environment({
        result <- system(command, intern = TRUE)
        Encoding(result) <- "UTF-8"
      })
      result
    }
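
One caveat for newer setups: pandoc 2.11 and later ship citation processing built in behind the `--citeproc` flag, and the separate `pandoc-citeproc` filter is no longer bundled with them. The sketch below picks the arguments by version. `citeproc_args()` is a hypothetical helper, not part of rmarkdown or pandoc, and the version is passed in as a string so the logic can be checked without pandoc installed.

```r
# Hypothetical helper: choose the citation-processing arguments for pandoc.
# Assumption: pandoc >= 2.11 uses the built-in --citeproc flag, while older
# releases need the external pandoc-citeproc filter used in the function above.
citeproc_args <- function(pandoc_version) {
  v <- as.numeric_version(pandoc_version)
  if (v >= numeric_version("2.11")) {
    "--citeproc"
  } else {
    c("--filter", "pandoc-citeproc")
  }
}

citeproc_args("2.7.3")   # older pandoc: external filter
citeproc_args("2.19.2")  # newer pandoc: built-in flag
```

Inside the function above, `rmarkdown::pandoc_version()` could supply the version, and the returned arguments would take the place of the two `"--filter", "pandoc-citeproc"` entries in the command.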
    

    You can pass in your reference, and the function converts it using a standard CSL file. These CSL files control the formatting. The citation-style-language/styles repository on GitHub (the default cslrepo above) holds CSL files for a very large number of styles. You can specify a CSL file by name, and if it doesn't exist locally, the function downloads it from that repository automatically.
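
To make the download step concrete, here is a minimal sketch of how a style name maps to a raw file URL in that repository. `csl_url()` is a hypothetical helper introduced only for illustration; the base URL is the `cslrepo` default from the function above, and the actual download is left commented out because it needs network access.

```r
# Hypothetical helper: resolve a CSL style name to its raw URL.
# The default repo is the cslrepo value used by the function above.
csl_url <- function(style,
                    repo = "https://raw.githubusercontent.com/citation-style-language/styles/master") {
  # Append the .csl extension only when it is missing.
  if (tools::file_ext(style) != "csl") {
    style <- paste0(style, ".csl")
  }
  paste(repo, style, sep = "/")
}

csl_url("vancouver")
csl_url("apa-cv.csl")

# Fetching the file with base R (requires network access):
# download.file(csl_url("vancouver"), destfile = "vancouver.csl", quiet = TRUE)
```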

    You can either pass in a "raw" citation:

    test <- "@article {PMID:14907713,    Title = {Protein measurement with the Folin phenol reagent},    Author = {LOWRY, OH and ROSEBROUGH, NJ and FARR, AL and RANDALL, RJ},   Number = {1},   Volume = {193},     Month = {November},     Year = {1951},  Journal = {The Journal of biological chemistry},    ISSN = {0021-9258},     Pages = {265-275},  URL = {http://www.jbc.org/content/193/1/265.long} } "
    citation(test)
    

    Or, if the data is in a file, you can pass the file name:

    writeLines(test, "test.bib") 
    citation("test.bib")
    

    And if you want a different CSL, just give the name of the CSL file in the csl= parameter:

    citation("test.bib", csl="apa-cv.csl")
    citation("test.bib", csl="chicago-note-bibliography.csl")
    citation("test.bib", csl="vancouver.csl")
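
If you want all three styles in one named list, similar to what `cr_cn()` returns for several DOIs, the calls above can be wrapped in a loop. This is only a sketch: `format_styles()` and `fake_formatter()` are hypothetical names, and the stand-in formatter exists solely so the example runs without pandoc. In real use you would pass the `citation()` function defined earlier.

```r
# Sketch: run one formatter over several CSL styles and collect the results.
# `formatter` is any function(bib, csl) returning a character vector, for
# example the citation() function defined above.
format_styles <- function(bib, styles, formatter) {
  out <- lapply(styles, function(csl) formatter(bib, csl = csl))
  # Name each element after its style, without the file extension.
  names(out) <- sub("\\.csl$", "", styles)
  out
}

# Stand-in formatter so the sketch runs without pandoc; swap in citation().
fake_formatter <- function(bib, csl) paste("formatted", bib, "with", csl)

styles <- c("apa-cv.csl", "chicago-note-bibliography.csl", "vancouver.csl")
res <- format_styles("test.bib", styles, fake_formatter)
names(res)
```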