Search code examples
rldapyldavis

II there a way to get a standalone html version of the serVis visual using R?


The LDAvis package in R creates a visualisation of a topic modelling analysis using LDA with the serVis function. If you save this visualisation you end up with a folder with css and html files.

  json <- createJSON(
    phi = tmResult$terms,
    theta = tmResult$topics,
    doc.length = rowSums(as.matrix(dtm)),
    vocab = colnames(as.matrix(dtm)),
    term.frequency = colSums(as.matrix(dtm)),
    plot.opts = list(xlab = "", ylab = "")

  serVis(json, out.dir = 'LDAvis', open.browser = FALSE)

files

If you open the index.html file, you get a blank page due to browser restrictions. For a more elaborate description read:

LDAvis HTML output from serVis is blank

The easiest solution is to change the browser restrictions. However, I want to share this visual with 100+ people. It is not feasable to ask them all to change their browser setting.

In the Python version (PyDavis) it is easily solved by creating a standalone html page, which can easily be shared.

Export pyLDAvis graphs as standalone webpage

Is there a way to get a standalone html version of the serVis visual using R?

EDIT: reproducable data/script:

# Install and load required packages
library(LDAvis)
library(tm)
library(topicmodels)

# Set seed for reproducibility
set.seed(123)


# Create fake data
documents <- c("This is the first document.",
               "The second document is here.",
               "And this is the third one.",
               "Is this the first document?")

# Create a Document-Term Matrix
corpus <- Corpus(VectorSource(documents))
dtm <- DocumentTermMatrix(corpus)

#lda
num_topics <- 3
topicModel <- LDA(dtm, k = num_topics, control = list(seed = 9999))
tmResult <- posterior(topicModel)

# Create fake JSON data
json <- createJSON(
  phi = tmResult$terms,
  theta = tmResult$topics,
  doc.length = rowSums(as.matrix(dtm)),
  vocab = colnames(as.matrix(dtm)),
  term.frequency = colSums(as.matrix(dtm)),
  plot.opts = list(xlab = "", ylab = "")
)

# Save fake visualization to a folder
serVis(json, out.dir = 'test', open.browser = TRUE)

Solution

  • Update

    I have created a pull request to add a stand.alone argument to serVis(). Until this request is merged in, you could install the update pak::pkg_install("the-mad-statter/LDAVis") and then run serVis(json, out.dir = 'test', open.browser = TRUE, stand.alone = TRUE).

    Otherwise, I think you have three options here:

    1. Enable CORS

    Instruct your recipients to enable CORS in their browers. However, this is not recommended as:

    • It is technical and instructions differ depending on browser.
    • Leaves the recipients vulnerable to CORS attacks should they leave it enabled.
    • You have already decided not to go this route.

    2. Host

    Host the R outputed files somewhere. Some options include:

    3. Retool

    Retool the {LDAvis} package files.

    The following script is a bit hackish, but it does succeed in collating all of the individual files inline into a new standalone index2.html document as desired.

    index_html <- readLines("test/index.html")
    # remove d3.v3.js, ldavis.js, and lda.css includes
    index_html[6:8] <- ""
    
    # insert d3.v3.js script inline into html
    d3_v3_js <- readLines("test/d3.v3.js")
    d3_v3_js <- append(d3_v3_js, "<script>", 0)
    d3_v3_js <- append(d3_v3_js, "</script>")
    index_html <- append(index_html, d3_v3_js, 6)
    
    # read and format lda.json to single line string
    lda_json <- jsonlite::fromJSON("test/lda.json")
    lda_json <- jsonlite::toJSON(lda_json)
    lda_json <- stringi::stri_escape_unicode(lda_json)
    
    # insert formatted lda.json into ldavis.js and 
    # then insert ldvis.js script inline into html
    ldavis_js <- readLines("test/ldavis.js")
    # replace beginning of call to d3.json() with actual json
    ldavis_js[95] <- sprintf('    data = JSON.parse("%s");', lda_json)
    # remove end of call to d3.json()
    ldavis_js[1357] <- ""
    ldavis_js <- append(ldavis_js, "<script>", 0)
    ldavis_js <- append(ldavis_js, "</script>")
    index_html <- append(index_html, ldavis_js, 7 + length(d3_v3_js))
    
    # insert lda.css inline into html
    lda_css <- readLines("test/lda.css", warn = FALSE)
    lda_css <- append(lda_css, "<style>", 0)
    lda_css <- append(lda_css, "</style>")
    index_html <- append(index_html, lda_css, 8 + length(d3_v3_js) + length(ldavis_js))
    
    # write out stand alone html
    writeLines(index_html, "test/index2.html")