
Colorizing and highlighting text in HTML for R Shiny


I ran several NLP algorithms on a large corpus and now want to explore the results. Specifically, I want to format text in HTML according to the features my models extracted (for example, coloring or highlighting individual words) so that I can view it in a Shiny web application.

I don't know any HTML, so could you point me toward approaches worth considering? Are there R packages for this kind of task? Are Shiny's own functions sufficient, and if so, which ones?


Solution

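  • Not sure if this answers the question exactly, but it may be relevant. The code parses sentences with spacyr, colors each token by its part-of-speech tag, and writes the result to an HTML file opened in the RStudio viewer: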

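    library(tidyverse)  # dplyr verbs, as_tibble(), and the %>% pipe
    library(spacyr)     # R wrapper around spaCy; needs a working spaCy install
    library(shiny)      # provides span() for building HTML tags
    
    # Write the HTML to a temporary file so the viewer can display it
    tempDir <- tempfile()
    dir.create(tempDir)
    htmlFile <- file.path(tempDir, "index.html")
    viewer <- getOption("viewer")
    
    # Parse 25 random sentences into one row per token, with POS and dependency tags
    s <- spacy_parse(sample(stringr::sentences, 25), dependency = TRUE, nounphrase = TRUE) %>% as_tibble()
    print(s)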
    
    #> # A tibble: 224 x 11
    #>    doc_id sentence_id token_id token lemma pos   head_token_id dep_rel entity nounphrase
    #>    <chr>        <int>    <int> <chr> <chr> <chr>         <dbl> <chr>   <chr>  <chr>
    #>  1 text1            1        1 Brea… brea… VERB              1 ROOT    ""     ""
    #>  2 text1            1        2 deep  deep  ADJ               1 acomp   ""     ""
    #>  3 text1            1        3 and   and   CCONJ             1 cc      ""     ""
    #>  4 text1            1        4 smell smell VERB              1 conj    ""     ""
    #>  5 text1            1        5 the   the   DET               7 det     ""     "beg"
    #>  6 text1            1        6 piny  piny  ADJ               7 amod    ""     "mid"
    #>  7 text1            1        7 air   air   NOUN              4 dobj    ""     "end_root"
    #>  8 text1            1        8 .     .     PUNCT             1 punct   ""     ""
    #>  9 text10           1        1 Fill  fill  VERB              1 ROOT    ""     ""
    #> 10 text10           1        2 the   the   DET               4 det     ""     "beg"
    #> # … with 214 more rows, and 1 more variable: whitespace <lgl>
    
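    # One color per POS tag; sizing the palette from the data avoids NA colors
    # if more than 15 distinct tags occur
    pos_types <- s %>% count(pos, sort = TRUE) %>% print() %>% .[['pos']]
    cols <- sample(rainbow(length(pos_types)))
    
    # Emit each token as a colored <span>; the tooltip shows its POS and dependency tags
    sink(htmlFile)
    cat('<h4>')
    for (i in seq_len(nrow(s))) {
      col <- cols[match(s$pos[i], pos_types)]
      cat(as.character(span(s$token[i], style = glue::glue('color:{col}'),
                            title = paste(s$pos[i], s$dep_rel[i]))))
      cat(' ')
    }
    cat('</h4>')
    sink()
    
    # getOption("viewer") is NULL outside RStudio, so fall back to the browser
    if (is.null(viewer)) browseURL(htmlFile) else viewer(htmlFile)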
    

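
    The question also asks whether Shiny's own functions are enough. A minimal sketch of the same idea rendered directly inside a Shiny app, without a temporary file, might look like the following. The helper name `colorize_tokens`, the output id `colored_text`, and the hard-coded sample tokens (taken from the parse shown above) are assumptions made for illustration; in practice the tokens and tags would come from your own model output.

```r
library(shiny)  # htmlOutput(), renderUI(), HTML()

# Hypothetical helper (not part of any package): build one colored <span>
# per token, with the tag shown as a hover tooltip.
colorize_tokens <- function(tokens, tags) {
  tag_levels <- unique(tags)
  # Named palette: one color per distinct tag
  palette <- setNames(rainbow(length(tag_levels)), tag_levels)
  spans <- sprintf(
    '<span style="color:%s" title="%s">%s</span>',
    palette[tags],
    htmltools::htmlEscape(tags, attribute = TRUE),
    htmltools::htmlEscape(tokens)  # escape so tokens like "<" render as text
  )
  # HTML() marks the string as raw markup so Shiny does not escape it again
  HTML(paste(spans, collapse = " "))
}

ui <- fluidPage(
  titlePanel("Tokens colored by tag"),
  htmlOutput("colored_text")
)

server <- function(input, output, session) {
  output$colored_text <- renderUI({
    # Illustrative data only; replace with your own tokens and features
    colorize_tokens(
      tokens = c("smell", "the", "piny", "air", "."),
      tags   = c("VERB", "DET", "ADJ", "NOUN", "PUNCT")
    )
  })
}

# Launch only in an interactive session
if (interactive()) runApp(shinyApp(ui, server))
```

    Building the markup as a string and wrapping it in `HTML()` keeps the example short; composing `span()` tags inside a `tagList()` would work as well and leaves escaping to htmltools.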