
How to reduce the size of R plotly html output by rounding plot values


I am trying to reduce the size of my RMarkdown html reports and, more importantly, make them faster to open. Each report consists of a large number of R Plotly plots, and each plot contains a large number of data points (1000+). Since R Plotly stores all of the raw data for each plot within the html file, I thought a good way to reduce the file size would be to round the data to fewer decimal places. However, I found that even when the input data is rounded, R Plotly still writes a large number of decimal places to the html file. Consequently, rounding the data does not reduce the file size.

See below for 2 cases, the base containing raw data, and the rounding case containing rounded data. The file size is the same for both cases.

Base Case HTML

library(plotly)       # plot_ly(), add_trace(), %>%
library(htmlwidgets)  # saveWidget()

RawData <- data.frame(Date = seq(as.Date("2024/1/1"), by = "month", length.out = 12),
                      PreciseValue = c(0.1516270, 0.3542629, 0.8339342, 0.5796813, 0.3933472, 0.2937137, 0.1779205, 0.4285533, 0.6841885, 0.3399411, 0.99476560, 0.42941527))
# Rounded copy of the data, used in the rounding case
RawData$RoundValue <- round(RawData$PreciseValue, 2)
fig <- plot_ly(RawData, type = 'scatter', mode = 'lines') %>%
  add_trace(x = ~Date, y = ~PreciseValue, name = 'PreciseValue')
saveWidget(fig, "plotly_base.html", selfcontained = TRUE)

The html file size is 3780 KB. If I open the html file and look at the underlying R Plotly data, the stored y data is:

"y":[0.15162704353000001,0.35426295622999998,0.83393426323999997,0.57968136341999998,0.39334726234,0.29371352347000002,0.17792423404999999,0.44352285533000002,0.68418423485000002,0.36623994110000002,0.99476432455999997,0.42941523452699998]

Notice that there are more decimal places than in the original data.
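A likely explanation for the extra digits is that R stores these values as binary doubles, and the file records each double at close to full precision rather than in its shortest decimal form. The sketch below is only an illustration using base R (it does not involve plotly), with one value taken from the example data:

```r
# Round one of the values from the example to 2 decimal places.
x <- round(0.1516270, 2)

# Default printing uses 7 significant digits, so the value looks exact.
print(x)

# Asking for 17 significant digits reveals the nearest double to 0.15.
full_digits <- sprintf("%.17g", x)
print(full_digits)   # "0.14999999999999999"
```

So a value can be rounded correctly in R and still be written out with many digits if the writer asks for full precision.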

Rounding Values Case

RawData$RoundValue <- round(RawData$PreciseValue,2)
fig <- plot_ly(RawData, type = 'scatter', mode = 'lines')%>%
  add_trace(x = ~Date, y = ~RoundValue, name = 'RoundValue')

saveWidget(fig, "plotly_round.html", selfcontained = TRUE)

The html file size for the rounding case is also 3780 KB. The underlying data for this case is:

"y":[0.14999999999999999,0.34999999999999998,0.82999999999999996,0.57999999999999996,0.39000000000000001,0.28999999999999998,0.17999999999999999,0.44,0.68000000000000005,0.37,0.98999999999999999,0.42999999999999999]

I expected the stored y data to look something like this:

"y":[0.15, 0.35, 0.83, 0.58, 0.39, 0.29, 0.18, 0.44, 0.68, 0.37, 0.99, 0.43]

Does anyone know how to configure R Plotly to only store the configured number of decimal places in html output?


Solution

  • plotly relies on htmlwidgets to save its data. As part of that, the object being plotted carries a function that serializes its data to JSON. The function used in your example is

    > attr(fig$x, "TOJSON_FUNC")
    function (x, ...) 
    {
        jsonlite::toJSON(x, digits = 50, auto_unbox = TRUE, force = TRUE, 
            null = "null", na = "null", time_format = "%Y-%m-%d %H:%M:%OS6", 
            ...)
    }
    <bytecode: 0x12084ae50>
    <environment: namespace:plotly>
    

    You can replace that with a different function, e.g. one that looks just the same but writes at most 2 decimal digits instead of 50 (in jsonlite, digits is the maximum number of decimal digits; wrapping the value in I() requests significant digits instead):

    attr(fig$x, "TOJSON_FUNC") <- function (x, ...) 
    {
      jsonlite::toJSON(x, digits = 2, auto_unbox = TRUE, force = TRUE, 
                       null = "null", na = "null", time_format = "%Y-%m-%d %H:%M:%OS6", 
                       ...)
    }
    
    saveWidget(fig, "plotly_base.html", selfcontained = TRUE)
    

    When I do that, the data is saved with at most 2 decimal places.

    This doesn't make a huge difference to the file size here, since most of the file is the plotly JavaScript code, but on a larger dataset (maybe your real one) it should help a bit.
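To get a rough feel for how much the data payload itself might shrink, the two digits settings can be compared with jsonlite directly. This is a minimal sketch, not a measurement of a real report: the 1000 random values are a hypothetical stand-in for one trace, and only the numeric array is serialized, not the whole widget.

```r
library(jsonlite)

set.seed(1)       # reproducible stand-in data
y <- runif(1000)  # hypothetical trace values, not the data from the question

# Serialize once with digits = 50 (as in plotly's default function shown above)
# and once with digits = 2.
json_full  <- toJSON(y, digits = 50)
json_short <- toJSON(y, digits = 2)

# Compare the length of the two JSON strings in characters.
size_full  <- nchar(json_full)
size_short <- nchar(json_short)
print(c(full = size_full, short = size_short))
```

The gap between the two numbers scales with the number of points and plots, whereas the plotly JavaScript bundle is a fixed cost per self-contained file.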