Tags: ggplot2, quanteda

Plot wordfish thetas (not scaled)


I have conducted a wordfish analysis with quanteda.

Using the function textplot_scale1d(), I was able to plot the estimated thetas.

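# packages used below: textmodel_wordfish() is in quanteda.textmodels,
# textplot_scale1d() is in quanteda.textplots
library(quanteda.textmodels)
library(quanteda.textplots)
library(magrittr)  # provides %>%

# run wordfish
tmod_wf <- textmodel_wordfish(example_DFM, dir = c(2, 1))

# plot the estimated thetas (scaled/ordered):
# y axis -> the documents
# x axis -> the estimated thetas
tmod_wf %>% 
  textplot_scale1d(margin = "documents")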

Now I would like to plot a graph with the estimated thetas on the y axis and the documents on the x axis (in corpus order, not ordered by their estimated thetas).

plot(tmod_wf$theta, xlab = "Documents", ylab = "Estimated_thetas")

The line above creates the following scatterplot:

[Image: scatterplot (thetas and documents)]

The thetas are on the y axis and the documents are on the x axis (ordered as they appear in the corpus). The scatterplot suits my needs, but it is rather bare. I would like to embellish it: highlight some documents, change the size of the dots, and so on.

Is it possible to convert such a plot to ggplot?

I tried the following:

tmod_wf$theta %>% 
  ggplot() +
  geom_point(aes(x = docs, y = theta))

However, the following error appears:

Error in `fortify()`:
! `data` must be a <data.frame>, or an object coercible by `fortify()`, not a double
  vector.

Solution

  • Your data has to be a data.frame. tmod_wf$theta is a plain numeric vector, which ggplot() cannot use as data.

    Try this:

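    library(quanteda.textmodels)
    library(tibble)

    # fit wordfish on the example DFM that ships with quanteda
    tmod1 <- textmodel_wordfish(quanteda::data_dfm_lbgexample, dir = c(1, 5))

    # extract theta and docs into a data frame (one row per document)
    df <- tibble(theta = tmod1[["theta"]],
                 docs = tmod1[["docs"]])
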

    Then plot with ggplot:

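    library(ggplot2)
    library(magrittr)  # provides %>%

    # horizontal bar chart: thetas on the x axis, documents on the y axis
    df %>%
      ggplot(aes(x = theta, y = docs, fill = -theta)) +
      geom_col() +
      scale_fill_distiller(palette = "RdYlGn") +
      theme_minimal() +
      theme(legend.position = "none")
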

    [Image: bar plot]
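The bar chart puts the thetas on the x axis, whereas the question asks for a scatterplot with the documents on the x axis (in corpus order) and the thetas on the y axis. A minimal sketch of that layout, assuming the same example model as in the answer, could look like the following. The `highlight_docs` vector and the colours and point size are hypothetical choices for illustration, not part of the original answer.

```r
library(quanteda)
library(quanteda.textmodels)
library(ggplot2)

# Fit wordfish on the example DFM that ships with quanteda
tmod1 <- textmodel_wordfish(data_dfm_lbgexample, dir = c(1, 5))

# One row per document. Setting the factor levels explicitly keeps the
# x axis in corpus order instead of alphabetical order.
df <- data.frame(
  docs  = factor(tmod1$docs, levels = tmod1$docs),
  theta = tmod1$theta
)

# Hypothetical selection of documents to highlight (adjust to your corpus)
highlight_docs <- c("R1", "V1")
df$highlight <- df$docs %in% highlight_docs

# Scatterplot: documents on x, thetas on y, highlighted points in red
p <- ggplot(df, aes(x = docs, y = theta, colour = highlight)) +
  geom_point(size = 3) +
  scale_colour_manual(values = c(`FALSE` = "grey40", `TRUE` = "red"),
                      guide = "none") +
  labs(x = "Documents", y = "Estimated thetas") +
  theme_minimal()

print(p)
```

With many documents the x-axis labels will likely overlap. Rotating them with `theme(axis.text.x = element_text(angle = 90, hjust = 1))` is one common way to deal with that.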