Underline part of text label in ggplot


I am trying to make a label that is made up of a book title and book author. I would like to underline the title, but not the author, in the label.

Here is the MWE data:

Title,Author,Pages,Date Started,Date Finished
underline('Time Travel'),'James Gleick',353,1/1/17,1/27/17
underline('The Road'),'Cormac McCarthy',324,1/28/17,3/10/17

This code works, but it labels each book with the title only, not the title and author:

library(ggplot2)
library(tidyverse)
library(ggrepel)
library(ggalt)

books.2017 <- read_csv('books_2017.csv')
books.2017$`Date Started` <- as.Date(books.2017$`Date Started`, "%m/%d/%y")
books.2017$`Date Finished` <- as.Date(books.2017$`Date Finished`, "%m/%d/%y")

ggplot(books.2017, aes(x=`Date Started`, xend=`Date Finished`)) +
  geom_dumbbell(aes(size=Pages),size_x=0, size_xend=0) +
  geom_text_repel(aes(label=paste(Title)), parse=TRUE)

When I try to change geom_text_repel to something like:

geom_text_repel(aes(label=paste(Title,Author)), parse=TRUE)

I get this error:

Error in parse(text = as.character(lab)) : 
  <text>:1:26: unexpected string constant
1: underline('Time Travel') 'James Gleick'
                             ^

EDIT The labels should look something like this

[image: tesplot]


Solution

  • It looks like you are trying to pull down your Goodreads data and plot the books you read over the year by start date, end date and book size.

    To do what you propose, you can use the parse option of the geom_text*() family: build a parse string with sprintf() and pass it to geom_text*() as the label, with parse = TRUE.
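
    To make the idea concrete, here is a minimal base-R sketch using the two labels from the question. It checks why paste(Title, Author) fails to parse and how joining the pieces with the plotmath spacing operator ~ avoids the error. The helper name parses and the variable names are my own, not part of the original code.

    ```r
    # The two labels from the question, quotes included (as in the error message).
    title  <- c("underline('Time Travel')", "underline('The Road')")
    author <- c("'James Gleick'", "'Cormac McCarthy'")

    # Helper (hypothetical name): TRUE if the string parses as R code.
    parses <- function(s) {
      !inherits(try(parse(text = s), silent = TRUE), "try-error")
    }

    # paste() with its default separator leaves a bare space between the pieces,
    # so the parser sees two expressions side by side.
    spaced <- paste(title, author)

    # sprintf() with `~` between the pieces yields a single expression per label.
    joined <- sprintf("%s~%s", title, author)

    spaced_ok <- vapply(spaced, parses, logical(1), USE.NAMES = FALSE)
    joined_ok <- vapply(joined, parses, logical(1), USE.NAMES = FALSE)
    ```

    In the question's call this should correspond to aes(label = paste(Title, Author, sep = "~")) with parse = TRUE.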

    To put the author on its own line, you might consider the plotmath expression over(x, y), which draws x above y with a horizontal bar between them.

    parseLabel <- sprintf("over(%s,%s)",
                     gsub(" ", "~", books.2007$Title, fixed = TRUE),
                     gsub(" ", "~", books.2007$Author, fixed = TRUE))
    parseLabel
    

    Alternatively, you can use underline(); however, adding a newline is tricky because plotmath does not directly support newlines in a parsed expression.

    parseLabel <- sprintf("underline(%s)~\n~%s",
                          gsub(" ", "~", books.2007$Title, fixed = TRUE),
                          gsub(" ", "~", books.2007$Author, fixed = TRUE))
    parseLabel
    

    Note: Baptiste correctly highlights this in his answer; I am just expanding on his work here using an example dataset I created.

    OK, here is a quick example based on the above assumptions. I hope this points you in the right direction.

    Note: I have appended an example dataset for people to use.

    Adding an Underline

    To add an underline to the text, you can harness plotmath by setting parse = TRUE in the geom_label*() call.

    Simple example using plotmath with geom_label

    library(tidyverse) # Loads ggplot2
    library(graphics)
    library(ggrepel)
    library(gtable)
    library(ggalt)
    
    # load test dataset
    # ... See example data set
    # books.2007 <- structure...
    
    gp <- ggplot(books.2007)
    gp <- gp + geom_dumbbell( aes(x = `Date Started`, 
                                  xend = `Date Finished`, 
                                  y = ISBN, 
                                  size = as.numeric(Pages)), 
                              size_x = 0, size_xend = 0)
    
    # Construct parseLabel using sprintf
    parseLabel <- sprintf("underline(%s)~\n~%s",
                      gsub(" ", "~", books.2007$Title, fixed = TRUE),
                      gsub(" ", "~", books.2007$Author, fixed = TRUE))
    
    gp <- gp + geom_label(aes(x = `Date Started`,
                              y = ISBN), 
                          label = parseLabel,
                          vjust = 1.5, hjust = "inward", parse = TRUE)
    gp <- gp + labs(size = "Book Size")
    gp
    

    Example Plot Output

    [image: plot output]
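
    Since plotmath has no direct newline support, the "\n" in the parse string is not rendered as a line break. One possible workaround, which is an addition of mine rather than part of the original example, is plotmath's atop(x, y), which stacks x over y without a horizontal bar. The sketch below uses two rows copied from the example data set; the names books and stackedLabel are assumptions.

    ```r
    # Two rows copied from the example data set, so the sketch runs on its own.
    books <- data.frame(
      Title  = c("Power of One", "Dune (Dune Chronicles Book 1)"),
      Author = c("Bryce Courtenay", "Frank Herbert"),
      stringsAsFactors = FALSE
    )

    # atop(x, y) stacks x above y with no bar; underline() wraps the title only.
    stackedLabel <- sprintf("atop(underline(%s),%s)",
                            gsub(" ", "~", books$Title, fixed = TRUE),
                            gsub(" ", "~", books$Author, fixed = TRUE))

    # Each label should parse to exactly one expression.
    parsed <- lapply(stackedLabel, function(s) parse(text = s))
    ```

    Passing stackedLabel as the label (with parse = TRUE) in a geom_label() call like the one above should then place the author beneath the underlined title.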

    Simple example using plotmath with geom_label_repel

    N.B. In my view geom_text is easier to use, as geom_label_repel incurs extra computation to work out where to place the labels.

    ## Construct parse string
    parseLabel <- sprintf("underline(%s)~\n~%s",
                          gsub(" ", "~", books.2007$Title, fixed = TRUE),
                          gsub(" ", "~", books.2007$Author, fixed = TRUE))
    parseLabel
    
    rm(gp)
    gp <- ggplot(books.2007)
    gp <- gp + geom_dumbbell( aes(x = `Date Started`,
                                  xend = `Date Finished`,
                                  y = ISBN,
                                  size = as.numeric(Pages)),
                              size_x = 0, size_xend = 0)
    gp <- gp + geom_label_repel(aes(x = `Date Started`,
                                    y = ISBN),
                                label = parseLabel,
                                # max.iter = 100,
                                parse = TRUE)
    gp <- gp + labs(size = "Book Size")
    gp
    

    Example Plot Output with geom_label_repel

    [image: plot output]
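
    One caveat with building parse strings via gsub(" ", "~", ...): a title that contains an R reserved word such as "for" or "if", or an apostrophe, will not parse. A possible workaround, sketched below with made-up titles and authors, is to pass each piece as a quoted string using base R's encodeString(). The helper name parses and the sample data are hypothetical.

    ```r
    # Hypothetical titles chosen to break the gsub(" ", "~") approach.
    title  <- c("Notes for a Friend", "What if It's True")
    author <- c("A. Author", "B. Writer")

    # Helper (hypothetical name): TRUE if the string parses as R code.
    parses <- function(s) {
      !inherits(try(parse(text = s), silent = TRUE), "try-error")
    }

    # Approach from the examples above: replace spaces with `~`.
    naive <- sprintf("underline(%s)~%s",
                     gsub(" ", "~", title, fixed = TRUE),
                     gsub(" ", "~", author, fixed = TRUE))

    # Alternative: wrap each piece in double quotes so it is a string constant.
    quoted <- sprintf("underline(%s)~%s",
                      encodeString(title, quote = '"'),
                      encodeString(author, quote = '"'))

    naive_ok  <- vapply(naive, parses, logical(1), USE.NAMES = FALSE)
    quoted_ok <- vapply(quoted, parses, logical(1), USE.NAMES = FALSE)
    ```

    The quoted form keeps spaces and punctuation inside the string, so no gsub() step is needed.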

    Example Data Set:

    books.2007 <- structure(list(Title = c("memoirs of a geisha", "Blink: The Power of Thinking Without Thinking", 
    "Power of One", "Harry Potter and the Half-Blood Prince (Book 6)", 
    "Dune (Dune Chronicles Book 1)"), Author = c("arthur golden", 
    "Malcolm Gladwell", "Bryce Courtenay", "J.K. Rowling", "Frank Herbert"
    ), ISBN = c("0099498189", "0316172324", "034541005X", "0439785960", 
    "0441172717"), `My Rating` = c(4L, 3L, 5L, 4L, 5L), `Average Rating` = c(4, 
    4.17, 5, 4.38, 4.55), Publisher = c("vintage", "Little Brown and Company", 
    "Ballantine Books", "Scholastic Paperbacks", "Ace"), Binding = c("paperback", 
    "Hardcover", "Paperback", "Paperback", "Paperback"), `Year Published` = c(2005L, 
    2005L, 1996L, 2006L, 1990L), `Original Publication Year` = c(2005L, 
    2005L, 1996L, 2006L, 1977L), `Date Read` = c(NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_), `Date Added` = structure(c(13558, 
    13558, 13558, 13558, 13558), class = "Date"), Bookshelves = c("fiction", 
    "nonfiction marketing", "fiction", "fiction fantasy", "fiction scifi"
    ), `My Review` = c(NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_), `Date Started` = structure(c(13577, 
    13610, 13634, 13684, 13722), class = "Date"), `Date Finished` = structure(c(13623, 
    13647, 13660, 13689, 13784), class = "Date"), Pages = c("522", 
    "700", "300", "145", "700")), .Names = c("Title", "Author", "ISBN", 
    "My Rating", "Average Rating", "Publisher", "Binding", "Year Published", 
    "Original Publication Year", "Date Read", "Date Added", "Bookshelves", 
    "My Review", "Date Started", "Date Finished", "Pages"), row.names = c(NA, 
    -5L), spec = structure(list(cols = structure(list(Title = structure(list(), class = c("collector_character", 
    "collector")), Author = structure(list(), class = c("collector_character", 
    "collector")), ISBN = structure(list(), class = c("collector_character", 
    "collector")), `My Rating` = structure(list(), class = c("collector_integer", 
    "collector")), `Average Rating` = structure(list(), class = c("collector_double", 
    "collector")), Publisher = structure(list(), class = c("collector_character", 
    "collector")), Binding = structure(list(), class = c("collector_character", 
    "collector")), `Year Published` = structure(list(), class = c("collector_integer", 
    "collector")), `Original Publication Year` = structure(list(), class = c("collector_integer", 
    "collector")), `Date Read` = structure(list(), class = c("collector_character", 
    "collector")), `Date Added` = structure(list(), class = c("collector_character", 
    "collector")), Bookshelves = structure(list(), class = c("collector_character", 
    "collector")), `My Review` = structure(list(), class = c("collector_character", 
    "collector"))), .Names = c("Title", "Author", "ISBN", "My Rating", 
    "Average Rating", "Publisher", "Binding", "Year Published", "Original Publication Year", 
    "Date Read", "Date Added", "Bookshelves", "My Review")), default = structure(list(), class = c("collector_guess", 
    "collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df", 
    "tbl", "data.frame"))
    

    Simple Example - no formatting

    For completeness, here is how I would approach the problem while avoiding the formula construction issues altogether.

    gp <- ggplot(books.2007)
    gp <- gp + geom_dumbbell( aes(x = `Date Started`, 
                                  xend = `Date Finished`, 
                                  y = ISBN, 
                                  size = as.numeric(Pages)), 
                              size_x = 0, size_xend = 0)
    # sep = "\n" puts the author on its own line without stray spaces
    plainLabel <- paste(books.2007$Title, books.2007$Author, sep = "\n")
    gp <- gp + geom_label(aes(x = `Date Started`,
                              y = ISBN),
                          label = plainLabel,
                          vjust = 1.5, hjust = "inward", parse = FALSE)
    gp <- gp + labs(size = "Book Size")
    gp
    

    Plot Output

    [image: plot output]