
Trying to Add Loess Smoothing Curve to Scatterplot


I am trying to add a loess smoothed fit curve to my scatterplot in R, but I can't figure out what's wrong with my code below. For reference, poverty and binge_all are columns of the data frame correlational_data, and I have loaded the ggplot2 package.

library(ggplot2)    

p <- ggplot(correlational_data, aes(poverty, binge_all))
p <- p + geom_point(color = "blue")
p <- p + geom_smooth(method = "loess")
p

I used sapply(correlational_data$poverty, class) and sapply(correlational_data$binge_all, class) to determine that poverty and binge_all are of class factor. Not sure if that makes a difference.

Update to show first 10 rows of data

head(correlational_data, 10)
   year                state binge_all poverty
1  2012              Alabama      12.3      19
2  2012               Alaska      16.8    10.1
3  2012              Arizona      15.3    18.7
4  2012             Arkansas      11.8    19.8
5  2012           California      16.9      17
6  2012             Colorado      19.2    13.7
7  2012          Connecticut      17.5    10.7
8  2012             Delaware      18.6      12
9  2012 District of Columbia      23.1    18.2
10 2012              Florida      16.5    17.1

Solution

  • As others pointed out in the comments, binge_all and poverty need to be numeric, not factor, because a loess fit requires numeric x and y values. Here I plot the data using the code and example data you provided.

    # Create example data frame
    correlational_data <- read.table(text = "   year                state binge_all poverty
    1  2012              Alabama      12.3      19
    2  2012               Alaska      16.8    10.1
    3  2012              Arizona      15.3    18.7
    4  2012             Arkansas      11.8    19.8
    5  2012           California      16.9      17
    6  2012             Colorado      19.2    13.7
    7  2012          Connecticut      17.5    10.7
    8  2012             Delaware      18.6      12
    9  2012 'District of Columbia'      23.1    18.2
    10 2012              Florida      16.5    17.1",
        header = TRUE, stringsAsFactors = FALSE)
    
    # Check the class of each column (console output shown as comments)
    class(correlational_data$binge_all)
    # [1] "numeric"
    class(correlational_data$poverty)
    # [1] "numeric"
    
    # Plot the data   
    library(ggplot2)
    
    p <- ggplot(correlational_data, aes(poverty, binge_all))
    p <- p + geom_point(color = "blue")
    p <- p + geom_smooth(method = "loess")
    p
    


    Note that to convert a factor column to numeric, you should convert it to character first. For example:

    correlational_data$binge_all <- as.numeric(as.character(correlational_data$binge_all))
    correlational_data$poverty <- as.numeric(as.character(correlational_data$poverty))
    

    This ensures that you convert the actual values, not the factor's internal integer level codes.
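
    To see why going through character matters, here is a minimal sketch using a small made-up factor (poverty_f is a hypothetical vector for illustration, not your actual column). Calling as.numeric() directly on a factor returns the position of each value among the sorted levels, which is rarely what you want.

    ```r
    # Hypothetical factor holding numbers stored as text (illustration only)
    poverty_f <- factor(c("19", "10.1", "18.7", "19.8", "17"))

    # as.numeric() on a factor returns the internal integer level codes
    codes <- as.numeric(poverty_f)

    # Converting to character first recovers the actual numbers
    values <- as.numeric(as.character(poverty_f))

    print(levels(poverty_f))  # levels are sorted as text
    print(codes)              # level codes, not the original values
    print(values)             # the original numeric values
    ```

    The R documentation for factor also mentions as.numeric(levels(f))[f] as an equivalent, slightly more efficient form. Either way, any entry that cannot be parsed as a number becomes NA with a warning, which can help you find whatever caused the column to be read as a factor in the first place.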