
ggplot: Correct ordering of x-axis for a geom_point with facet_grid


I am struggling to order the x-axis correctly for a scatter plot. I would like the discrete x-axis labels to be ordered by increasing value of a numeric variable, using only the values from one particular group of a second discrete factor. The plot should also be split with facet_grid (or facet_wrap, if that is better in this case?) by another discrete factor. I hope that makes sense; if not, the example below should make it clearer.

There seem to be a couple of useful help pages online, and I'm sure the answer is in there somewhere, but I just can't seem to apply it to my case.

Here is my example dataset...

Car = c("A","A","A","B","B","C","C","D","D","E","E","F","F","G","G","G","H","H","H","H","I","I","J","J","J","K","K","K","L","L","M","M","N","N","N","O","O","P","P","Q","Q","R","R","S","S","T","T","U","U","U","V","V","V","V","X","X","X")
Area = c("MMR","QRT","VF","QRT","VF","MMR","QRT","MMR","QRT","MMR","QRT","QRT","VF","MMR","QRT","VF","MMR","QRT","PP","VF","QRT","VF","QRT","PP","VF","MMR","QRT","VF","QRT","VF","QRT","VF","MMR","QRT","VF","QRT","VF","QRT","VF","QRT","VF","MMR","QRT","MMR","QRT","MMR","QRT","MMR","QRT","VF","MMR","QRT","PP","VF","MMR","QRT","VF")
Distance = c(100,0.0022,1320,0.002,1056,1030,0.025,62.1,0.06,80,0.011,7.2,100,671,91.677,165,0.61,0.1102,0.08,11.5,0.173,327,0.159,0.82,0.01902,10,0.0079,23,0.186,0.02235,0.038,0.022,100,0.016,0.01359,0.18,0.02291,0.00048,1000,0.007,8.21,1000,0.0349,100,0.0056,100,0.022,100,0.05,13,17.9,0.032,0.22,87,100,0.09,0.0251)
Country = c("UK","UK","UK","UK","UK","UK","UK","UK","UK","UK","UK","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM")
df=data.frame(Car, Area, Distance, Country)
df

I wish to have a plot with 'Car' on the x-axis and 'Distance' on the y-axis. I would like the plot to be split by 'Country' using facet_grid, and within each facet I'd like the x-axis to be ordered by increasing Distance for the 'QRT' level of the 'Area' factor.

The following code produces the plot I am aiming for (apart from the x-axis ordering issue):

library(ggplot2)

Fig2B <- ggplot(df, aes(x = Car, y = Distance, colour = Area)) +
  coord_trans(y = "log10") +
  geom_point() +
  facet_grid(. ~ Country, scales = "free", space = "free")
Fig2B

[Figure: plot without the correct x-axis order]

The closest I have gotten to re-ordering this is through the following helpful post.

Using the following code I can create a new factor that appears to order it correctly.

library(dplyr)

# 1. Remove grouping
ungroup(df) %>%
  # 2. Arrange by
  #   i.  facet group
  #   ii. bar height
  arrange(Country, Distance, Area) %>%
  # 3. Add order column of row numbers
  mutate(order = row_number())

However, I cannot work out how to take this to the next stage and use it in my plot with the code from the article. I get the following message:

Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 0, 57

I'm now not sure where to go from here.


Solution

  • "I can create a new factor that appears to order it correctly."

    This is the right goal.

    "I'd like the x-axis to be ordered by increasing distance of 'QRT' in the 'Area' factor"

    Okay, so we need this ordering.

    library(dplyr)

    order = 
        ## filter down to just QRT
        filter(df, Area == "QRT") %>%
        ## get mean distance for each car (just in case there are
        ## multiple QRT values for a single car - more general than your example)
        group_by(Car) %>%
        summarize(qrtdist = mean(Distance)) %>%
        ## sort ascending
        arrange(qrtdist) %>%
        ## make the Car column a character
        mutate(Car = as.character(Car))
    

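    So the Car column of this new order data set should have the correct ordering. Now we apply this ordering to the original data as the factor levels of Car. The ordering is global rather than per country, but because scales = "free" drops the levels that are unused in a facet, each panel shows only its own cars, in this order, and the plot will work as desired: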

    df$Car = factor(df$Car, levels = order$Car)
    
    ggplot(df,aes(x=Car,y=Distance,colour=Area)) + 
      coord_trans(y = "log10") +
      geom_point() +
      facet_grid(. ~ Country, scales = "free", space="free")
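
    One caveat with this approach: a car that has no QRT row would be missing from order$Car and so would become NA when the factor is built. Every car in the question's data appears to have a QRT value, so this does not bite here. A minimal sketch of one way to guard against it, using hypothetical toy data (toy, qrt_order and all_levels are illustrative names, not part of the question):

    ```r
    library(dplyr)

    # Hypothetical toy data: car "C" has no QRT measurement
    toy <- data.frame(
      Car      = c("A", "A", "B", "B", "C"),
      Area     = c("QRT", "VF", "QRT", "VF", "VF"),
      Distance = c(5, 100, 1, 200, 50)
    )

    # Same steps as above: mean QRT distance per car, sorted ascending
    qrt_order <- toy %>%
      filter(Area == "QRT") %>%
      group_by(Car) %>%
      summarize(qrtdist = mean(Distance)) %>%
      arrange(qrtdist) %>%
      mutate(Car = as.character(Car))

    # Cars with a QRT value come first; any remaining cars are appended
    all_levels <- union(qrt_order$Car, as.character(toy$Car))
    toy$Car <- factor(toy$Car, levels = all_levels)

    levels(toy$Car)
    ```

    Here the levels come out as "B", "A", "C": the two cars with QRT values in ascending order, followed by the car without one.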
    

    Using base

    The above was the fancy dplyr way, but we can actually simplify a lot in this case using base. There is a command reorder() for reordering levels of a factor by a function of some other variable.
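
    As a quick illustration of what reorder() does on its own, here is a small sketch with made-up values (size, value and size2 are hypothetical names, unrelated to the question's data):

    ```r
    # Hypothetical toy data
    size  <- factor(c("a", "a", "b", "b", "c", "c"))
    value <- c(10, 20, 1, 3, 5, 7)

    # Default levels are alphabetical
    levels(size)

    # Reorder the levels by the mean of `value` within each level
    # (group means: a = 15, b = 2, c = 6)
    size2 <- reorder(size, value, FUN = mean)
    levels(size2)
    ```

    The reordered levels are "b", "c", "a", i.e. ascending by group mean.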

    In this case, we want to reorder the df$Car factor, using the values of df$Distance where df$Area is "QRT".

    df$Car = reorder(
        # factor to reorder
        df$Car,  
        # vector that is Distance when Area is "QRT" and NA otherwise
        ifelse(df$Area == "QRT", df$Distance, NA),
        # function of that vector
        FUN = mean,
        # additional FUN argument: remove NA values
        na.rm = TRUE
    )
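
    To see what the masked vector passed to reorder() looks like, here is a small sketch on hypothetical toy data (toy and qrt_only are illustrative names):

    ```r
    # Hypothetical toy data with two cars and two areas
    toy <- data.frame(
      Car      = factor(c("A", "A", "B", "B")),
      Area     = c("QRT", "VF", "QRT", "VF"),
      Distance = c(5, 100, 1, 200)
    )

    # Distance where Area is "QRT", NA everywhere else
    qrt_only <- ifelse(toy$Area == "QRT", toy$Distance, NA)
    qrt_only

    # The mean per car ignores the NA entries, so only QRT distances count
    toy$Car <- reorder(toy$Car, qrt_only, FUN = mean, na.rm = TRUE)
    levels(toy$Car)
    ```

    The masked vector is 5, NA, 1, NA, and the resulting levels are "B", "A", since B has the smaller QRT distance.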
    

    Without all the comments, we can do this:

    df$Car = reorder(df$Car, ifelse(df$Area == "QRT", df$Distance, NA), mean, na.rm = TRUE)
    
    ggplot(df,aes(x=Car,y=Distance,colour=Area)) + 
      coord_trans(y = "log10") +
      geom_point() +
      facet_grid(. ~ Country, scales = "free", space="free")
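
    On the error message in the question: the plotting call that failed is not shown, so this is an assumption, but a likely cause is that the result of the arrange() / mutate() pipeline was never assigned. In that case df has no order column, and a reference to order in the plot would pick up base R's order() function instead, which would match the "object of type function" message. A minimal sketch with hypothetical toy data (toy and toy_ordered are illustrative names):

    ```r
    library(dplyr)

    # Hypothetical toy data standing in for the question's df
    toy <- data.frame(
      Car      = c("A", "B", "C", "D"),
      Distance = c(3, 1, 4, 2),
      Country  = c("UK", "UK", "FR", "FR")
    )

    # Without assignment the result is only printed; toy itself is unchanged
    toy %>% arrange(Country, Distance) %>% mutate(order = row_number())
    "order" %in% names(toy)          # FALSE

    # Assigning the result keeps the new column for later use
    toy_ordered <- toy %>%
      arrange(Country, Distance) %>%
      mutate(order = row_number())
    "order" %in% names(toy_ordered)  # TRUE
    ```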