Tags: r, ggplot2, geometry, trigonometry, polar-coordinates

In ggplot, how to draw a circle/disk with a line that divides its area according to a given ratio and colored points inside?


I want to visualize proportions using points inside a circle. For example, let's say that I have 100 points that I wish to scatter (somewhat randomly jittered) in a circle.

[Figure: 100 points scattered in a circle, black and white]

Next, I want to use this diagram to represent the proportion of people who voted Biden/Harris in the 2020 US presidential election, in each state.

Example #1 -- Michigan
Biden got 50.62% of Michigan's votes. I'm going to draw a horizontal diameter that splits the circle into two halves, and then color the points under the diameter in blue (the Democrats' color).

[Figure: Michigan]


Example #2 -- Wyoming
Unlike Michigan, in Wyoming Biden got only 26.55% of the votes, which is approximately a quarter of the vote. In this case I'd draw a horizontal chord that divides the circle such that the disk's area under the chord is 25% of the entire disk area. Then I'll color the respective points in that area in blue. Since I have 100 points in total, 25 points represent the 25% who voted Biden in Wyoming.

[Figure: Wyoming]


My question: How can I do this with ggplot? I researched this issue, and there's a lot of geometry going on here. First, the kind of area I'm talking about is called a "circular segment". Second, there are many formulas to calculate its area, if we know some other parameters about the shape (such as the radius length, etc.). See this nice demo.

However, my goal isn't to solve geometry problems, but just to represent proportions in a very specific way:

  1. draw a circle
  2. sprinkle X number of points inside
  3. draw a (real or invisible) horizontal line that divides the circle/disk area according to a given proportion
  4. ensure that the points are arranged according to the split. That is, if we want to represent a 30%-70% split, then 30% of the points should fall under the line that divides the disk.
  5. color the points under the line.

I understand that this is a somewhat exotic visualization, but I'd be thankful for any help with this.


EDIT


I've found a reference to a JavaScript package that does something very similar to what I'm asking.


Solution

  • I took a crack at this for fun. There's a lot more that could be done. I agree that this is not a great way to visualize proportions, but if it's engaging your audience ...

    Formulas for determining appropriate heights are taken from Wikipedia. In particular we need the formulas

    a/A = (theta - sin(theta))/(2*pi)
    h = 1-cos(theta/2)
    

    where a is the area of the segment; A is the whole area of the circle; theta is the angle described by the arc that defines the segment (see Wikipedia for pictures); and h is the height of the segment. The formula for h as written assumes a circle of radius 1, which is what the code below uses.

    Machinery for finding heights.

    afun <- function(x) (x-sin(x))/(2*pi)
    ## curve(afun, from=0, to = 2*pi)
    find_a <- function(a) {
        uniroot(
            function(x) afun(x) -a,
            interval=c(0, 2*pi))$root
    }
    find_h <- function(a) {
        1- cos(find_a(a)/2)
    }
    vfind_h <- Vectorize(find_h)
    ## find_a(0.5)
    ## find_h(0.5)
    ## curve(vfind_h(x), from = 0, to= 1)
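
    As a quick check of the height machinery, the segment's area can also be computed directly from h and compared with the target proportion. This is only a sketch: `seg_frac()` is a helper name made up here, it uses the standard segment-area formula for a unit circle, and `tol` is tightened from the `uniroot()` default so that the comparison is meaningful.

    ```r
    ## afun/find_a/find_h repeated from above so this snippet runs on its own
    afun <- function(x) (x - sin(x)) / (2 * pi)
    find_a <- function(a) {
        uniroot(function(x) afun(x) - a,
                interval = c(0, 2 * pi), tol = 1e-10)$root
    }
    find_h <- function(a) 1 - cos(find_a(a) / 2)

    ## seg_frac() (hypothetical name): share of a unit disk's area that lies
    ## in a circular segment of height h, valid for 0 <= h <= 2
    seg_frac <- function(h) {
        (acos(1 - h) - (1 - h) * sqrt(2 * h - h^2)) / pi
    }

    target <- c(0.1, 0.25, 0.5, 0.75)
    h <- sapply(target, find_h)
    ## the "recovered" column should reproduce "target"
    check <- cbind(target, h, recovered = seg_frac(h))
    print(round(check, 4))
    ```

    The `recovered` column should agree with `target`, and a target of 0.5 should give h = 1 (the chord is then a diameter).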
    

    Set up a circle.

    dd <- data.frame(x=0,y=0,r=1)
    library(ggforce)
    library(ggplot2); theme_set(theme_void())
    gg0 <- ggplot(dd) + geom_circle(aes(x0=x,y0=y,r=r)) + coord_fixed()
    

    Finish: scatter the points and colour them by section.

    props <- c(0.2,0.5,0.3)  ## proportions
    n <- 100                 ## number of points to scatter
    cprop <- cumsum(props)[-length(props)]
    h <- vfind_h(cprop)
    set.seed(101)
    r <- runif(n)
    th <- runif(n, 0, 2 * pi)
      
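    ## sqrt(r) spreads the points uniformly over the disk's area;
    ## th is already in radians, so it is not scaled by 2*pi again
    dd2 <- data.frame(x = sqrt(r) * cos(th),
                      y = sqrt(r) * sin(th))
    ## bin by chord height, measured down from the top of the circle (y = 1)
    dd2$g <- cut(dd2$y, c(1, 1-h, -1))
    gg0 + geom_point(data=dd2, aes(x, y, colour = g), size=3)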
    

    There are a bunch of tweaks that would make this better: meaningful names for the categories; reversing the axis order to match the plot; maybe adding segments delimiting the sections or (more work) polygons so you can shade the sections.
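
    Two of those tweaks (named categories and chords delimiting the sections) might look roughly like the sketch below. The category names in `labs` are placeholders invented for illustration, and the chord heights are measured up from the bottom of the circle here, so the first proportion ends up in the lowest section.

    ```r
    library(ggplot2)
    library(ggforce)

    ## segment height for a share `a` of a unit disk (same formulas as above)
    find_h <- function(a) {
        theta <- uniroot(function(x) (x - sin(x)) / (2 * pi) - a,
                         interval = c(0, 2 * pi))$root
        1 - cos(theta / 2)
    }

    props <- c(0.2, 0.5, 0.3)
    labs  <- c("A", "B", "C")          ## placeholder category names
    n     <- 100

    cprop <- cumsum(props)[-length(props)]
    y_cut <- -1 + sapply(cprop, find_h)   ## chord heights, bottom to top

    set.seed(101)
    r  <- sqrt(runif(n))                  ## sqrt gives uniform density on the disk
    th <- runif(n, 0, 2 * pi)
    pts <- data.frame(x = r * cos(th), y = r * sin(th))
    pts$g <- cut(pts$y, breaks = c(-1, y_cut, 1),
                 labels = labs, include.lowest = TRUE)

    ## each chord runs between the two points where y = y_cut meets the circle
    chords <- data.frame(y = y_cut, half = sqrt(1 - y_cut^2))

    gg <- ggplot(pts) +
        geom_circle(aes(x0 = 0, y0 = 0, r = 1), data = data.frame(id = 1)) +
        geom_segment(aes(x = -half, xend = half, y = y, yend = y),
                     data = chords, linetype = 2) +
        geom_point(aes(x, y, colour = g), size = 3) +
        coord_fixed() + theme_void()
    gg
    ```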

    You should definitely check this for mistakes — e.g. there are places where I may have used a set of values where I should have used their first differences, or vice versa (values vs cumulative sum). But this should get you started.

    [Figure: circle with points representing proportions]
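
    One caveat: with purely random points, the share that lands under a chord matches the target only approximately. If the counts need to be exact (point 4 in the question), one possible approach is to sample each side of the chord separately by rejection. The sketch below assumes a single split; `sample_region()` is a hypothetical helper written for this example, not a ggplot2 or ggforce function, and the 25% share is the rounded Wyoming figure from the question.

    ```r
    library(ggplot2)
    library(ggforce)

    ## segment height for a share `a` of a unit disk (same formulas as above)
    find_h <- function(a) {
        theta <- uniroot(function(x) (x - sin(x)) / (2 * pi) - a,
                         interval = c(0, 2 * pi))$root
        1 - cos(theta / 2)
    }

    ## hypothetical helper: exactly k uniform points in the unit disk whose
    ## y coordinate satisfies keep(y), labelled with `label`
    sample_region <- function(k, keep, label) {
        out <- data.frame(x = numeric(0), y = numeric(0))
        while (nrow(out) < k) {
            r  <- sqrt(runif(200))
            th <- runif(200, 0, 2 * pi)
            cand <- data.frame(x = r * cos(th), y = r * sin(th))
            out <- rbind(out, cand[keep(cand$y), ])   ## reject points outside the region
        }
        out <- out[seq_len(k), ]
        out$g <- rep(label, k)
        out
    }

    set.seed(101)
    p <- 0.25                      ## share to show under the chord
    n <- 100
    k <- round(n * p)
    y_cut <- -1 + find_h(p)        ## chord height, measured up from the bottom
    pts <- rbind(
        sample_region(k,     function(y) y <  y_cut, "below"),
        sample_region(n - k, function(y) y >= y_cut, "above"))
    half <- sqrt(1 - y_cut^2)      ## half-length of the chord

    gg <- ggplot(pts) +
        geom_circle(aes(x0 = 0, y0 = 0, r = 1), data = data.frame(id = 1)) +
        annotate("segment", x = -half, xend = half, y = y_cut, yend = y_cut) +
        geom_point(aes(x, y, colour = g), size = 3) +
        scale_colour_manual(values = c(below = "blue", above = "grey60")) +
        coord_fixed() + theme_void()
    gg
    ```

    Within each region the points are still uniformly distributed; only the number of points per region is fixed in advance. For very small shares the loop needs more draws, because most candidates are rejected.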