r ggplot2 data-visualization area radar-chart

How to measure the area of a polygon in ggplot2?


Hi everyone, I have a number of samples, and for each one I would like to draw a polygon that illustrates the shape of the data. My data look like this:

01 0.31707317
02 0.12195122
03 0.09756098
04 0.07317073
05 0.07317073
06 0.07317073
07 0.07317073
08 0.07317073
09 0.04878049
10 0.04878049

I can easily draw a radar chart of these values using radarchart.

But I am trying to measure the area of the resulting shape and use it as a measure of the shape of the data. This is where I struggle.

I tried to save the resulting figure as a vector and use its points, but it looks like I cannot store the chart in a vector. Then I tried the rgdal package to export my figure as a shapefile and use the coordinates from there:

coorddf <- SpatialPointsDataFrame(
  radarchart(as.data.frame(ttradar), pcol = rgb(0.2, 0.5, 0.5), pfcol = rgb(0.2, 0.5, 0.5, 0.2)),
  data = radarchart(as.data.frame(ttradar), pcol = rgb(0.2, 0.5, 0.5), cglcol = "white", pfcol = rgb(0.2, 0.5, 0.5, 0.2))
)

writeOGR(coorddf, dsn = '.', layer = 'mypoints', driver = "ESRI Shapefile")

This was not a good idea, because my data do not have values that can be used as latitude and longitude points.

Any suggestions?


Solution

  • To expand on @G5W's excellent point:

    library(dplyr)
    library(ggplot2)
    
    df <- structure(
      list(
        V1 = 1:10,
        V2 = c(
          0.31707317,
          0.12195122,
          0.09756098,
          0.07317073,
          0.07317073,
          0.07317073,
          0.07317073,
          0.07317073,
          0.04878049,
          0.04878049
        )
      ),
      .Names = c("V1", "V2"),
      class = "data.frame",
      row.names = c(NA, -10L)) 
    

    The polygon splits into triangles that share the center as a vertex. You can calculate the area of each triangle from a value and its neighbor to the right using dplyr::lead:

    areas <- df %>% 
      setNames(c("variable", "value")) %>% 
      mutate(nextval = lead(value, default = value[1]),
             angle   = (1/10) * (2*pi),  # 2*pi/n radians; replace 10 with your number of variables
             area    = value*nextval*sin(angle)/2)
    
       variable      value    nextval     angle         area
    1         1 0.31707317 0.12195122 0.6283185 0.0113640813
    2         2 0.12195122 0.09756098 0.6283185 0.0034966406
    3         3 0.09756098 0.07317073 0.6283185 0.0020979843
    4         4 0.07317073 0.07317073 0.6283185 0.0015734881
    5         5 0.07317073 0.07317073 0.6283185 0.0015734881
    6         6 0.07317073 0.07317073 0.6283185 0.0015734881
    7         7 0.07317073 0.07317073 0.6283185 0.0015734881
    8         8 0.07317073 0.04878049 0.6283185 0.0010489921
    9         9 0.04878049 0.04878049 0.6283185 0.0006993281
    10       10 0.04878049 0.31707317 0.6283185 0.0045456327
    

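    A couple of things: notice that I used default = value[1] so that the NA that lead() would otherwise produce for the last row wraps around to the first value instead. Also, sin() expects the angle in radians, so the angle between neighboring variables is just (1/n) * 2*pi. Each triangle's area is then half the product of its two sides times the sine of the angle between them. Now that we have all the triangle areas, we can add them: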

    areas %>% summarise(total = sum(area))
    
           total
    1 0.02954661
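
    The same sum can be wrapped in a small base-R helper, which also makes it easy to cross-check against the shoelace formula applied to the polygon's Cartesian vertices. This is only a sketch: radar_area is a hypothetical name, and it assumes non-negative values placed on equally spaced spokes, in the order given.

    ```r
    # Hypothetical helper (not from any package): area of a radar polygon
    # whose vertices sit on n equally spaced spokes.
    radar_area <- function(values) {
      n <- length(values)
      stopifnot(n >= 3, all(values >= 0))
      angle   <- 2 * pi / n                   # angle between neighboring spokes, in radians
      nextval <- c(values[-1], values[1])     # neighbor to the right, wrapping around
      sum(values * nextval * sin(angle) / 2)  # sum of the triangle areas
    }

    v <- c(0.31707317, 0.12195122, 0.09756098, rep(0.07317073, 5),
           0.04878049, 0.04878049)
    radar_area(v)   # should reproduce the total shown above

    # Cross-check: convert each vertex to x/y and apply the shoelace formula.
    theta <- 2 * pi * (seq_along(v) - 1) / length(v)
    x  <- v * cos(theta)
    y  <- v * sin(theta)
    xn <- c(x[-1], x[1])
    yn <- c(y[-1], y[1])
    shoelace <- abs(sum(x * yn - xn * y)) / 2
    all.equal(radar_area(v), shoelace)
    ```

    Keep in mind that each term multiplies two neighboring values, so the area depends on the order in which the variables are arranged around the chart.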
    

    This approach is easily extended to multiple groups to compare.

    df <- expand.grid(var = 1:8, grp = c("a", "b")) %>% 
      mutate(value = runif(length(var), 0.25, 1)) %>%  # random values, so your numbers will differ
      group_by(grp) %>% 
      mutate(nextval = lead(value, default = value[1]),
             angle = (1/8)*(2*pi),
             area = value*nextval*sin(angle)/2) %>% 
      mutate(total = sum(area)) 
    
    # A tibble: 16 x 7
    # Groups:   grp [2]
         var    grp     value   nextval     angle       area     total
       <int> <fctr>     <dbl>     <dbl>     <dbl>      <dbl>     <dbl>
     1     1      a 0.3101167 0.6831233 0.7853982 0.07489956 0.5689067
     2     2      a 0.6831233 0.4166692 0.7853982 0.10063417 0.5689067
     3     3      a 0.4166692 0.4756976 0.7853982 0.07007730 0.5689067
     4     4      a 0.4756976 0.3426595 0.7853982 0.05763002 0.5689067
     5     5      a 0.3426595 0.3107870 0.7853982 0.03765135 0.5689067
     6     6      a 0.3107870 0.3001208 0.7853982 0.03297721 0.5689067
     7     7      a 0.3001208 0.9039894 0.7853982 0.09592115 0.5689067
     8     8      a 0.9039894 0.3101167 0.7853982 0.09911594 0.5689067
     9     1      b 0.9888119 0.3481213 0.7853982 0.12170243 1.1749789
    10     2      b 0.3481213 0.8513316 0.7853982 0.10478143 1.1749789
    11     3      b 0.8513316 0.9928401 0.7853982 0.29883611 1.1749789
    12     4      b 0.9928401 0.6372992 0.7853982 0.22370605 1.1749789
    13     5      b 0.6372992 0.8303906 0.7853982 0.18710303 1.1749789
    14     6      b 0.8303906 0.3607232 0.7853982 0.10590379 1.1749789
    15     7      b 0.3607232 0.2786354 0.7853982 0.03553575 1.1749789
    16     8      b 0.2786354 0.9888119 0.7853982 0.09741033 1.1749789
    
    df %>% 
      ggplot(aes(var, value)) + 
      geom_polygon() +
      geom_text(aes(0,0, label = round(total, 2)), color = "white") +
      facet_grid(~grp) +
      scale_y_continuous("", limits = c(0, 1), expand = c(0,0)) +
      scale_x_continuous("", breaks = 1:8, expand = c(0,0)) +
      theme_minimal() +
      coord_radar()
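
    If you only need one total per group rather than a column repeated on every row, a summarise() variant might look like the sketch below. The names radar and totals and the fixed seed are assumptions made for illustration; n() supplies the number of variables in each group, so the angle does not have to be typed in by hand.

    ```r
    library(dplyr)

    set.seed(1)  # assumption: fixed seed only so the sketch is repeatable

    # Same simulated layout as above: 8 variables for each of two groups.
    radar <- expand.grid(var = 1:8, grp = c("a", "b")) %>%
      mutate(value = runif(n(), 0.25, 1))

    # One row per group; lead() wraps around to the group's first value.
    totals <- radar %>%
      group_by(grp) %>%
      summarise(
        n_vars = n(),
        total  = sum(value * lead(value, default = first(value)) * sin(2 * pi / n()) / 2)
      )

    totals
    ```

    The resulting table can then be joined back to the plotting data or compared across groups directly.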
    




    If you're doing a lot of these, it's worth looking at the ggradar package: http://www.ggplot2-exts.org/ggradar.html

    Since this was a one-off, I used a polar coordinate modification from Erwan Le Pennec, which provides the coord_radar() function called in the plot above: http://www.cmap.polytechnique.fr/~lepennec/R/Radar/RadarAndParallelPlots.html

    coord_radar <- function (theta = "x", start = 0, direction = 1) 
    {
      theta <- match.arg(theta, c("x", "y"))
      r <- if (theta == "x") 
        "y"
      else "x"
      ggproto("CoordRadar", CoordPolar, theta = theta, r = r, start = start, 
              direction = sign(direction),
              is_linear = function(coord) TRUE)
    }
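
    The only change from coord_polar() in this function is the is_linear override. As far as ggplot2's internals go, edges drawn in a non-linear coordinate system are interpolated into curves, while a coordinate system that reports itself as linear keeps them straight, and the triangle sum above assumes straight edges between neighboring points. A minimal sketch to inspect what each coordinate object reports (the variable names are just for illustration, and behavior may vary between ggplot2 versions):

    ```r
    library(ggplot2)

    # coord_radar() as defined above, repeated so this sketch runs on its own.
    coord_radar <- function(theta = "x", start = 0, direction = 1) {
      theta <- match.arg(theta, c("x", "y"))
      r <- if (theta == "x") "y" else "x"
      ggproto("CoordRadar", CoordPolar, theta = theta, r = r, start = start,
              direction = sign(direction),
              is_linear = function(coord) TRUE)
    }

    polar_is_linear <- coord_polar()$is_linear()  # stock polar coordinates
    radar_is_linear <- coord_radar()$is_linear()  # the override above
    c(polar = polar_is_linear, radar = radar_is_linear)
    ```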