Search code examples
rggplot2stacked

Stacked density chart from SQL query


I have a result from an SQL query that looks like this (a test query with no particular meaning):

 week       cash      ccard      fcard      mobile     total   
    9       3.45       0.00       0.00       0.00       3.45
   10      13.02      17.18       4.32      21.24      55.76
   11      47.61      24.52      12.32      32.18     116.63
   12      21.32      61.96      17.32       1.40     102.00
   13     181.80       1.70     275.20       3.50     462.20
   14     390.14     191.80      10.08     100.40     692.42
   15     102.40     207.80     101.40       0.00     411.60

The result of this query goes into a data frame which I want to plot as a stacked density chart where 'week' should be on the X axis and Y axies would be fractions 'cash/total', 'ccard/total' and so on. How would I do that? I googled but all the examples I found so far do not seem to applicable to SQL outputs.

Thanks in advance...


Solution

  • In general, ggplot2 prefers data to be in a "long" format, where it is currently in a "wide" format. In SQL terms, this is a PIVOT, though I find using tidyr::pivot_* and data.table::melt and ::dcast to be much easier to use than in SQL.

    By that, I mean:

    library(dplyr)
    library(tidyr) # just for pivot_longer
    dat <- pivot_longer(dat, cash:mobile) %>%
      mutate(pct = (value / total))
    dat
    # # A tibble: 28 x 5
    #     week  total name   value    pct
    #    <int>  <dbl> <chr>  <dbl>  <dbl>
    #  1     9   3.45 cash    3.45 1     
    #  2     9   3.45 ccard   0    0     
    #  3     9   3.45 fcard   0    0     
    #  4     9   3.45 mobile  0    0     
    #  5    10  55.8  cash   13.0  0.234 
    #  6    10  55.8  ccard  17.2  0.308 
    #  7    10  55.8  fcard   4.32 0.0775
    #  8    10  55.8  mobile 21.2  0.381 
    #  9    11 117.   cash   47.6  0.408 
    # 10    11 117.   ccard  24.5  0.210 
    # # ... with 18 more rows
    

    With that, you can do

    library(ggplot2)
    # library(scales) # percent
    ggplot(dat, aes(week, pct, fill=name)) +
      geom_density(position="fill", stat="identity") +
      scale_y_continuous(labels = scales::percent)
    

    ggplot2 stacked density

    (I should add that the "density" nature of this plot is a little deceiving: there is the suggestion of data in between the weekly points. Since the x-axis is effectively discrete with a low "n", I'd suggest a barplot as @RyanJohn suggests.)