
Marketing Channel Flow Map


I have data on all marketing engagements (links clicked, etc.), their 'marketing channel', and their 'engagement position'.

The engagement positions are: first touch [the first time they ever engage with us], lead create [when they fill in a form and give us enough info], opportunity create [the engagement that happened right before an opportunity was created], and closed won [the engagement that happened right before they signed and purchased].

What I want to do is take these 'paths' through our marketing channels and create a flow map showing every marketing path someone has taken.

The data I have contains an ID, the channel, and the position for each engagement, like so:

______________________________
| id  |  channel  | position |
| 1   | direct    | FT       |
| 1   | SEM       | LC       |
| 1   | email     | OC       |
| 1   | video     | CW       |
______________________________

That would be an example of one prospect's 'marketing path', and I have a couple hundred thousand of those unique paths. This particular lead would have gone direct > SEM > email > video, and this would be 1 path.

I'd like to map this out with the channels as the 'destinations' and the positions determining the order of movement, with the most common path being the boldest (or brightest) and the least common being the least bold (or flattest color), probably done in ggplot2.

I understand this is a bit broad, but I have very limited experience in visualizing a 'mapping' type of data set, so I don't even know which packages would be useful to me.

I am using R


Solution

  • Here's a try using ggplot. First, make some example data:

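    library(tidyverse)
    set.seed(1)  # make the random example reproducible

    # One table per engagement position: 100 ids, each assigned a random channel.
    # The numeric prefix on position ("1-FT", ...) keeps the x axis in funnel order.
    tbl1 <- tibble(
      id = 1:100,
      channel = sample(c("direct", "SEM", "email", "video"),
                       size = 100, replace = TRUE, prob = c(.1, .2, .3, .4)),
      position = "1-FT")
    tbl2 <- tibble(
      id = 1:100,
      channel = sample(c("direct", "SEM", "email", "video"),
                       size = 100, replace = TRUE, prob = c(.2, .1, .3, .4)),
      position = "2-LC")
    tbl3 <- tibble(
      id = 1:100,
      channel = sample(c("direct", "SEM", "email", "video"),
                       size = 100, replace = TRUE, prob = c(.3, .2, .1, .4)),
      position = "3-OC")
    tbl4 <- tibble(
      id = 1:100,
      channel = sample(c("direct", "SEM", "email", "video"),
                       size = 100, replace = TRUE, prob = c(.4, .3, .2, .1)),
      position = "4-CW")

    # Stack into the long format from the question: one row per id and position
    tbl <- bind_rows(tbl1, tbl2, tbl3, tbl4)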
    

    Then, make an example graph:

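    # One line per id; overlapping low-alpha lines look darker where paths are common.
    # (linewidth replaces size for lines as of ggplot2 3.4.0)
    ggplot(tbl, aes(x=position, y=channel, group=id)) +
      geom_line(alpha=.1, linewidth=3)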
    

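    I think it would be cooler to vary the line width by the number of leads taking each step; another option would be a color scale driven by that count. Here, I'm using a single low alpha value as a hack for such a scale: overlapping lines stack up, so the more common steps look darker.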

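As a rough sketch of the count-based idea, one option is to count how many leads make each step-to-step move and draw one segment per distinct move, with width and opacity mapped to that count. The `paths` data below is a made-up three-lead example, the names `edges`, `next_position`, and `next_channel` are only illustrative, and the `linewidth` aesthetic assumes ggplot2 3.4.0 or later:

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data in the question's long format (3 leads, 4 positions each)
paths <- data.frame(
  id       = rep(1:3, each = 4),
  position = rep(c("1-FT", "2-LC", "3-OC", "4-CW"), times = 3),
  channel  = c("direct", "SEM", "email", "video",
               "direct", "SEM", "video", "video",
               "email",  "SEM", "email", "video")
)

# Pair each engagement with the next one for the same lead,
# then count how many leads make each move
edges <- paths %>%
  arrange(id, position) %>%
  group_by(id) %>%
  mutate(next_position = lead(position),
         next_channel  = lead(channel)) %>%
  ungroup() %>%
  filter(!is.na(next_position)) %>%   # the last position has no next step
  count(position, channel, next_position, next_channel, name = "n")

# One segment per distinct move; thickness and opacity follow the count
p <- ggplot(edges) +
  geom_segment(aes(x = position, y = channel,
                   xend = next_position, yend = next_channel,
                   linewidth = n, alpha = n),
               lineend = "round") +
  scale_linewidth(range = c(0.5, 4))

print(p)
```

This relies on the position labels sorting into funnel order, which is why they carry a numeric prefix. With raw codes such as FT, LC, OC, CW, converting `position` to a factor with the levels in that order before `arrange()` should give the same result. Swapping `alpha = n` for `colour = n` would give the color-scale variant instead.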