I have a large data.frame containing these values:
ID_Path Conversion Lead Path Week
32342 A25177 1 JEFD 2015-25
32528 A25177 1 EUFD 2015-25
25485 A3 1 DTFE 2015-25
32528 Null 0 DDFE 2015-25
23452 A25177 1 JDDD 2015-26
54454 A25177 1 FDFF 2015-27
56848 A2323 1 HDG 2015-27
I want to be able to create a frequency table that displays a table like this:
Week Total A25177 A3 A2323
2015-25 3 2 1 0
2015-26 1 1 0 0
2015-27 2 1 0 1
Where every unique Conversion
has a column, and all the times where the Conversion
is Null is the same time as when the Lead
is 0.
In this example there is 3 unique conversions, sometimes there is 1, sometimes there are 5 or more. So it should not be limited to only 3.
I have created a new DF
containing only Conversion
that are not Null
I have tried using data.table
with this code:
DF[,list(Week=Week,by=Conversion]
with no luck.
I have tried using plyr
with this code:
ddply(DF,~Conversion,summarise,week=week)
with no luck.
I would recommend dropping unnecessary levels in order to not mess the output, and then run a simple table
and addmargins
combination
DF <- droplevels(DF[DF$Conversion != "Null",])
addmargins(table(DF[c("Week", "Conversion")]), 2)
# Conversion
# Week A2323 A25177 A3 Sum
# 2015-25 0 2 1 3
# 2015-26 0 1 0 1
# 2015-27 1 1 0 2
Alternatively, you could do the same with reshape2
while specifying the margins
parameter
library(reshape2)
dcast(DF, Week ~ Conversion, value.var = "Conversion", length, margins = "Conversion")
# Week A2323 A25177 A3 (all)
# 1 2015-25 0 2 1 3
# 2 2015-26 0 1 0 1
# 3 2015-27 1 1 0 2