I'm new to R and am trying to put pairs of adjacent elements from a list into a data frame so that I can export them as edges for Gephi. I am trying to create a dataset that acts like a shopping list for each individual user's journey, where each edge is a journey from one shopping point to the next.
Here is sample data that I am testing on:
a <- c("a","a","a","b","b","a","a","b","a","a","c","d","c")
b <- c(12,22,44,22,33,55,33,66,88,55,33,66,77)
a1 <- data.frame(a,b)
b1 <- tapply(a1$b, a1$a, list)
Which looks like this:
$a
[1] 12 22 44 55 33 88 55
$b
[1] 22 33 66
$c
[1] 33 77
$d
[1] 66
Here, "$a", "$b", "$c" and "$d" are individual users, and the vectors within are their transaction journeys. I want the first row to be "12 22", the second to be "22 44", etc., with the last being "33 77".
So far I have written a function called "pairingfunction" and tried to use lapply with it, but it doesn't work.
Here is what I have so far:
pairingfunction <- function(x) {
pairdf <- data.frame()
for (i in 1:(length(x)-1)){
a <- x[i]
b <- x[(i+1)]
pairdf[(nrows(pairdf)+1)] <- a
pairdf[(nrows(pairdf))] <- b
}
return(pairdf)
}
lapply(b1, pairingfunction)
If someone could help fix the function or let me know a better way than using lapply that would be fantastic. Thanks
You could leverage the nest() function from the tidyr package:
library(tidyr)
library(dplyr)
a <- c("a","a","a","b","b","a","a","b","a","a","c","d","c")
b <- c(12,22,44,22,33,55,33,66,88,55,33,66,77)
df <- data.frame(user = a, touchpoint = b)
df %>% nest(data = touchpoint)
# user data
# 1 a 12, 22, 44, 55, 33, 88, 55
# 2 b 22, 33, 66
# 3 c 33, 77
# 4 d 66
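Nesting groups the touchpoints per user but does not yet produce the consecutive pairs asked for. One way to get them, as a minimal base R sketch, is to line up each journey against itself shifted by one step using head() and tail(). The helper name make_pairs and the column names Source and Target are assumptions for illustration (Gephi's edge import conventionally looks for those two headers, but check this against your Gephi version), and split() is used here in place of tapply() only because it returns a plain list.

```r
a <- c("a","a","a","b","b","a","a","b","a","a","c","d","c")
b <- c(12,22,44,22,33,55,33,66,88,55,33,66,77)

# One vector of touchpoints per user, in order of appearance
journeys <- split(b, a)

# Hypothetical helper: pair every touchpoint with the one that follows it
make_pairs <- function(x) {
  if (length(x) < 2) {
    # A single-stop journey has no edges
    return(data.frame(Source = numeric(0), Target = numeric(0)))
  }
  data.frame(Source = head(x, -1), Target = tail(x, -1))
}

# Build the pairs per user and stack them into one edge list
edges <- do.call(rbind, lapply(journeys, make_pairs))
rownames(edges) <- NULL
edges

# To export for Gephi (file name is an assumption):
# write.csv(edges, "edges.csv", row.names = FALSE)
```

With the sample data this gives 9 rows, starting with 12 22 and 22 44 and ending with 33 77. User d contributes nothing because a single touchpoint has no following step. Building each user's pairs in one vectorised call also avoids growing a data frame row by row inside a loop, which is where the original function runs into trouble.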